capeprivacy / tf-trusted

tf-trusted allows you to run TensorFlow models in secure enclaves
https://capeprivacy.com/
Apache License 2.0

ModuleNotFoundError: No module named 'tf_trusted_custom_op' #22

Open tgamal opened 5 years ago

tgamal commented 5 years ago

I followed the steps for installing tf_trusted_custom_op and it builds successfully, but when I try to run model_run.py I get the error `ModuleNotFoundError: No module named 'tf_trusted_custom_op'`.

```
root@4811bb1b5421:/opt/my-project/tf_trusted_custom_op# bazel build model_enclave_op.so
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
WARNING: /root/.cache/bazel/_bazel_root/6a072cedc59c5d9384722d447b964014/external/local_config_tf/BUILD:3588:1: target 'libtensorflow_framework.so' is both a rule and a file; please choose another name for the rule
INFO: SHA256 (https://github.com/nanopb/nanopb/archive/f8ac463766281625ad710900479130c7fcb4d63b.tar.gz) = 8bbbb1e78d4ddb0a1919276924ab10d11b631df48b657d960e0c795a25515735
DEBUG: /root/.cache/bazel/_bazel_root/6a072cedc59c5d9384722d447b964014/external/bazel_tools/tools/build_defs/repo/http.bzl:43:9: ctx.attr.build_file @com_github_grpc_grpc//third_party:nanopb.BUILD, path /root/.cache/bazel/_bazel_root/6a072cedc59c5d9384722d447b964014/external/com_github_grpc_grpc/third_party/nanopb.BUILD
INFO: SHA256 (https://github.com/c-ares/c-ares/archive/3be1924221e1326df520f8498d704a5c4c8d0cce.tar.gz) = e69e33fd40a254fcf00d76efa76776d45f960e34307bd9cea9df93ef79a933f1
DEBUG: /root/.cache/bazel/_bazel_root/6a072cedc59c5d9384722d447b964014/external/bazel_tools/tools/build_defs/repo/http.bzl:43:9: ctx.attr.build_file @com_github_grpc_grpc//third_party:cares/cares.BUILD, path /root/.cache/bazel/_bazel_root/6a072cedc59c5d9384722d447b964014/external/com_github_grpc_grpc/third_party/cares/cares.BUILD
INFO: SHA256 (https://github.com/madler/zlib/archive/cacf7f1d4e3d44d871b605da3b647f07d718623f.tar.gz) = 6d4d6640ca3121620995ee255945161821218752b551a1a180f4215f7d124d45
DEBUG: /root/.cache/bazel/_bazel_root/6a072cedc59c5d9384722d447b964014/external/bazel_tools/tools/build_defs/repo/http.bzl:43:9: ctx.attr.build_file @com_github_grpc_grpc//third_party:zlib.BUILD, path /root/.cache/bazel/_bazel_root/6a072cedc59c5d9384722d447b964014/external/com_github_grpc_grpc/third_party/zlib.BUILD
INFO: SHA256 (https://boringssl.googlesource.com/boringssl/+archive/afc30d43eef92979b05776ec0963c9cede5fb80f.tar.gz) = d01d090d4a849f6b124651a2e48ea5766f3a155403ccad14f9fd92ffdd87d2d8
WARNING: /root/.cache/bazel/_bazel_root/6a072cedc59c5d9384722d447b964014/external/local_config_tf/BUILD:5:12: in hdrs attribute of cc_library rule @local_config_tf//:tf_header_lib: file '_api_implementation.so' from target '@local_config_tf//:tf_header_include' is not allowed in hdrs
WARNING: /root/.cache/bazel/_bazel_root/6a072cedc59c5d9384722d447b964014/external/local_config_tf/BUILD:5:12: in hdrs attribute of cc_library rule @local_config_tf//:tf_header_lib: file '_message.so' from target '@local_config_tf//:tf_header_include' is not allowed in hdrs
INFO: Analysed target //:model_enclave_op.so (21 packages loaded).
INFO: Found 1 target...
INFO: From SkylarkAction external/com_github_grpc_grpc/src/proto/grpc/reflection/v1alpha/reflection.pb.h:
bazel-out/k8-fastbuild/genfiles/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
INFO: From SkylarkAction external/com_github_grpc_grpc/src/proto/grpc/reflection/v1alpha/reflection.grpc.pb.h:
bazel-out/k8-fastbuild/genfiles/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
Target //:model_enclave_op.so up-to-date:
  bazel-bin/model_enclave_op.so
INFO: Elapsed time: 65.391s, Critical Path: 8.76s
INFO: 1090 processes: 1090 local.
```

justin1121 commented 5 years ago

Hey there,

Try running model_run.py from the root directory, like:

```
python tf_trusted_custom_op/model_run.py <args>
```

tgamal commented 5 years ago

Thanks Justin for the prompt reply. Unfortunately, I still get the same error:

```
python3 tf_trusted_custom_op/model_run.py --model_file ~/inception5h/tensorflow_inception_graph.pb --input_file ~/data_npy/3892.npy --input_name input --output_name output
Traceback (most recent call last):
  File "tf_trusted_custom_op/model_run.py", line 7, in <module>
    import tf_trusted_custom_op as tft
ModuleNotFoundError: No module named 'tf_trusted_custom_op'
```

Does it matter whether I run it inside the tf_trusted_custom_op container or on the host machine? I am running it on the host machine.

justin1121 commented 5 years ago

If the host machine is a Linux box, then there's a good chance it should still work. I'm not sure that's the problem, though, but I'd give it a try to start.

tgamal commented 5 years ago

Since the build completes successfully but my installed Python version cannot see the new custom op, is there somewhere I should copy the model_enclave_op.so file so that my Python installation can find it? I am not sure how Python imports the custom op.
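As a general diagnostic note: Python only sees a module if it can be resolved on `sys.path`; building the .so with Bazel does not by itself register anything with Python. A quick, generic way to check whether a module is resolvable (a sketch; tf_trusted_custom_op is used here only as the name being probed):

```python
import importlib.util

# find_spec reports whether a module can be resolved on the current
# sys.path without actually importing it.
for name in ("tf_trusted_custom_op", "json"):
    spec = importlib.util.find_spec(name)
    if spec is None:
        print(name, "-> NOT importable")
    else:
        print(name, "-> found at", spec.origin)
```

If the first line prints "NOT importable", the package simply isn't on the import path, regardless of whether the .so built correctly.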

justin1121 commented 5 years ago

Actually, I think it's just a problem finding the module. From the root directory, try running `pip install -e .` and then run model_run.py again.
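For what it's worth, the reason `pip install -e .` fixes this kind of error is that an editable install puts the project root on Python's import path, so the package directory inside it becomes importable. The mechanism can be sketched with a throwaway package (the names below are illustrative, not part of tf-trusted):

```python
import os
import sys
import tempfile

# Create a throwaway package in a temp directory to stand in for the
# project checkout; "fake_custom_op" is a made-up name.
root = tempfile.mkdtemp()
pkg = os.path.join(root, "fake_custom_op")
os.makedirs(pkg)
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("answer = 42\n")

try:
    import fake_custom_op  # fails: the temp root is not on sys.path yet
except ModuleNotFoundError:
    print("ModuleNotFoundError before adding the project root")

# An editable install effectively does this: make the project root
# visible on sys.path so the packages inside it resolve.
sys.path.insert(0, root)
import fake_custom_op

print(fake_custom_op.answer)  # -> 42
```

The same logic explains why running the script from the repository root can also work: Python prepends the script's (or the current) directory to `sys.path`.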

tgamal commented 5 years ago

Thanks @justin1121, the custom op is now installed, but I have another issue. When I try to run model_run.py, it gives me the following error:

```
bazel-out/k8-fastbuild/genfiles/external/local_config_tf/include/tensorflow/core/lib/core/refcount.h:90] Check failed: ref_.load() == 0 (1 vs. 0)
```

My environment is ubuntu 18.04, python 3.6.8, tensorflow 1.13.1

I searched for a solution, and the workaround proposed here (https://github.com/tensorflow/tensorflow/issues/17316) is to add the -DNDEBUG flag to the compiler flags of the custom op. I tried appending this flag to TF_CFLAGS in tf_trusted_custom_op/configure.sh, but it did not work for me. Please advise if you have faced this error before.

justin1121 commented 5 years ago

I've run into this issue in the past, but it's happened to me sporadically and I never found a good solution. Some form of the fix you found seems like a good way to solve it. I don't think Bazel looks at TF_CFLAGS, so I'd try adding the flag to the BUILD file instead (https://github.com/dropoutlabs/tf-trusted/blob/master/tf_trusted_custom_op/BUILD#L31), like:

    copts = ["-pthread", "-std=c++11", "-D_GLIBCXX_USE_CXX11_ABI=0", "-DNDEBUG"]

Let me know if that works. Also, if you feel up to it, it'd be great if you could submit documentation changes along with whatever other changes you end up making. Thanks!
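For context, the relevant rule in tf_trusted_custom_op/BUILD would then look roughly like this. This is a sketch, not the exact file: the rule kind, `srcs`, and other attributes are assumptions based on typical TensorFlow custom-op builds, and only the `copts` line reflects the change discussed above.

```python
cc_binary(
    name = "model_enclave_op.so",
    srcs = ["model_enclave_op.cc"],  # source file name is an assumption
    linkshared = 1,
    copts = [
        "-pthread",
        "-std=c++11",
        "-D_GLIBCXX_USE_CXX11_ABI=0",
        "-DNDEBUG",  # compiles out the debug-only refcount check that aborts
    ],
)
```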

tgamal commented 5 years ago

Hi Justin, thanks a lot, it works now after this modification. What would be the best way to update the documentation with this workaround? Should I submit a change to the README file of the custom op repo?

justin1121 commented 5 years ago

Hey, you can update the BUILD file directly with the -DNDEBUG flag as above, and then add a note to the README.md about making sure tf_trusted_custom_op is installed before running the model_run.py script.