Closed humzaiqbal closed 3 years ago
Update: After debugging some more, I got to a point in TFEngine.class where the code was instantiating a TFEngine by calling the newInstance() method, and when it ran EagerSession.getDefault() I got this error:
java.lang.ClassNotFoundException: org.tensorflow.internal.c_api
Does this mean that the org.tensorflow library is required in order to use TensorFlow in DJL? My understanding was that as long as I include the pom dependencies, I should be able to run it just fine (this was the case when I ran DJL with PyTorch, although I did have PyTorch installed on my machine).
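One way to narrow down a ClassNotFoundException like this is to probe whether the relevant classes are visible on the runtime classpath at all. The following is a minimal sketch using only the JDK; the class name ClasspathProbe is hypothetical, and the two probed class names (org.tensorflow.EagerSession, taken from the call mentioned above, and ai.djl.engine.Engine) are assumptions about which classes are useful to check.

```java
public class ClasspathProbe {

    /** Returns true if the named class can be located by this class's class loader. */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: locate the class without running its static
            // initializers, so no native library loading is triggered here.
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.tensorflow.EagerSession", // assumed probe class for the TF Java binding
            "ai.djl.engine.Engine"         // assumed probe class for DJL core
        };
        for (String name : names) {
            System.out.println(name + " on classpath: " + isOnClasspath(name));
        }
    }
}
```

If the TensorFlow class is reported as missing when run with the same classpath as the failing program, the problem is more likely dependency resolution than native library loading.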
Hi @humzaiqbal, thanks for raising the issue! There are a few things to try:
1. Add "ai.djl.tensorflow:tensorflow-api:0.6.0" as well, although it should be included automatically because tensorflow-engine depends on it. The dependency chain is ai.djl.tensorflow:tensorflow-engine -> ai.djl.tensorflow:tensorflow-api -> org.tensorflow:tensorflow-core-api.
2. If you are using an IDE, sometimes the service provider class is not recognized; try to rebuild or recompile the provider class. See https://docs.djl.ai/master/docs/development/troubleshooting.html
3. Check whether the native TensorFlow libraries were actually downloaded. They should be under your home directory, e.g. ~/.tensorflow/cache/2.2.0-SNAPSHOT-cpu-osx-x86_64/
4. When you debug in IDE mode, you may need to specify -Djava.library.path=/path/to/libtensorflow.so/file in the VM options; the path is the same as in step 3. Note that you don't need this when using Gradle to run.
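The checks about the native library cache and java.library.path can be sketched with the JDK alone. This is an illustrative helper, not part of DJL: the class name NativeCacheCheck and its methods are hypothetical, and the default directory simply mirrors the example path given above.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;

public class NativeCacheCheck {

    /** Lists file names in dir that look like shared libraries (.so, .dylib, .dll). */
    public static List<String> findNativeLibs(Path dir) throws IOException {
        List<String> libs = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return libs; // nothing downloaded, or the path is wrong
        }
        try (Stream<Path> entries = Files.list(dir)) {
            entries.map(p -> p.getFileName().toString())
                   .filter(n -> n.endsWith(".so") || n.contains(".so.")
                             || n.endsWith(".dylib") || n.endsWith(".dll"))
                   .forEach(libs::add);
        }
        Collections.sort(libs);
        return libs;
    }

    /** Returns true if dir appears as an entry of the given library path string. */
    public static boolean isOnLibraryPath(Path dir, String libraryPath) {
        Path target = dir.toAbsolutePath().normalize();
        for (String entry : libraryPath.split(File.pathSeparator)) {
            if (!entry.isEmpty()
                    && Paths.get(entry).toAbsolutePath().normalize().equals(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // Default mirrors the example cache path; pass your own directory as args[0].
        Path dir = args.length > 0
                ? Paths.get(args[0])
                : Paths.get(System.getProperty("user.home"),
                            ".tensorflow", "cache", "2.2.0-SNAPSHOT-cpu-osx-x86_64");
        System.out.println("Native libraries in " + dir + ": " + findNativeLibs(dir));
        String libraryPath = System.getProperty("java.library.path", "");
        System.out.println("On java.library.path: " + isOnLibraryPath(dir, libraryPath));
    }
}
```

Running it from the IDE with the same VM options as the failing program shows whether the -Djava.library.path setting actually reached the JVM.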
Here are some examples on how to load TF models:
https://github.com/aws-samples/djl-demo/tree/master/pneumonia-detection
https://github.com/awslabs/djl/tree/master/tensorflow/tensorflow-model-zoo/src/test/java/ai/djl/tensorflow/integration/modality/cv
The SSD Test is using a model from TF Hub. All models are in saved model bundle format.
Update:
I was able to run your file with the same setting you described.
Here are my pom.xml file and SimpleDummyTest.java.
I was able to load the model, but it seems to have some internal error when loading.
Note: from inspecting this model, you need to specify the tags to be [] during loading.
Criteria<NDList, NDList> criteria =
        Criteria.builder()
                .setTypes(NDList.class, NDList.class)
                .optArtifactId("ai.djl.localmodelzoo:encoder")
                .optOption("Tags", new String[]{})
                .optTranslator(translator)
                .build();
Output:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
2020-07-22 14:47:03.484403: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-07-22 14:47:03.493307: I external/org_tensorflow/tensorflow/core/common_runtime/process_util.cc:147] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2020-07-22 14:47:03.513126: I external/org_tensorflow/tensorflow/core/common_runtime/process_util.cc:147] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2020-07-22 14:47:03.579938: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:31] Reading SavedModel from: /Users/lawei/.djl.ai/cache/repo/model/undefined/ai/djl/localmodelzoo/10f287ae118acb18b9ad367e6ce28b61
2020-07-22 14:47:03.713259: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:54] Reading meta graph with tags { }
2020-07-22 14:47:03.713301: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:295] Reading SavedModel debug info (if present) from: /Users/lawei/.djl.ai/cache/repo/model/undefined/ai/djl/localmodelzoo/10f287ae118acb18b9ad367e6ce28b61
2020-07-22 14:47:03.722141: I external/org_tensorflow/tensorflow/core/common_runtime/process_util.cc:147] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
2020-07-22 14:47:03.979077: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:234] Restoring SavedModel bundle.
2020-07-22 14:47:04.185743: E external/org_tensorflow/tensorflow/core/grappler/optimizers/meta_optimizer.cc:563] model_pruner failed: Invalid argument: Invalid input graph.
2020-07-22 14:47:09.509059: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:364] SavedModel load for tags { }; Status: fail: Invalid argument: Tensor :0, specified in either feed_devices or fetch_devices was not found in the Graph. Took 5929119 microseconds.
Exception in thread "main" org.tensorflow.exceptions.TFInvalidArgumentException: Tensor :0, specified in either feed_devices or fetch_devices was not found in the Graph
at org.tensorflow.internal.c_api.AbstractTF_Status.throwExceptionIfNotOK(AbstractTF_Status.java:87)
Thanks much for your help, @roywei. I followed some of your advice: I added the extra dependency to my pom file, and I added the directory ~/.tensorflow/cache/2.2.0-SNAPSHOT-cpu-osx-x86_64/ to my VM options. When I looked at the directory, I saw that the following files were present:
LICENSE libgcc_s.1.dylib libiomp5.dylib libjnimkldnn.dylib libjnitensorflow.dylib libmklml.dylib libtensorflow.2.dylib
THIRD_PARTY_TF_JNI_LICENSES libgomp.1.dylib libjnijavacpp.dylib libjnimklml.dylib libmkldnn.0.dylib libstdc++.6.dylib libtensorflow_framework.2.dylib
I notice there is no libtensorflow.so file, and now when I debug I get an UnsatisfiedLinkError when trying to load the TF Engine, so I guess I need to look at that.
@humzaiqbal The files in your cache folder seem correct. libtensorflow.2.dylib is the Mac equivalent of libtensorflow.so. The file might be corrupted. Would you please try cleaning the cache folder and running the following commands:
cd djl
./gradlew debugEnv -Dai.djl.default_engine=TensorFlow
It should print debug information about TensorFlow engine loading.
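On the point that a .dylib file on Mac plays the role of a .so file on Linux: the JDK exposes this platform naming through System.mapLibraryName. The sketch below only illustrates that convention; the class name LibraryNameDemo is hypothetical, and the versioned file names in the cache listing (such as libtensorflow.2.dylib) differ from the unversioned form printed here.

```java
public class LibraryNameDemo {

    /** Returns the platform-specific file name the JVM uses for a library base name. */
    public static String platformFileName(String baseName) {
        return System.mapLibraryName(baseName);
    }

    public static void main(String[] args) {
        System.out.println("os.name = " + System.getProperty("os.name"));
        // The suffix depends on the operating system the JVM is running on.
        System.out.println("tensorflow -> " + platformFileName("tensorflow"));
    }
}
```

The result is typically libtensorflow.dylib on macOS, libtensorflow.so on Linux, and tensorflow.dll on Windows, so the absence of a .so file in a macOS cache folder is expected.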
Hi @humzaiqbal, did you try out the files I uploaded in my previous comment? They work out of the box even without setting the dependency and java.library.path. Try removing ~/.tensorflow and ~/.djl.ai to avoid corrupted files.
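For anyone who prefers to clear those two directories from Java rather than from a shell, here is a minimal sketch using only the JDK. The class name CacheCleaner and the --delete flag are hypothetical and exist only in this example; without the flag it only reports what it would remove.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CacheCleaner {

    /** Deletes root and everything under it; does nothing if root does not exist. */
    public static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order puts children before their parent directories.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        String home = System.getProperty("user.home");
        // The two cache directories named in the comment above.
        Path[] caches = { Paths.get(home, ".tensorflow"), Paths.get(home, ".djl.ai") };
        // Hypothetical safety switch: nothing is deleted unless --delete is passed.
        boolean confirm = args.length > 0 && args[0].equals("--delete");
        for (Path cache : caches) {
            if (confirm) {
                deleteRecursively(cache);
                System.out.println("Deleted " + cache);
            } else {
                System.out.println("Would delete " + cache
                        + " (exists: " + Files.exists(cache) + ")");
            }
        }
    }
}
```

The native libraries and models are expected to be downloaded again on the next run, so clearing the caches costs only the download time.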
Also, the model you provided seems to be a TF1 model, so you need to specify .optOption("Tags", new String[]{}) during loading.
You can try this TF2 sentence encoder: https://tfhub.dev/google/universal-sentence-encoder/4. I can load it without any problem (no need for optOption).
The next step is prediction; this model requires a String Tensor as input. We will look into how to support that in the DJL NDArray API. (Currently the PyTorch and MXNet engines do not support String Tensors.)
Thanks much for your suggestions.
@frankfliu unfortunately I'm not using gradle so that won't work for me.
@roywei I tried removing my directory and restarting, but that didn't work either. Also, I can't use the TF2 option because I need USE LITE specifically for my application. If a String Tensor is required, then perhaps I might be better off going to the Google Java binding for TensorFlow. I am able to load the model and feed in input tensors there currently, but if that becomes possible with DJL please let me know, because I am a fan of the library (it made it quite easy for me to load my PyTorch model when I was trying that some time ago :) )
@humzaiqbal You don't need to install anything to use Gradle. The gradle command works out of the box in our git repo. By the way, we are using the TF Java binding. If TF-java works, DJL should just work.
Ah, I wasn't using the git repo for DJL. I just installed the Maven dependencies and didn't realize you had the gradle bash script, so that's why I figured I had to have something else installed.
@humzaiqbal Do you still have the issue on your end?
@humzaiqbal We have added a Universal Encoder example here. Could you take a look and see if that works for you?
@humzaiqbal I am going to close this stale issue. If you have any other questions, feel free to reopen it.
Description
(A clear and concise description of what the bug is.) When trying to load a TensorFlow Hub model into DJL, I get an engine not found exception. I'm running in IntelliJ and made sure to add the following to my pom.xml so that the TensorFlow dependencies are there. Interestingly, I don't have this problem when I use PyTorch: I am able to load and run PyTorch models just fine when the PyTorch equivalent dependencies are added.
Expected Behavior
The code should run and load the model without failure
Error Message
How to Reproduce?
(If you developed your own code, please provide a short script that reproduces the error. For existing examples, please provide link.)
I'm attaching a zip file containing a Java file, SimpleDummyTest.java; running that should reproduce it: SimpleDummyTest.java.zip
Steps to reproduce
(Paste the commands you ran that produced the error.)
What have you tried to solve it?
Engine.class it would try to load the engine by doing Engine engine = provider.getEngine(); and then that failed, causing it to log
Environment Info
Please run the command ./gradlew debugEnv from the root directory of DJL (if necessary, clone DJL first). It will output information about your system, environment, and installation that can help us debug your issue. Paste the output of the command below:
I didn't run with gradlew; instead I ran in IntelliJ, so I will manually paste all relevant info.