@dnmca How do you package your application? Did you repackage our .jar file?
We expect native/lib/tokenizer.properties to be on the classpath. Can you check whether this file exists after you repackage the .jar?
I am packaging the application as a JAR file.
I've checked its contents, and it seems that all necessary files are present (tokenizer.properties as well).
What OS are you running on? Is it x86_64 or aarch64?
The exception is thrown here: https://github.com/deepjavalibrary/djl/blob/master/api/src/main/java/ai/djl/util/Platform.java#L85
If you have a native/lib/tokenizer.properties file, it should return here: https://github.com/deepjavalibrary/djl/blob/master/api/src/main/java/ai/djl/util/Platform.java#L77
Can you copy the Platform code, add a few more printouts, and see why it falls through to the exception line?
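Something along these lines, for example (a minimal sketch; the resource name and the ClassLoaderUtils call mirror the Platform code linked above, and the wrapper class is only for illustration):

import java.io.IOException;
import java.net.URL;
import java.util.Enumeration;

import ai.djl.util.ClassLoaderUtils;

public class PlatformDebug {

    public static void main(String[] args) throws IOException {
        // The same lookup Platform.detectPlatform() performs for the tokenizers engine
        String nativeProp = "native/lib/tokenizer.properties";
        Enumeration<URL> urls = ClassLoaderUtils.getContextClassLoader().getResources(nativeProp);

        // An empty listing means the properties file is not visible
        // to the context class loader
        while (urls.hasMoreElements()) {
            System.out.println("Found: " + urls.nextElement());
        }
    }
}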
I've printed out the values of the variables urls and systemPlatform in the method Platform.detectPlatform(String engine):
The urls variable turns out to be empty, and systemPlatform is "cpu-linux-x86_64:null".
Initialization fails because the following condition is satisfied:
if (systemPlatform.version == null) {
    throw new AssertionError("No " + engine + " version found in property file.");
}
It looks like the following resource is not being found:
String engineProp = engine + "-engine.properties";
And hence, systemPlatform.version is set to null.
@dnmca
systemPlatform being "cpu-linux-x86_64:null" should be OK; the expected behavior is that detectPlatform() returns at line 77: https://github.com/deepjavalibrary/djl/blob/master/api/src/main/java/ai/djl/util/Platform.java#L77
The problem is that urls is empty, which means native/lib/tokenizer.properties is not found on the classpath.
Is tokenizer.properties in the right location?
You might need to set a proper context class loader if you are using a customized ClassLoader in your application.
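For example, a quick check like this (a minimal sketch, assuming it runs inside your embedder class; MyEmbedder is a placeholder name) would show whether the file is visible to the context class loader or only to the class loader that loaded your code:

String res = "native/lib/tokenizer.properties";

// The loader DJL consults by default (the thread context class loader)
ClassLoader context = Thread.currentThread().getContextClassLoader();
// The loader that actually loaded your (plugin/bundle) class
ClassLoader own = MyEmbedder.class.getClassLoader();

System.out.println("context class loader sees: " + context.getResource(res));
System.out.println("own class loader sees: " + own.getResource(res));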
Hello @frankfliu and sorry for the late reply.
I've investigated a bit further, and it turns out that the Vespa application is built as an OSGi bundle, and that seems to be the reason why the resource file tokenizer.properties could not be located with
urls = ClassLoaderUtils.getContextClassLoader().getResources(nativeProp);
Do you think that resetting the context class loader would fix this? And if not, is it possible to make the tokenizers package OSGi-compatible?
Thank you for the hint! I managed to make it work with the following piece of code:
// Remember the original context class loader
ClassLoader tccl = Thread.currentThread().getContextClassLoader();
try {
    // Temporarily use the bundle's own class loader so DJL can find its resources
    Thread.currentThread().setContextClassLoader(getClass().getClassLoader());
    tokenizer = HuggingFaceTokenizer.newInstance(Paths.get(config.tokenizerPath().toString()));
} finally {
    // Always restore the original context class loader
    Thread.currentThread().setContextClassLoader(tccl);
}
@dnmca
You are facing a common plugin issue. It looks like your plugin class loader is different from the execution class loader (the ContextClassLoader), which assumes all the resources are loaded at plugin initialization time. You can either move the HuggingFaceTokenizer creation to plugin initialization or use the correct ContextClassLoader, as in your code.
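A minimal sketch of the first option (the class name and init hook are placeholders, not Vespa or DJL plugin API; it simply assumes the host calls the hook while the plugin's class loader is the context class loader):

import java.io.IOException;
import java.nio.file.Path;

import ai.djl.huggingface.tokenizers.HuggingFaceTokenizer;

public class MyEmbedderPlugin {

    private HuggingFaceTokenizer tokenizer;

    // Hypothetical initialization hook: creating the tokenizer here means DJL
    // resolves native/lib/tokenizer.properties while the right class loader is in place
    public void init(Path tokenizerPath) throws IOException {
        tokenizer = HuggingFaceTokenizer.newInstance(tokenizerPath);
    }

    public long[] encode(String text) {
        return tokenizer.encode(text).getIds();
    }
}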
Hi @dnmca Thank you for posting a solution to your problem. I ran into the same issue and your solution worked for me as well. I think we are doing the exact same thing. I have a new issue: if I unload the plugin and load it again, I run into this:
2023-03-15 21:44:30,076 [Acme Plugin Hot Deploy] ERROR com.atlassian.plugin.manager.DefaultPluginManager - There was an error loading the descriptor 'Similarity' of plugin 'com.acme'. Disabling.
com.atlassian.plugin.PluginParseException: java.lang.UnsatisfiedLinkError: Native Library /usr/local/acme/tomcat/temp/.djl.ai/tokenizers/0.13.2-0.21.0-linux-x86_64/libtokenizers.so already loaded in another classloader
at com.atlassian.plugin.module.LegacyModuleFactory.getModuleClass(LegacyModuleFactory.java:43)
I'm curious to know if you ran into it as well and whether you solved it. Otherwise, I'll take a look and post here whatever solution I find. Thanks!
You are running into the same issue as https://github.com/deepjavalibrary/djl/issues/179.
Currently this only works for the PyTorch native library. I can make it available for Huggingface as well.
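For reference, the shape of that workaround is a small helper class that lives on the server/parent class loader and loads the native library there, with DJL pointed at it via a system property (the class, package, and property names below are illustrative; see #179 for the exact wiring):

// Helper class on the parent class loader, referenced via a system property,
// e.g. -Dai.djl.pytorch.native_helper=com.example.NativeHelper (illustrative)
public final class NativeHelper {

    private NativeHelper() {}

    // DJL calls this instead of loading the .so inside the plugin class loader,
    // which avoids "already loaded in another classloader" on plugin reload
    public static void load(String path) {
        System.load(path);
    }
}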
@frankfliu I am trying to get the native helper working for Huggingface as well. I'm finding that the helper works, but an error still gets thrown at this line: https://github.com/deepjavalibrary/djl/commit/9106f958c069e4e67fd6842ef0de8f8ace4c7bca#diff-83b8cdd89c2c087ef69b441f0f73b423e93601cca9d8feed7bb711064c239951R306
Just looking at the code, it seems like this line defeats the purpose of the native helper.
Description
I'm trying to test a Vespa application with a custom Embedder that uses DJL's HuggingFaceTokenizer under the hood.
It is initialized internally in a straightforward manner:
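Roughly like this (the surrounding class is simplified here; the tokenizer path comes from the embedder's config and points at the exported tokenizer file):

import java.io.IOException;
import java.nio.file.Path;

import ai.djl.huggingface.tokenizers.HuggingFaceTokenizer;

public class MyEmbedder {

    private final HuggingFaceTokenizer tokenizer;

    public MyEmbedder(Path tokenizerPath) throws IOException {
        tokenizer = HuggingFaceTokenizer.newInstance(tokenizerPath);
    }
}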
Local testing of this code was successful, but when this code is run inside the Vespa Docker image, I'm getting the following error:
LibUtils is trying to load the tokenizers library from the classpath, but it seems to be missing.
Before running my application in the Docker image, I'm building a Maven project that contains the following dependency:
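(Something like the following; the version shown is only an example, not necessarily the one I used:)

<dependency>
    <groupId>ai.djl.huggingface</groupId>
    <artifactId>tokenizers</artifactId>
    <!-- example version only -->
    <version>0.21.0</version>
</dependency>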
As far as I can see from the build.gradle of the tokenizers package, it relies on some external library files. Do I understand correctly that these libraries are not part of the dependency above and should be installed manually? If that's not the case, what am I doing wrong while applying the tokenizers package to my use case?
Expected Behavior
I expect HuggingFaceTokenizer to initialize without errors.
Error Message
Please see the first section above.
How to Reproduce?
Unfortunately, I cannot share the code base.
Steps to reproduce
What have you tried to solve it?
Environment Info