deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0

Not able to initialize HuggingFaceTokenizer in Vespa environment #2224

Closed dnmca closed 1 year ago

dnmca commented 1 year ago

Description

I'm trying to test a Vespa application with a custom Embedder that uses DJL's HuggingFaceTokenizer under the hood.

It is initialized internally in a straightforward manner:

tokenizer = HuggingFaceTokenizer.newInstance(Paths.get(config.tokenizerPath().toString()));

Local testing of this code was successful, but when it runs inside the Vespa Docker image, I get the following error:

com.yahoo.container.di.componentgraph.core.ComponentNode$ComponentConstructorException: Error constructing 'xlmRoberta' of type 'com.product.search.embedding.XlmRobertaEmbedder': null
Caused by: java.lang.AssertionError: No tokenizers version found in property file.
    at ai.djl.util.Platform.detectPlatform(Platform.java:85)
    at ai.djl.huggingface.tokenizers.jni.LibUtils.copyJniLibraryFromClasspath(LibUtils.java:76)
    at ai.djl.huggingface.tokenizers.jni.LibUtils.loadLibrary(LibUtils.java:66)
    at ai.djl.huggingface.tokenizers.jni.LibUtils.<clinit>(LibUtils.java:41)
    at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.newInstance(HuggingFaceTokenizer.java:146)
    at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.newInstance(HuggingFaceTokenizer.java:132)
    at ai.djl.huggingface.tokenizers.HuggingFaceTokenizer.newInstance(HuggingFaceTokenizer.java:115)

LibUtils is trying to load the tokenizers library from the classpath, but it seems to be missing.

Before running my application in the Docker image, I build a Maven project that contains the following dependency:

<dependency>
    <groupId>ai.djl.huggingface</groupId>
    <artifactId>tokenizers</artifactId>
    <version>0.20.0</version>
</dependency>

As far as I can see from the build.gradle of the tokenizers package, it relies on some external library files. Do I understand correctly that these libraries are not part of this dependency and must be installed manually? If not, what am I doing wrong when using the tokenizers package for my use case?

Expected Behavior

I expect HuggingFaceTokenizer to initialize without errors.

Error Message

Please look in the first section.

How to Reproduce?

Unfortunately, I cannot share the code base.

frankfliu commented 1 year ago

@dnmca How do you package your application? Did you repackage our .jar file?

We expect native/lib/tokenizer.properties to be on the classpath. Can you check whether this file still exists after you repackage the .jar?

dnmca commented 1 year ago

I am packaging the application as a JAR file. I've checked its contents, and all the necessary files seem to be present (including tokenizer.properties).
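
A check like the one described here can also be done programmatically. The following is a minimal sketch using only the JDK's java.util.jar API; JarEntryCheck and its methods are hypothetical names introduced for illustration, and the demo JAR is a stand-in for the real application JAR. Note that it only confirms the entry is inside the archive, not that the class loader in use at runtime can see it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

// Hypothetical helper, not part of DJL or Vespa.
final class JarEntryCheck {
    private JarEntryCheck() {}

    /** Returns true if the JAR contains an entry with exactly this name. */
    static boolean containsEntry(Path jarPath, String entryName) {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return jar.getJarEntry(entryName) != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a small throwaway JAR with the given entry names (demo only). */
    static Path writeDemoJar(String... entryNames) {
        try {
            Path jar = Files.createTempFile("demo", ".jar");
            try (JarOutputStream out = new JarOutputStream(Files.newOutputStream(jar))) {
                for (String name : entryNames) {
                    out.putNextEntry(new JarEntry(name));
                    out.write("version=demo\n".getBytes(StandardCharsets.UTF_8));
                    out.closeEntry();
                }
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // In practice, pass the path of the repackaged application JAR instead.
        Path jar = writeDemoJar("native/lib/tokenizer.properties");
        System.out.println(containsEntry(jar, "native/lib/tokenizer.properties"));
    }
}
```

The entry name must match the full path inside the archive, so a file that was relocated during repackaging would be reported as missing.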

frankfliu commented 1 year ago

Which OS are you running on? Is it x86_64 or aarch64?

The exception is thrown here: https://github.com/deepjavalibrary/djl/blob/master/api/src/main/java/ai/djl/util/Platform.java#L85

If you have a native/lib/tokenizer.properties file, it should return here: https://github.com/deepjavalibrary/djl/blob/master/api/src/main/java/ai/djl/util/Platform.java#L77

Can you copy the Platform code, add a few more printouts, and see why it falls through to the exception line?
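
One way to get such printouts without copying library code is to probe which class loaders can see the properties file. The sketch below is an assumption about how a diagnostic like that could look, not DJL code: ResourceProbe, find, and report are hypothetical names, and only standard java.lang.ClassLoader calls are used.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper, not part of DJL.
final class ResourceProbe {
    private ResourceProbe() {}

    /** Lists every URL under which the loader sees the resource; empty if none. */
    static List<URL> find(ClassLoader loader, String resourceName) {
        if (loader == null) {
            // A null loader stands for the bootstrap loader; it is not probed here.
            return Collections.emptyList();
        }
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Compares the thread context class loader with the loader of a given class. */
    static String report(String resourceName, Class<?> anchor) {
        ClassLoader tccl = Thread.currentThread().getContextClassLoader();
        ClassLoader own = anchor.getClassLoader();
        return resourceName
                + " | context class loader sees " + find(tccl, resourceName).size()
                + " | " + anchor.getSimpleName() + "'s loader sees "
                + find(own, resourceName).size();
    }

    public static void main(String[] args) {
        // In the real application, pass a class from the plugin or bundle as the anchor.
        System.out.println(report("native/lib/tokenizer.properties", ResourceProbe.class));
    }
}
```

If the two counts differ, the file is packaged correctly but is invisible to the loader that the lookup actually uses.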

dnmca commented 1 year ago

I've printed the values of the variables urls and systemPlatform in the method Platform.detectPlatform(String engine):

The urls variable turns out to be empty, and systemPlatform is "cpu-linux-x86_64:null".

Initialization fails because the following condition is satisfied:

if (systemPlatform.version == null) {
    throw new AssertionError("No " + engine + " version found in property file.");
}

It looks like the following resource is not being found:

String engineProp = engine + "-engine.properties";

Hence, systemPlatform.version is set to null.

frankfliu commented 1 year ago

@dnmca systemPlatform being "cpu-linux-x86_64:null" should be OK; the expected behavior is that detectPlatform() returns at line 77: https://github.com/deepjavalibrary/djl/blob/master/api/src/main/java/ai/djl/util/Platform.java#L77

The problem is that urls is empty, which means native/lib/tokenizer.properties was not found on the classpath. Is tokenizer.properties in the right location?

You might need to set a proper context class loader if you are using a custom ClassLoader in your application.

dnmca commented 1 year ago

Hello @frankfliu, and sorry for the late reply.

I've investigated a bit further, and it turns out that a Vespa application is built as an OSGi bundle, which seems to be why the resource file tokenizer.properties could not be located with

urls = ClassLoaderUtils.getContextClassLoader().getResources(nativeProp);

Do you think that resetting the context class loader would fix this? If not, is it possible to make the tokenizers package OSGi-compatible?

dnmca commented 1 year ago

Thank you for the hint! I managed to make it work with the following piece of code:

ClassLoader tccl = Thread.currentThread().getContextClassLoader();
try {
    Thread.currentThread().setContextClassLoader(getClass().getClassLoader());
    tokenizer = HuggingFaceTokenizer.newInstance(Paths.get(config.tokenizerPath().toString()));
} finally {
    Thread.currentThread().setContextClassLoader(tccl);
}

frankfliu commented 1 year ago

@dnmca

You are facing a common plugin issue. It looks like your plugin's class loader is different from the execution class loader (the ContextClassLoader), which assumes that all resources are loaded at plugin initialization time. You can either move the HuggingFaceTokenizer initialization to plugin initialization or set the correct ContextClassLoader, as your code does.
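
The swap-and-restore pattern can be factored into a small reusable helper. This is a minimal sketch built only on Thread.getContextClassLoader and Thread.setContextClassLoader; ContextClassLoaders and callWith are hypothetical names, not part of DJL, Vespa, or OSGi.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of any library mentioned in this thread.
final class ContextClassLoaders {
    private ContextClassLoaders() {}

    /**
     * Runs the action with the given loader as the thread context class loader,
     * then restores the previous loader even if the action throws.
     */
    static <T> T callWith(ClassLoader loader, Supplier<T> action) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return action.get();
        } finally {
            thread.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        // Use the loader of a class that lives in the plugin or bundle.
        ClassLoader pluginLoader = ContextClassLoaders.class.getClassLoader();
        ClassLoader seen = callWith(pluginLoader,
                () -> Thread.currentThread().getContextClassLoader());
        System.out.println(seen == pluginLoader);
    }
}
```

Supplier cannot throw checked exceptions, so a call that declares IOException would need to catch and rethrow it unchecked inside the lambda, or the helper could accept a java.util.concurrent.Callable instead.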

carlos-aguayo commented 1 year ago

Hi @dnmca, thank you for posting a solution to your problem. I ran into the same issue and your solution worked for me as well. I think we are doing the exact same thing. I now have a new issue: if I unload the plugin and load it again, I run into this error:

2023-03-15 21:44:30,076 [Acme Plugin Hot Deploy] ERROR com.atlassian.plugin.manager.DefaultPluginManager - There was an error loading the descriptor 'Similarity' of plugin 'com.acme'. Disabling.
com.atlassian.plugin.PluginParseException: java.lang.UnsatisfiedLinkError: Native Library /usr/local/acme/tomcat/temp/.djl.ai/tokenizers/0.13.2-0.21.0-linux-x86_64/libtokenizers.so already loaded in another classloader
    at com.atlassian.plugin.module.LegacyModuleFactory.getModuleClass(LegacyModuleFactory.java:43)

I'm curious to know whether you ran into it as well and whether you solved it. Otherwise, I'll take a look and post whatever solution I find here. Thanks!

frankfliu commented 1 year ago

You have run into the same issue as https://github.com/deepjavalibrary/djl/issues/179.

Currently this only works for the PyTorch native library. I can make it available for Hugging Face as well.

tjcarroll11 commented 1 year ago

@frankfliu I am trying to get the native helper working for Hugging Face as well. I'm finding that the helper works, but an error still gets thrown at this line: https://github.com/deepjavalibrary/djl/commit/9106f958c069e4e67fd6842ef0de8f8ace4c7bca#diff-83b8cdd89c2c087ef69b441f0f73b423e93601cca9d8feed7bb711064c239951R306

Just looking at the code, it seems like this line defeats the purpose of the native helper.