Open AlEscher opened 8 hours ago
@AlEscher
Can you use our built-in TextEmbeddingTranslator?
The following code works for me for this model:
    Criteria<String, float[]> criteria =
            Criteria.builder()
                    .setTypes(String.class, float[].class)
                    .optModelPath(path)
                    .optEngine("PyTorch")
                    .optTranslatorFactory(new TextEmbeddingTranslatorFactory())
                    .optProgress(new ProgressBar())
                    .build();
@AlEscher
Your error may be caused by setting useTokenTypes = true.
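To illustrate why that flag could matter, here is a minimal sketch in plain Java. It does not use the DJL API: the class TokenTypeInputs and its methods are hypothetical, and it assumes the converted model's forward was traced with only two inputs (input IDs and attention mask), which is typical for RoBERTa-style models such as CodeBERT. Under that assumption, enabling token types adds a third input that the traced forward does not accept.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: mimics how a translator assembles model inputs. Not DJL API. */
class TokenTypeInputs {

    /** Builds the positional inputs a translator would hand to forward(). */
    public static List<long[]> buildInputs(
            long[] inputIds, long[] attentionMask, boolean useTokenTypes) {
        List<long[]> inputs = new ArrayList<>();
        inputs.add(inputIds);
        inputs.add(attentionMask);
        if (useTokenTypes) {
            // token_type_ids: all zeros for a single-segment input
            inputs.add(new long[inputIds.length]);
        }
        return inputs;
    }

    /** Stand-in for a traced forward() that was exported with a fixed number of inputs. */
    public static void forward(List<long[]> inputs, int expectedInputs) {
        if (inputs.size() != expectedInputs) {
            throw new IllegalArgumentException(
                    "forward() expected " + expectedInputs
                            + " inputs but received " + inputs.size());
        }
    }

    public static void main(String[] args) {
        long[] ids = {0L, 31414L, 2L}; // hypothetical token IDs
        long[] mask = {1L, 1L, 1L};
        int expected = 2; // assumption: traced with input_ids and attention_mask only

        // Without token types the input count matches, so the call is accepted.
        forward(buildInputs(ids, mask, false), expected);

        // With token types a third input is added and the call is rejected.
        try {
            forward(buildInputs(ids, mask, true), expected);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the real failure is of this kind, turning the flag off in the custom translator, or switching to the built-in translator, should make the input count match again.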
Description
I am trying to convert a Huggingface model to make it compatible with DJL. My goal is to use djl-convert to convert the model and be able to load it locally. Then I want to generate code-embeddings for Java code, using e.g. Codebert. I ran
djl-convert -m microsoft/codebert-base -o models/codebert
and then used this code to import the model:
The translator is implemented like this:
When generating the embeddings, the model fails with:
What am I doing wrong? Is there a better approach to load a model from huggingface? codebert-base does not seem to be available in the Model Zoo.
Expected Behavior
The convert tool produces a model that can be loaded locally and has a working forward method.
Error Message
How to Reproduce?
See provided code above
Steps to reproduce
Use the djl-convert tool as described above.
What have you tried to solve it?
I tried many different ways of getting a model from huggingface to work locally. This approach seems to be the intended way according to https://djl.ai/extensions/tokenizers/#convert-huggingface-model-to-torchscript
Environment Info
Please run the command ./gradlew debugEnv from the root directory of DJL (if necessary, clone DJL first). It will output information about your system, environment, and installation that can help us debug your issue. Paste the output of the command below: