Open koppor opened 2 months ago
I found out that DJL supports multiple model formats. They provide this converter: https://docs.djl.ai/master/extensions/tokenizers/index.html#use-djl-huggingface-model-converter. Apparently it can convert a Hugging Face transformer model to TorchScript, ONNX Runtime, or Rust. I assume that by "huggingface transformer" model they mean safetensors. I ran the script on four models: two conversions worked and two failed, so there seem to be model-architecture-specific requirements for the conversion to work.
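To make it easier to see which models convert and which do not, the converter can be run over a list of model IDs with the outcome recorded per model. The following is only a minimal sketch: the class `ConverterBatch` is hypothetical, and the command name `djl-convert` and its `-m`, `-f`, and `-o` flags reflect my reading of the linked documentation page, so please verify them with `djl-convert --help` before relying on this. It also assumes the converter is installed and on the `PATH`.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: runs the DJL converter once per model ID and reports the outcome.
class ConverterBatch {

    // Builds the converter command line.
    // ASSUMPTION: command name and flags (-m model id, -f format, -o output dir)
    // are taken from the linked docs; check `djl-convert --help`.
    static List<String> command(String modelId, String format, String outputDir) {
        return List.of("djl-convert", "-m", modelId, "-f", format, "-o", outputDir);
    }

    // Runs one conversion and returns the process exit code (non-zero means failure).
    static int convert(String modelId, String format, String outputDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(modelId, format, outputDir))
                .inheritIO() // show the converter's own output, including stack traces
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws InterruptedException {
        // Model IDs are passed as command-line arguments; insertion order is kept.
        Map<String, String> results = new LinkedHashMap<>();
        for (String modelId : args) {
            try {
                int exit = convert(modelId, "OnnxRuntime", "model_zoo");
                results.put(modelId, exit == 0 ? "ok" : "failed (exit " + exit + ")");
            } catch (IOException e) {
                // Thrown when the converter executable cannot be started at all.
                results.put(modelId, "could not start converter: " + e.getMessage());
            }
        }
        results.forEach((id, status) -> System.out.println(id + ": " + status));
    }
}
```

Running it with the four model IDs as arguments prints one status line per model, which should help narrow down which architectures the converter rejects.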
I haven't yet figured out how to add those files to the model zoo. Their documentation is hard to understand. See also https://docs.djl.ai/master/docs/development/add_model_to_model-zoo.html
For the embedding model, we opted for Deep Java Library (DJL), which uses ONNX under the hood, because a) it is supported by the Java ecosystem and b) models in other formats can be converted to it. We accept that the model is much smaller and faster than safetensors, but possibly responds with slightly lower quality.