In some cases, dedicated libraries (e.g. fugashi and ipadic) are required for Japanese tokenizers.
Currently, these libraries are not included in the inference container.
Would it be possible to include these libraries, or to add an option to the transformers installation?
For example, if the Dockerfile could be changed like this, we could handle it:

`transformers[sentencepiece]` → `transformers[ja]`
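Concretely, the change would look something like this in the container's Dockerfile (a sketch; the exact install line in the real image is an assumption on my part, but the `ja` extra does exist and pulls in fugashi, ipadic, and related dependencies):

```dockerfile
# Hypothetical Dockerfile change — current install line assumed to be:
# RUN pip install "transformers[sentencepiece]==<version>"
# Replacing the extra with "ja" adds fugashi/ipadic for Japanese tokenizers:
RUN pip install "transformers[ja]==<version>"
```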
Currently, if we deploy from S3, we can work around this with a requirements.txt and an empty inference.py, but if we deploy from the HF Hub, there is no workaround.
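For reference, the S3 workaround looks roughly like this: the extra libraries go into a `requirements.txt` inside the model archive's `code/` directory, alongside an empty `inference.py` so the custom code path is picked up (a sketch of the layout; file names other than the two mentioned above are illustrative):

```
model.tar.gz
├── config.json
├── pytorch_model.bin
├── ...
└── code/
    ├── inference.py      # empty; present so code/requirements.txt is installed
    └── requirements.txt  # contains: fugashi, ipadic
```

When deploying directly from the HF Hub, there is no equivalent place to ship a `requirements.txt`, which is why the container itself would need the libraries.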
Thanks!