Description
I'm trying to serve an embedding model (FastText) in Triton Server using Python as its backend. The external dependencies are just the fasttext module, which in turn depends on numpy. I have created a custom execution environment as mentioned here.
The problem is that the server fails to load the model when run as a Docker container; the exact error is shown under Expected behavior below.
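For reference, the Python backend expects the model repository laid out roughly like this (the fast-text-service directory name comes from the conda-pack step below; the version directory 1/ is Triton's convention, and the exact file names here are assumptions):

```
model-repository/
└── fast-text-service/
    ├── config.pbtxt
    ├── ${CONDA_ENV_NAME}.tar.gz   # packed conda execution environment
    └── 1/
        └── model.py               # Python backend entry point
```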
Triton Information
I'm running the Triton container on an M2 chip; the image is nvcr.io/nvidia/tritonserver:24.09-pyt-python-py3.
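The container is started along these lines (a sketch: the mount path and port mappings are assumptions; --platform is included because the NGC image is built for x86_64, so on an M2 it runs under emulation):

```
docker run --rm \
  --platform=linux/amd64 \
  -p 8000:8000 -p 8001:8001 -p 8002:8002 \
  -v "$(pwd)/model-repository:/models" \
  nvcr.io/nvidia/tritonserver:24.09-pyt-python-py3 \
  tritonserver --model-repository=/models
```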
To Reproduce
```
# create and activate a clean environment, keeping user site-packages out of it
conda create -k -y -n ${CONDA_ENV_NAME} python=3.10.12
conda activate ${CONDA_ENV_NAME}
export PYTHONNOUSERSITE=True
pip3 install fasttext
# pack the environment so Triton's Python backend can use it
conda pack -o ./model-repository/fast-text-service/${CONDA_ENV_NAME}.tar.gz
```
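Optionally, the packed environment can be unpacked and probed to confirm that fasttext imports cleanly (a sketch; the /tmp path is arbitrary):

```
mkdir -p /tmp/env-check
tar -xzf ./model-repository/fast-text-service/${CONDA_ENV_NAME}.tar.gz -C /tmp/env-check
/tmp/env-check/bin/python -c "import fasttext, numpy; print(numpy.__version__)"
```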
I have created the model.py file as required by the Triton server.

model.py:
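A minimal sketch of what such a model.py can look like; the tensor names "TEXT"/"EMBEDDING" and the model.bin filename are assumptions, not the original file, but the TritonPythonModel interface and pb_utils calls are the Python backend's documented API:

```python
import os

import fasttext
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Triton passes the model directory and version in `args`;
        # the model.bin filename is an assumption.
        model_path = os.path.join(
            args["model_repository"], args["model_version"], "model.bin"
        )
        self.model = fasttext.load_model(model_path)

    def execute(self, requests):
        responses = []
        for request in requests:
            # Tensor names must match config.pbtxt; "TEXT"/"EMBEDDING"
            # are assumptions here.
            texts = pb_utils.get_input_tensor_by_name(request, "TEXT").as_numpy()
            vectors = np.stack(
                [self.model.get_sentence_vector(t.decode("utf-8"))
                 for t in texts.flatten()]
            )
            out = pb_utils.Tensor("EMBEDDING", vectors.astype(np.float32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```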
requirements.txt:
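The listing isn't shown; given that the only external dependency is fasttext (which pulls in numpy), it presumably contains just:

```
fasttext
```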
config.pbtxt:
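A minimal sketch of a config.pbtxt for this setup; the tensor names and dims are assumptions, and the tarball filename should match whatever ${CONDA_ENV_NAME}.tar.gz conda pack produced, but EXECUTION_ENV_PATH is the documented parameter for pointing the Python backend at a packed conda environment:

```
name: "fast-text-service"
backend: "python"
max_batch_size: 0

input [
  {
    name: "TEXT"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "EMBEDDING"
    data_type: TYPE_FP32
    dims: [ -1 ]
  }
]

# points the Python backend at the packed conda environment
parameters: {
  key: "EXECUTION_ENV_PATH",
  value: {string_value: "$$TRITON_MODEL_DIRECTORY/fasttext-env.tar.gz"}
}
```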
Expected behavior
The container exits with `error: creating server: Internal - failed to load all models`. Below is a segment of the log generated by the Triton container: