Closed yingmuting closed 3 years ago
@yingmuting Did you set the environment variable PYTORCH_EXTRA_LIBRARY_PATH?
export PYTORCH_EXTRA_LIBRARY_PATH=$(python -m site | grep $VIRTUAL_ENV | awk -F"'" '{print $2}')/torch_neuron/lib/libneuron_op.so
See: https://github.com/deepjavalibrary/djl-demo/tree/master/aws/inferentia
If you would rather not use an environment variable, you can set the system property PYTORCH_EXTRA_LIBRARY_PATH instead, but you need to make sure it is set before the PyTorch engine is loaded.
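A minimal sketch of a more defensive version of the export above: it checks that libneuron_op.so actually exists before the variable is set, so a wrong path fails early instead of surfacing later as an engine error. The function name neuron_lib_path and the app.jar file name are hypothetical; the -D flag is the standard JVM way to pass a system property on the command line, and whether DJL picks it up is taken from the comment above rather than verified here.

```shell
# Hypothetical helper: print the full path to libneuron_op.so under the
# given site-packages directory, or fail if the file is not there.
neuron_lib_path() {
    site_packages="$1"
    lib="$site_packages/torch_neuron/lib/libneuron_op.so"
    if [ -f "$lib" ]; then
        printf '%s\n' "$lib"
    else
        echo "libneuron_op.so not found under $site_packages" >&2
        return 1
    fi
}

# Usage (assumes an active virtualenv with torch-neuron installed):
#   SITE_PACKAGES=$(python -m site | grep "$VIRTUAL_ENV" | awk -F"'" '{print $2}')
#   export PYTORCH_EXTRA_LIBRARY_PATH=$(neuron_lib_path "$SITE_PACKAGES")
#
# Or as a JVM system property, set before the engine can be loaded
# (app.jar is a placeholder name):
#   java -DPYTORCH_EXTRA_LIBRARY_PATH="$(neuron_lib_path "$SITE_PACKAGES")" -jar app.jar
```

Passing the property on the java command line guarantees it is in place before any application code runs, which sidesteps the ordering concern mentioned above.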
Thanks. After setting ENV PYTORCH_EXTRA_LIBRARY_PATH in the Dockerfile, it works now. My steps were:
1. On inf1: pip install torchvision==0.9.1 torch-neuron==1.8.1.1.4.1.0 'neuron-cc[tensorflow]==1.4.1.0' --extra-index-url=https://pip.repos.neuron.amazonaws.com
2. Download libneuron_op.so, copy the file into the image in the Dockerfile, set ENV PYTORCH_EXTRA_LIBRARY_PATH, and build the image.
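As a rough sketch of how the image could guard against the original failure, a container entrypoint might verify the variable before starting the app. The function name check_extra_library and the app.jar file name are hypothetical and not part of DJL or the Neuron SDK; the check only confirms that the variable is set and points at a readable file.

```shell
# Hypothetical pre-flight check for a container entrypoint script.
# Returns 1 if the variable is unset or empty, 2 if the file is unreadable.
check_extra_library() {
    if [ -z "${PYTORCH_EXTRA_LIBRARY_PATH:-}" ]; then
        echo "PYTORCH_EXTRA_LIBRARY_PATH is not set" >&2
        return 1
    fi
    if [ ! -r "$PYTORCH_EXTRA_LIBRARY_PATH" ]; then
        echo "cannot read $PYTORCH_EXTRA_LIBRARY_PATH" >&2
        return 2
    fi
    return 0
}

# Usage in an entrypoint (app.jar is a placeholder name):
#   check_extra_library && exec java -jar app.jar
```

Failing at container start gives a clearer message than the "Unknown type name" exception that otherwise only appears on the first inference request.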
Description
I created an EKS cluster on inf1.2xlarge, built the DJL app into an ECR image, and created a deployment YAML. The pod has two containers: the DJL app container and a neuron-rtd container that uses the image 790709498068.dkr.ecr.us-west-2.amazonaws.com/neuron-rtd:1.0.6905.0. After deploying to EKS, the app starts successfully, but when I call the DJL inference API in a controller (named InferencePointController), I get an exception: ai.djl.engine.EngineException Unknown type name 'torch.torch.classes.neuron.Model'
Do I need to add 'torch_neuron/lib/libneuron_op.so' to the DJL app?
Error Message
ai.djl.engine.EngineException Unknown type name 'torch.torch.classes.neuron.Model'
Environment Info
DJL 0.12.0, ResNet50 on PyTorch. I have compiled the model to a Neuron model.