With AWS Neuron SDK 2.19, the compiled artifacts saved when exporting a model cannot be reloaded afterwards if the Python path is different.
This makes shared serialization and caching impossible, since you cannot control the deployment environment (EC2 with DLAMI, SageMaker, or ad-hoc end-user endpoints will all have different environments).
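The report does not say why the Python path matters. One plausible mechanism, which is an assumption and not confirmed here, is that the lookup key for a compiled artifact is derived from a serialized graph that embeds absolute source paths. The toy sketch below only illustrates that idea; `cache_key`, `toy_graph`, and the paths are hypothetical and are not part of the Neuron SDK.

```python
import hashlib


def cache_key(serialized_graph: bytes) -> str:
    """Toy lookup key: a hash over the full serialized graph, metadata included."""
    return hashlib.sha256(serialized_graph).hexdigest()


def toy_graph(site_packages: str) -> bytes:
    """Hypothetical serialized graph: the same ops plus a source-path annotation."""
    ops = "add(f32[4], f32[4]) -> f32[4]"
    metadata = f"source_file={site_packages}/model.py"  # assumed path-bearing metadata
    return f"{ops}\n{metadata}".encode()


# Two environments with identical ops but different install locations.
key_a = cache_key(toy_graph("/home/user/venv_a/lib/python3.10/site-packages"))
key_b = cache_key(toy_graph("/opt/venv_b/lib/python3.10/site-packages"))
print(key_a == key_b)  # False: the keys differ only because the paths differ
```

If something like this is happening, a key computed over the ops alone would stay stable across environments, which is the property shared caching needs.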
Steps to reproduce
Download test_tnx_llama_export.py
Export the model in a venv
Export the model again from a new venv and compare the results
You should get the following exception:
FileNotFoundError: Could not find a matching NEFF for your HLO in this directory. Ensure that the model you are trying to load is the same type and has the same parameters as the one you saved or call "save" on this model to reserialize it.
The second NEFF is always different.
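For the comparison step, one way to see which artifacts differ between the two exports is to hash every file in both output directories. This is a minimal sketch using only the standard library; `digest_tree`, `diff_exports`, and the directory names in the usage line are hypothetical, and it assumes each export was saved to its own directory.

```python
import hashlib
from pathlib import Path


def digest_tree(root):
    """Map each file's path relative to root to the SHA-256 of its content."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def diff_exports(first_dir, second_dir):
    """Return relative paths that are missing on one side or differ in content."""
    first, second = digest_tree(first_dir), digest_tree(second_dir)
    return sorted(
        name
        for name in first.keys() | second.keys()
        if first.get(name) != second.get(name)
    )
```

Calling `diff_exports("export_venv_a", "export_venv_b")` returns the relative paths whose bytes differ or that exist in only one directory. An empty list would mean the two exports are byte-identical.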