pytorch / serve

Serve, optimize and scale PyTorch models in production
https://pytorch.org/serve/
Apache License 2.0

failed to load model.py file in handler #1324

Open SeanWangJS opened 2 years ago

SeanWangJS commented 2 years ago


Context

Your Environment

Expected Behavior

The model.py file should load properly.

Current Behavior

The vgg_handler example raises RuntimeError("Missing the model.py file").

Possible Solution

In the _load_pickled_model method, set model_file to os.path.basename(model_file).
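A minimal sketch of that idea, assuming the archiver places model.py at the root of the extracted archive (which the thread does not confirm). The helper name resolve_model_file is hypothetical and not part of TorchServe. It uses ntpath.basename rather than os.path.basename so that a Windows-style manifest entry is also reduced correctly when the code runs on Linux or macOS.

```python
import ntpath
import os


def resolve_model_file(model_dir: str, model_file: str) -> str:
    """Return the path where the model file is expected after extraction.

    Hypothetical helper for illustration; it assumes the file sits at the
    root of the extracted model directory.
    """
    # ntpath.basename splits on both "\\" and "/", so an entry such as
    # ".\examples\image_classifier\vgg_16\model.py" becomes "model.py"
    # regardless of the operating system running this code.
    file_name = ntpath.basename(model_file)
    return os.path.join(model_dir, file_name)


# The modelFile value taken from the manifest in the failure log.
manifest_entry = r".\examples\image_classifier\vgg_16\model.py"
print(ntpath.basename(manifest_entry))
```

With this normalization the handler would look for model.py directly inside the model directory instead of under a nested .\examples\... path.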

Steps to Reproduce

  1. Archive the model using a relative path, for example: torch-model-archiver --model-name vgg16 --version 0.1 --model-file .\examples\image_classifier\vgg_16\model.py --serialized-file .....cache\torch\hub\checkpoints\vgg16-397923af.pth --handler .\examples\image_classifier\vgg_16\vgg_handler.py --export-path .\model_store\ -f
  2. Start TorchServe: torchserve --start --ncs --model-store model_store --models vgg16.mar --ts-config .\examples\config.properties ...

Failure Logs [if any]

2021-11-16 10:51:41,653 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - manifest: {'createdOn': '15/11/2021 21:17:23', 'runtime': 'python', 'model': {'modelName': 'vgg16', 'serializedFile': 'vgg16-397923af.pth', 'handler': 'vgg_handler.py', 'modelFile': '.\examples\image_classifier\vgg_16\model.py', 'modelVersion': '0.1'}, 'archiverVersion': '0.4.2'}
2021-11-16 10:51:41,654 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - Loading eager model
2021-11-16 10:51:41,654 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - Loading model from C:\Users\wangx\AppData\Local\Temp\models\8cc14a8fd79b420ebb1d79bf67a388e3.\examples\image_classifier\vgg_16\model.py
2021-11-16 10:51:41,655 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - Backend worker process died.
2021-11-16 10:51:41,655 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - Traceback (most recent call last):
2021-11-16 10:51:41,655 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - File "C:\Users\wangx\anaconda3\Lib\site-packages\ts\model_service_worker.py", line 191, in <module>
2021-11-16 10:51:41,655 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - worker.run_server()
2021-11-16 10:51:41,655 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - File "C:\Users\wangx\anaconda3\Lib\site-packages\ts\model_service_worker.py", line 163, in run_server
2021-11-16 10:51:41,656 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - self.handle_connection(cl_socket)
2021-11-16 10:51:41,656 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - File "C:\Users\wangx\anaconda3\Lib\site-packages\ts\model_service_worker.py", line 124, in handle_connection
2021-11-16 10:51:41,656 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - service, result, code = self.load_model(msg)
2021-11-16 10:51:41,656 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - File "C:\Users\wangx\anaconda3\Lib\site-packages\ts\model_service_worker.py", line 96, in load_model
2021-11-16 10:51:41,656 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - batch_size, envelope, limit_max_image_pixels)
2021-11-16 10:51:41,657 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - File "c:\users\wangx\anaconda3\lib\site-packages\ts\model_loader.py", line 112, in load
2021-11-16 10:51:41,657 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - initialize_fn(service.context)
2021-11-16 10:51:41,657 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - File "c:\users\wangx\anaconda3\lib\site-packages\ts\torch_handler\vision_handler.py", line 20, in initialize
2021-11-16 10:51:41,657 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - super().initialize(context)
2021-11-16 10:51:41,657 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - File "c:\users\wangx\anaconda3\lib\site-packages\ts\torch_handler\base_handler.py", line 66, in initialize
2021-11-16 10:51:41,658 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - self.model = self._load_pickled_model(model_dir, model_file, model_pt_path)
2021-11-16 10:51:41,658 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - File "C:\Users\wangx\AppData\Local\Temp\models\8cc14a8fd79b420ebb1d79bf67a388e3\vgg_handler.py", line 21, in _load_pickled_model
2021-11-16 10:51:41,658 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - raise RuntimeError("Missing the model.py file")
2021-11-16 10:51:41,658 [INFO ] W-9000-vgg16_0.1-stdout MODEL_LOG - RuntimeError: Missing the model.py file

toretak commented 2 years ago

Hi @SeanWangJS, can you try copying the model file into the model-archiver working directory and packing it like this:

torch-model-archiver --model-name vgg16 --version 0.1 --model-file model.py --serialized-file .....cache\torch\hub\checkpoints\vgg16-397923af.pth --handler .\examples\image_classifier\vgg_16\vgg_handler.py --export-path .\model_store\ -f

It seems that the archiver doesn't support a relative path for the model file. (Could you also specify your platform, e.g. Windows 10, and your environment, e.g. no Docker?) You can also inspect the .mar file directly (unpack it like a standard zip file) and check whether model.py is there. If it is, you can fix the model path in MAR-INF/MANIFEST.json.
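The inspection and manifest fix described above could be scripted roughly as follows. This is a sketch under assumptions, not a TorchServe utility: the function names list_mar_contents and rewrite_model_file_entry are hypothetical, and the manifest keys ("model", "modelFile") are taken from the manifest printed in the failure log. It writes a new archive rather than editing the original in place.

```python
import json
import ntpath
import zipfile

MANIFEST_PATH = "MAR-INF/MANIFEST.json"


def list_mar_contents(mar_path):
    """Return the names of all entries in a .mar file (a standard zip archive)."""
    with zipfile.ZipFile(mar_path) as mar:
        return mar.namelist()


def rewrite_model_file_entry(src_mar, dst_mar):
    """Copy src_mar to dst_mar, reducing the manifest's modelFile to its base name."""
    with zipfile.ZipFile(src_mar) as src, \
            zipfile.ZipFile(dst_mar, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == MANIFEST_PATH:
                manifest = json.loads(data)
                model = manifest.get("model", {})
                if "modelFile" in model:
                    # ntpath.basename handles both "\\" and "/" separators.
                    model["modelFile"] = ntpath.basename(model["modelFile"])
                data = json.dumps(manifest, indent=2).encode("utf-8")
            # Every other entry is copied through unchanged.
            dst.writestr(item, data)
```

For example, calling list_mar_contents on the archive shows whether model.py is present; if it is, rewrite_model_file_entry can produce a patched copy whose manifest points at model.py. Only the manifest is changed, so this helps only when the file itself was packed at the archive root.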