aws / sagemaker-huggingface-inference-toolkit

Apache License 2.0

Sagemaker inference not loading model weight from s3 #116

Open saichethan-a opened 7 months ago

saichethan-a commented 7 months ago

I have the following `model_fn` where I'm trying to load two `.pt` weight files with torch. SageMaker is able to load the first file, but it fails on the second. I've also tried deploying only the second file, and that fails too. The model loads fine when the weights are downloaded from source, but when I download them from source and then make it fetch from S3, it fails again. Both files are in `model_dir`. Here is my `model_fn`:

```python
def model_fn(model_dir):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print(device)
    print("Model Loading...")
    det_model = db_resnet50(pretrained=False, pretrained_backbone=False)
    det_model_path = os.path.join(model_dir, 'db_resnet50-ac60cadc.pt')
    det_params = torch.load(det_model_path, map_location=device)
    det_model.load_state_dict(det_params)
    print("loading second file")
    reco_model = crnn_vgg16_bn(pretrained=True, pretrained_backbone=False)  # failing on this step
    rec_model_path = os.path.join(model_dir, 'crnn_vgg16_bn-9762b0b0.pt')
    reco_params = torch.load(rec_model_path, map_location=device)
    reco_model.load_state_dict(reco_params)
    model = ocr_predictor(det_arch=det_model, reco_arch=reco_model, pretrained=False)
    model.to(device=device)
    print("model_loaded")
    return model
```
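One way to narrow this down is to log what actually landed in `model_dir` before calling `torch.load` — if the second `.pt` file is missing or zero bytes, the problem is in how the model archive was packaged, not in the loading code. A minimal stdlib sketch (the helper name is mine, the file names are from the snippet above):

```python
import os

def check_weights(model_dir,
                  expected=("db_resnet50-ac60cadc.pt", "crnn_vgg16_bn-9762b0b0.pt")):
    """Log model_dir contents and return any expected weight files that are missing or empty."""
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            path = os.path.join(root, name)
            print(f"{path}: {os.path.getsize(path)} bytes")
    return [n for n in expected
            if not os.path.isfile(os.path.join(model_dir, n))
            or os.path.getsize(os.path.join(model_dir, n)) == 0]
```

Calling this at the top of `model_fn` and checking that it returns an empty list would confirm whether both files really made it into `model_dir`.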
windson commented 4 months ago

The `model_dir` is a read-only directory. Are you sure the first file was actually downloaded, and did you check that it exists in `model_dir`?

Also curious what your `huggingface_model = HuggingFaceModel( ... )` invocation looks like.
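For reference, such an invocation usually looks roughly like the sketch below. Every value here is a placeholder, not something taken from this issue, and the `HuggingFaceModel` call itself is commented out because it requires AWS credentials and an execution role:

```python
# Sketch of a HuggingFaceModel deployment; all values below are placeholders.
model_kwargs = dict(
    model_data="s3://my-bucket/model/model.tar.gz",  # tarball containing the .pt files and code/
    role="arn:aws:iam::111122223333:role/SageMakerRole",  # hypothetical execution role
    transformers_version="4.26",  # pick versions matching a supported SageMaker DLC
    pytorch_version="1.13",
    py_version="py39",
    entry_point="inference.py",   # the script that defines model_fn
    source_dir="code",
)
# from sagemaker.huggingface import HuggingFaceModel
# huggingface_model = HuggingFaceModel(**model_kwargs)  # requires AWS credentials
# predictor = huggingface_model.deploy(initial_instance_count=1,
#                                      instance_type="ml.g4dn.xlarge")
```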

windson commented 4 months ago

Can you try the approach demonstrated in this notebook? Heads up: it uses async inference and does not use `HuggingFaceModel`. However, you can apply a similar approach for real-time inference and still download models from Hugging Face to deploy them to SageMaker.
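If you package `model.tar.gz` yourself, one common pitfall is nesting the weight files inside a subdirectory: SageMaker extracts the archive directly into `model_dir`, so the `.pt` files should sit at the archive root. A stdlib sketch of the packaging step (file names from the issue; the helper names and the S3 path in the comment are hypothetical):

```python
import os
import tarfile

def package_model(weight_paths, out_path="model.tar.gz"):
    """Create a SageMaker-style model.tar.gz with each weight file at the archive root."""
    with tarfile.open(out_path, "w:gz") as tar:
        for path in weight_paths:
            # arcname strips any directory prefix so the file lands directly in model_dir
            tar.add(path, arcname=os.path.basename(path))
    return out_path

def archive_members(path):
    """List the member names in the tarball, for a quick sanity check before uploading."""
    with tarfile.open(path, "r:gz") as tar:
        return sorted(tar.getnames())

# afterwards, upload and point model_data at it, e.g.:
#   aws s3 cp model.tar.gz s3://<bucket>/<prefix>/model.tar.gz
```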