saichethan-a commented 7 months ago

I have the following model_fn where I'm trying to load 2 pth files using torch sagemaker is able to load one .pth file but when coming to the second file it fails I've also tried deploying only the second .pth file but it fails.But the model seems to load when its downloaded from source but when i download it from source and make it fetch from s3 it fails again.Both the files are in model_dir Here is my model_fn def model_fn(model_dir): device = torch.device("cuda" if torch.cuda.is_available() else "cpu") print(device) print("Model Loading...") det_model = db_resnet50(pretrained=False, pretrained_backbone=False) det_model_path = os.path.join(model_dir, 'db_resnet50-ac60cadc.pt') det_params = torch.load(det_model_path, map_location=device) det_model.load_state_dict(det_params) print("loading second file") model = crnn_vgg16_bn(pretrained=True, pretrained_backbone=False)

failing on this step

rec_model_path = os.path.join(model_dir, 'crnn_vgg16_bn-9762b0b0.pt')
reco_params = torch.load(rec_model_path, map_location=device)
reco_model.load_state_dict(reco_params)
model = ocr_predictor(det_arch=det_model, reco_arch=reco_model, pretrained=False)
model.to(device=device)
print("model_loaded")
return model

windson commented 4 months ago

The model_dir is a read only directory. Are you sure you were able to download one file and checked it exists?

Also curious to know how does your huggingface_model = HuggingFaceModel( ... ) invocation looks like.

windson commented 4 months ago

Can you try the way it is demonstrated in this notebook? But heads up, it uses async inference and doesnot use HuggingFaceModel. However you can use the similar approach for realtime inference and still download models from HuggingFace to deploy them to SageMaker.

aws / sagemaker-huggingface-inference-toolkit

Sagemaker inference not loading model weight from s3 #116

failing on this step