Claudia-hue opened this issue 2 years ago
Hello, I have the same problem as Claudia-hue. Could someone send me the files of the output directory? Thank you.
Weights of the trained models for segmentation and EF prediction have already been released here.
For example, to load the weights into the EF model:
```python
import os

import torch
import torchvision

EJECTION_FRACTION_WEIGHTS_PATH = 'https://github.com/echonet/dynamic/releases/download/v1.0.0/r2plus1d_18_32_2_pretrained.pt'
weights_destination_dir = '/path/to/downloaded/weights'

model = torchvision.models.video.r2plus1d_18(pretrained=False)
model.fc = torch.nn.Linear(model.fc.in_features, 1)  # regress a single EF value
device = torch.device("cuda")
model = torch.nn.DataParallel(model)
model.to(device)

checkpoint = torch.load(os.path.join(weights_destination_dir, os.path.basename(EJECTION_FRACTION_WEIGHTS_PATH)))
model.load_state_dict(checkpoint['state_dict'])
```
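The snippet above assumes the checkpoint has already been downloaded into `weights_destination_dir`. A minimal sketch of fetching it first with the standard library (the `fetch_weights` helper is my own, not part of the repo):

```python
# Hedged sketch: download the released checkpoint before loading it
# (the helper name is hypothetical, not part of echonet).
import os
import urllib.request

EJECTION_FRACTION_WEIGHTS_PATH = (
    "https://github.com/echonet/dynamic/releases/download/v1.0.0/"
    "r2plus1d_18_32_2_pretrained.pt"
)

def fetch_weights(url: str, dest_dir: str) -> str:
    """Download `url` into `dest_dir` (skipped if the file is already
    there) and return the local path."""
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(dest):
        urllib.request.urlretrieve(url, dest)
    return dest

# e.g. checkpoint_path = fetch_weights(EJECTION_FRACTION_WEIGHTS_PATH,
#                                      "/path/to/downloaded/weights")
```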
Hi, I tried loading the pretrained model with the above script and got the following error:
```
File "dynamic/pretrained_model/download_pretrained_model.py", line 18, in <module>
  model.load_state_dict(checkpoint['state_dict'])
File "envs/EchoNetDyn/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
  raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DataParallel:
    Missing key(s) in state_dict: "module.stem.0.weight", "module.stem.1.weight", "module.stem.1.bias", "module.stem.1.running_mean", "module.stem.1.running_var", "module.stem.3.weight", [and it goes on and on]
```
I also tried running inference with the pretrained model (having put one video of the test set in the `a4c-video-dir`) by using this command:

```
echonet segmentation --data_dir=dynamic/a4c-video-dir --output=dynamic/segmented_videos --pretrained --weights=dynamic/output/segmentation/deeplabv3_resnet50_random/deeplabv3_resnet50_random.pt --save_video
```
and got a similar error:
```
RuntimeError: Error(s) in loading state_dict for DataParallel:
    Missing key(s) in state_dict: "module.aux_classifier.0.weight", "module.aux_classifier.1.weight", "module.aux_classifier.1.bias", "module.aux_classifier.1.running_mean", "module.aux_classifier.1.running_var", "module.aux_classifier.4.weight", "module.aux_classifier.4.bias".
```
Could you maybe help me with this? What am I doing wrong?
@chrilouk From the error message I guess there's a mismatch between the model architecture loaded from the torchvision module and the pretrained weights that you are trying to load into it. Can you please double check that you are loading the correct weights (the ones that correspond to the model architecture that you have loaded)?
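One generic thing worth ruling out when `load_state_dict` reports missing `module.*` keys is a `DataParallel` prefix mismatch: the model's keys and the checkpoint's keys differ only by the leading `module.` that `torch.nn.DataParallel` adds. A sketch of remapping the keys in either direction (these helpers are mine, not part of echonet):

```python
# Hedged sketch: reconcile a "module." prefix mismatch between a
# DataParallel-wrapped model and a checkpoint (helper names are mine).

def strip_module_prefix(state_dict):
    """Drop a leading 'module.' from every key that has one."""
    return {(k[len("module."):] if k.startswith("module.") else k): v
            for k, v in state_dict.items()}

def add_module_prefix(state_dict):
    """Prepend 'module.' to every key that lacks it."""
    return {(k if k.startswith("module.") else "module." + k): v
            for k, v in state_dict.items()}
```

For the segmentation case, where only `module.aux_classifier.*` keys are reported missing, `model.load_state_dict(checkpoint['state_dict'], strict=False)` is another common workaround, since with `strict=False` PyTorch skips keys absent from the checkpoint instead of raising; whether that is appropriate here depends on whether the auxiliary head is actually used at inference time.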
@GKalliatakis thank you for your fast response. I used `deeplabv3_resnet50` for the segmentation and got the same error message. Should I be using something else?

However, when I run `video` with `r2plus1d_18` as the model and `r2plus1d_18_32_2_pretrained.pt` as the model weights, I do not get this type of error. Instead I run into another error, which brings me to my second question: how can I run `segmentation` and `video` for inference only? If I do not put any training files into the data directory, I get the following error:
```
File "/lib/python3.9/site-packages/echonet/utils/video.py", line 141, in run
  mean, std = echonet.utils.get_mean_and_std(echonet.datasets.Echo(root=data_dir, split="train"))
File "/lib/python3.9/site-packages/echonet/utils/__init__.py", line 104, in get_mean_and_std
  dataloader = torch.utils.data.DataLoader(
File "/anaconda3/envs/EchoNetDyn/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 266, in __init__
  sampler = RandomSampler(dataset, generator=generator)  # type: ignore
File "/anaconda3/envs/EchoNetDyn/lib/python3.9/site-packages/torch/utils/data/sampler.py", line 103, in __init__
  raise ValueError("num_samples should be a positive integer ")
ValueError: num_samples should be a positive integer value, but got num_samples=0
```
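For context on that failure: `get_mean_and_std` iterates over the `train` split, so an empty train split yields `num_samples=0`. The statistics it derives are just per-channel pixel mean and standard deviation; a sketch of that computation under the assumption that videos are arrays of shape `(3, frames, h, w)` (this is my own reimplementation for illustration, not the echonet function, which samples through a `DataLoader`):

```python
# Hedged sketch of the per-channel statistics get_mean_and_std computes
# (function name and exact sampling strategy are mine, not echonet's).
import numpy as np

def channel_mean_std(videos):
    """Per-channel mean/std over a list of (3, frames, h, w) arrays."""
    pixels = np.concatenate(
        [np.asarray(v, dtype=np.float64).reshape(3, -1) for v in videos],
        axis=1)
    return pixels.mean(axis=1), pixels.std(axis=1)
```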
Hello. I'm trying to use EchoNet but I have computational problems (each epoch takes more than 5 hours), so would it be possible to have the output directory? That way I could use the trained network for transfer learning. Thank you!