Running 'run_all_baseline_finetuning.sh': KeyError: 'model'

L-Daniel-ee commented 2 years ago

Hi @arunmallya,

Platform: Google Colab
Environment: Python 3.6, torch==0.2.0.post3, torchvision==0.1.9, torchnet, tqdm

I have changed "raw_input()" to "input()" in networks.py. When I try to run the program, the following error occurs.

    Traceback (most recent call last):
      File "main.py", line 393, in <module>
        main()
      File "main.py", line 331, in main
        model = ckpt['model']
    KeyError: 'model'

The project's file structure is the same as in your GitHub commit; the file structure of the checkpoints folder is shown below.

The contents of the imagenet folder are shown below.

The files vgg16.pt, vgg16bn.pt, resnet50.pt, and densenet121.pt were downloaded in the same environment with the following code:

    vgg16 = torchvision.models.vgg16(pretrained=True)
    torch.save(vgg16.state_dict(), "/content/drive/MyDrive/packnet/vgg16.pt")

The other model files were downloaded in the same way.

How should I fix this error? Is there a mistake in the way I downloaded the models? Or is the problem with the networks.py file?

One more question: is there a required order in which the .sh files must be run? I am talking about the following .sh files:

- run_all_baseline_finetuning.sh
- run_all_lwf.sh
- run_all_sequence.sh

Thanks

arunmallya commented 2 years ago

Hi, what command are you trying to run? The pretrained networks should be automatically loaded inside the modified networks (e.g. https://github.com/arunmallya/packnet/blob/master/src/networks.py#L32)

L-Daniel-ee commented 2 years ago

When I run "!./src/run_all_baseline_finetuning.sh", it reports that the vgg16.pt file needed to run the program is missing from "./checkpoints/imagenet", so I downloaded the four model files in advance with the following code and placed them in the imagenet folder.

    vgg16 = torchvision.models.vgg16(pretrained=True)
    torch.save(vgg16.state_dict(), "/content/drive/MyDrive/packnet/vgg16.pt")

The four model files were downloaded with torchvision==0.1.9, but I got "KeyError: 'model'" again.
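For anyone hitting the same error: the traceback shows main.py reading ckpt['model'], so it expects the loaded checkpoint to be a dict with a 'model' entry, while state_dict() returns a mapping whose keys are parameter names. The sketch below only illustrates that mismatch. The helper describe_checkpoint is hypothetical (not part of the repo), and the stand-in dicts use placeholder values so it runs without torch.

```python
from collections import OrderedDict


def describe_checkpoint(ckpt):
    """Classify a loaded checkpoint object by the shape of its top-level keys.

    Hypothetical diagnostic helper, not part of packnet.
    """
    if not isinstance(ckpt, dict):
        return "not a dict (possibly a pickled model object)"
    if "model" in ckpt:
        return "wrapper dict with a 'model' entry"
    if ckpt and all(isinstance(k, str) and "." in k for k in ckpt):
        return "looks like a state_dict (parameter names as keys)"
    return "dict without a 'model' entry"


# Stand-in for what vgg16.state_dict() returns: parameter names mapped to
# tensors. The values are placeholders so the sketch runs without torch.
state_dict_like = OrderedDict([
    ("features.0.weight", None),
    ("features.0.bias", None),
    ("classifier.6.weight", None),
])

# Stand-in for the layout that `model = ckpt['model']` expects.
wrapper_like = {"model": object()}

print(describe_checkpoint(state_dict_like))
print(describe_checkpoint(wrapper_like))

# Indexing the state_dict-like mapping reproduces the reported error.
try:
    state_dict_like["model"]
except KeyError as err:
    print("KeyError:", err)
```

On a real file the same check would be describe_checkpoint(torch.load(path)). Note that this only diagnoses the file layout; the thread does not confirm what kind of object main.py expects under 'model', so wrapping a plain torchvision model in such a dict may not be enough on its own.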

L-Daniel-ee commented 2 years ago

There is one more question I need to bother you with. README.md says: _runall.sh when training. I understand this to mean running the following three .sh files:

- run_all_baseline_finetuning.sh
- run_all_lwf.sh
- run_all_sequence.sh

Is this wrong? Is there a required order in which these three files are run?

MagicHealer commented 2 years ago

Have you successfully replicated this repo, @L-Daniel-ee?
