Closed victorsoda closed 4 years ago
@zhenhuascut @victorsoda 'models/epo64.tar' is not provided in this branch.
Thanks, I have solved it.
Still don't see epo64.tar ... It would be great if you could upload it. Many thanks!
I still don't see it either... Could you upload it so that I can reproduce your work? Thanks a lot.
I ran into the same problem, but at epoch 51. Can you help me solve it? Thanks a lot.
Setting train/epoch to 0 in data/model/dcrnn_test_config.yaml solves the problem for me.
Setting train/epoch to 0 in data/model/dcrnn_test_config.yaml solves the problem.
Thanks! I tried it and it worked.
Hi everyone, I am sorry I have been unable to reply. @htn274 is correct: the epoch is intended to be the checkpoint from which to resume. I have since shut down the server (I am a poor student 🥼) and haven't been able to retrieve the weights at epoch 64. If you just set it to 0 and train, you will be able to reproduce the results.
I don't believe setting train/epoch to 0 is a solution for this problem. When you train the model with this code, the trained models are saved as models/epo0.tar, models/epo1.tar, etc. So if you run "run_demo" with train/epoch = 0, you are running the demo with a model trained for only the first epoch. I would therefore ask you to add the best model (epoXX.tar) to models/ for METR-LA and PEMS-BAY, so that we can test with these models.
@semink that is incorrect - the model only tries to load existing weights if epoch > 0, so setting epoch=0 will do the job, as @baosws helpfully pointed out.
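The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function name `resolve_checkpoint` and the `model_dir` parameter are hypothetical, and only the `models/epo%d.tar` naming pattern and the error message are taken from the traceback in this thread.

```python
import os
from typing import Optional


def resolve_checkpoint(epoch: int, model_dir: str = "models") -> Optional[str]:
    """Return the checkpoint path to resume from, or None to train from scratch.

    Hypothetical helper: it mirrors the behaviour described in this thread,
    where existing weights are only looked up when epoch > 0.
    """
    if epoch <= 0:
        # epoch == 0 means "no checkpoint to resume from".
        return None
    path = os.path.join(model_dir, "epo%d.tar" % epoch)
    if not os.path.exists(path):
        # Same message as the assertion reported in this issue.
        raise FileNotFoundError("Weights at epoch %d not found" % epoch)
    return path
```

With epoch set to 0 the lookup is skipped entirely, which is why the missing epo64.tar no longer triggers the error.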
I am closing this issue for now. The solution is to train the model and, once it has trained, set the correct epoch number in config.yml; that should work.
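To find out which epoch numbers are available after training, one option is to list the saved checkpoints. A small sketch, assuming checkpoints are written as models/epo<N>.tar as mentioned in this thread; `available_epochs` is a hypothetical helper, not part of the repository.

```python
import os
import re
from typing import List


def available_epochs(model_dir: str = "models") -> List[int]:
    """Return the sorted epoch numbers that have an epo<N>.tar file in model_dir."""
    pattern = re.compile(r"^epo(\d+)\.tar$")
    epochs: List[int] = []
    if not os.path.isdir(model_dir):
        # No checkpoint directory yet, so nothing has been saved.
        return epochs
    for name in os.listdir(model_dir):
        match = pattern.match(name)
        if match:
            epochs.append(int(match.group(1)))
    return sorted(epochs)
```

The last entry is the most recent checkpoint, which is not necessarily the one with the best validation error, so the epoch to put in the config still has to be chosen from the training log.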
@chnsh this didn't work for me. I still get the same error
@chnsh actually, I ended up making the change in data/model/pretrained/METR-LA/config.yaml and that did it.
I ran run_demo_pytorch.py using the command:
python run_demo_pytorch.py --config_filename=data/model/pretrained/METR-LA/config.yaml
This is what I got:
Traceback (most recent call last):
  File "run_demo_pytorch.py", line 33, in <module>
    run_dcrnn(args)
  File "run_demo_pytorch.py", line 18, in run_dcrnn
    supervisor = DCRNNSupervisor(adj_mx=adj_mx, **supervisor_config)
  File "/home/cyd/DCRNN_PyTorch/model/pytorch/dcrnn_supervisor.py", line 50, in __init__
    self.load_model()
  File "/home/cyd/DCRNN_PyTorch/model/pytorch/dcrnn_supervisor.py", line 93, in load_model
    assert os.path.exists('models/epo%d.tar' % self._epoch_num), 'Weights at epoch %d not found' % self._epoch_num
AssertionError: Weights at epoch 64 not found
Could you please upload 'models/epo64.tar' to the repo? I hope to reproduce the MAE results reported in the README. Thanks!