Closed JohnsonQi closed 4 years ago
Hi @JohnsonQi,
Glad to hear that you're using the GQ-CNN package, and apologies for the trouble.
The training should definitely take longer than that. Can you share models/GQCNN-2.0/training.log?
Thanks, Vishal
Hi @visatish ,
Thanks for your reply! Here is my training log; I can't figure out what is wrong. I set "train_pct"=0.8 and "total_pct"=1. training.log
Kind regards, Johnson
Hi @JohnsonQi,
I noticed that you're having the same issue as https://github.com/BerkeleyAutomation/gqcnn/issues/99, which was resolved over email. It turned out that the benchmark we provided was actually trained for 50 epochs instead of the default 25. I will push a fix for that shortly.
It does seem like you are training on the entire dataset (26283 steps * 64 samples/step (bsz) * 1.25 (to account for the training split) = 2102640 samples). Can you try training for 50 epochs? I'm not sure where you got 5 epochs from, unless you manually lowered it.
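For reference, the sample-count estimate above can be reproduced with a short sketch. The function and variable names here are hypothetical and not part of the GQ-CNN package; the only inputs are the step count, batch size, and training split quoted in this thread, and the 1.25 factor is assumed to be 1 / train_pct with train_pct = 0.8.

```python
def estimate_total_samples(steps, batch_size, train_pct):
    """Estimate the full dataset size from the number of training steps.

    steps * batch_size gives the samples seen in the training split;
    dividing by train_pct scales that up to the whole dataset.
    """
    train_samples = steps * batch_size
    return round(train_samples / train_pct)


# Values quoted in this thread: 26283 steps, batch size 64, train_pct 0.8.
total = estimate_total_samples(steps=26283, batch_size=64, train_pct=0.8)
print(total)
```

If the printed number is close to the size of the dataset you downloaded, the run is covering the full dataset, and the remaining difference from the benchmark is likely the number of epochs.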
In the meantime, I will try to replicate the result again on my end, although I did replicate it earlier this year for the other issue.
Thanks, Vishal
System information
Describe the result you are trying to replicate (https://berkeleyautomation.github.io/gqcnn/index.html). I used train_dex-net_2.0.yaml to train the gqcnn, but I didn't get the expected results. It's really strange that the training only took 30 minutes for 5 epochs on a full dex-net 2.0 dataset you provided. (https://berkeley.app.box.com/s/6mnb2bzi5zfa7qpwyn7uq5atb7vbztng/folder/25803680060)
How can I fix this problem?