Closed — XIAO1HAI closed this issue 1 year ago
Hello
How many 2080 Tis are you using? In my experience, it took three days with two Titan XPs on Pascal-5i, as stated in README.md. Therefore, I feel seven days on (presumably) a single 2080 Ti is reasonably comparable with my experimental setting. I have observed a trade-off: correlation-based methods generalize better (i.e., show strong performance on unseen classes) but converge more slowly, presumably due to the high-dimensional input complexity. I hope this helps. Thank you!
Best, Dahyun
I was using two 2080 Ti graphics cards on the Pascal-5i dataset with the batch size set to 12, and training for 500 epochs took about 6-7 days, so I was wondering whether some setup is missing. I also tried some of PyTorch Lightning's acceleration techniques (accumulate_grad_batches, precision=16, etc.), which helped a little, but not noticeably. I would therefore like to ask you for some suggestions.
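For reference, this is roughly the kind of Trainer configuration I mean — a minimal sketch, not the repo's actual training script, and the accumulation factor of 2 is just an example value:

```python
import pytorch_lightning as pl

# Hedged sketch of the Lightning acceleration options mentioned above.
# These are standard Trainer arguments; the concrete values are illustrative only.
trainer = pl.Trainer(
    max_epochs=500,
    precision=16,                # mixed-precision training
    accumulate_grad_batches=2,   # accumulate gradients over 2 mini-batches
)
# trainer.fit(model, datamodule) is then called with the repo's model as usual.
```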
Hello again
I uploaded the codebase as-is and didn't apply any computing acceleration techniques. What I would suggest is to match your PyTorch and PyTorch Lightning versions to mine using the provided environment.yml. The codebase isn't missing anything, so I cannot give more concrete suggestions.
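For example, you can quickly print the installed versions and compare them against the pins in environment.yml — a minimal check, not part of the codebase:

```python
# Print the installed library versions to compare with environment.yml.
import torch
import pytorch_lightning as pl

print("torch:", torch.__version__)
print("pytorch-lightning:", pl.__version__)
```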
Best regards, Dahyun
OK, thanks again for the answer.
Best regards, and I wish you all the best in your scientific research! XIAOHAI
Thank you, all the best to you too!
P.S. If you find no further room for improvement in hardware acceleration, another workaround is to skip some of the validation phases, which occur every training epoch in this implementation.
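One way to do this with PyTorch Lightning is the Trainer's check_val_every_n_epoch argument — a hedged sketch below; the value 5 is an arbitrary example, not a recommended setting:

```python
import pytorch_lightning as pl

# Run validation only every 5 training epochs instead of every epoch,
# which reduces the per-epoch overhead of the validation phase.
trainer = pl.Trainer(
    max_epochs=500,
    check_val_every_n_epoch=5,
)
```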
OK, I'll think about it.
Hello, sorry to bother you again.
Recently I have been using 2080 Ti GPUs to reproduce the paper's results, but I found that training is very slow, taking about seven days. What could be the reason? I hope you can give me some advice.