Open avijit9 opened 4 years ago
How can I exactly reproduce the results reported in the paper? The learning rate and the number of epochs in the config files seem to differ from those given in the paper. Please help.
Thanks.
I have a similar result here. Did you solve it?
@ruiyan1995 no. The performance fluctuates from one run to another even when the seed is fixed :(
I figured out the problem. The RoIAlign layer uses an atomicAdd operation during the backward pass, which is non-deterministic in PyTorch.
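To illustrate why accumulation order matters here: floating-point addition is not associative, so when GPU threads add into the same memory location with atomicAdd in an order that is not fixed, the summed gradient can differ slightly between runs. The sketch below is plain Python, not the actual CUDA kernel; the `accumulate` helper and the sample values are made up for illustration.

```python
def accumulate(values, order):
    """Sum `values` in the given index order, as one accumulator would."""
    total = 0.0
    for i in order:
        total += values[i]
    return total


# The same four contributions, added in two different orders.
values = [1e16, 1.0, -1e16, 1.0]

in_order = accumulate(values, [0, 1, 2, 3])   # 1.0 is absorbed by 1e16 first
reordered = accumulate(values, [0, 2, 1, 3])  # large terms cancel first

print(in_order, reordered)  # 1.0 2.0
```

In real training the differences are far smaller per step, but they compound over many updates, which is consistent with results varying between runs despite a fixed seed.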
Can you explain this in more detail? And how can it be solved?
Some operations in PyTorch cannot be made deterministic (at least for now) because of the CUDA functions these layers use. You can refer to this page: https://pytorch.org/docs/stable/notes/randomness.html
I don't think it can be fixed. The dataset is also small, so the variance in performance is high.
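For reference, a minimal sketch of the usual seeding setup described on that page. The `seed_everything` name is hypothetical and not part of this repository, and the `torch` import is guarded so the snippet also runs where PyTorch is not installed. It reduces run-to-run variation but does not make atomicAdd-based backward passes deterministic.

```python
import random


def seed_everything(seed):
    """Seed Python and (if available) PyTorch RNGs; return True if torch was configured."""
    random.seed(seed)
    try:
        import torch
    except ImportError:
        return False  # PyTorch not installed; only Python's RNG was seeded

    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
    # Ask cuDNN for deterministic kernels and disable its autotuner.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
    # Available in newer PyTorch releases: ops without a deterministic
    # implementation then raise an error instead of silently varying.
    if hasattr(torch, "use_deterministic_algorithms"):
        torch.use_deterministic_algorithms(True)
    return True


seed_everything(0)
```

Calling this once at the start of the training script is the common pattern. If an error is then raised for a particular layer, that at least identifies which operation is the non-deterministic one.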
Thank you so much. But I guess it is easy for models to overfit on CAD because of its small size. I have seen a similar case with some two-stage methods that do not use the RoIAlign layer.
Maybe those models use some other kind of non-deterministic layer. Can you provide some examples?
@wjchaoGit can you please help me out here? For CAD, I am getting 88.81% accuracy, whereas the reported performance is 91%.
The Collective Activity Dataset is small, and the diversity of its training samples is poor, so it is easy to overfit. I am afraid the performance may fluctuate because of the non-determinism of the deep learning architecture.
Hi, besides CAD, have you reproduced the results on the Volleyball Dataset?