Hi, thanks for informing me. For reproducibility, there are a few points to note:

1. It's essential to run each dataset separately. I recently noticed that running them consecutively (all at once) in PyTorch affects the results due to initialization issues; I achieved better results by running each dataset individually.
2. The datasets mentioned are generally small, which leads to considerable variance in the results. To address this, you can average the results over 5 runs (see the sketch below).
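A minimal sketch of both points, assuming a hypothetical `train_and_eval(dataset_name, seed)` helper that runs one full training/evaluation pass and returns test accuracy (this helper is illustrative, not a function from the repo):

```python
import random
import statistics

import numpy as np
import torch


def set_seed(seed: int) -> None:
    # Re-seed everything so each run starts from a fresh, known state,
    # instead of inheriting RNG state from a previous dataset's run.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)


def run_dataset(dataset_name: str, n_runs: int = 5) -> float:
    # Point 1: one dataset per invocation (ideally one per process).
    # Point 2: average over several seeds to tame the variance that
    # small UEA datasets show from run to run.
    accs = []
    for seed in range(n_runs):
        set_seed(seed)
        accs.append(train_and_eval(dataset_name, seed))  # hypothetical helper
    return statistics.mean(accs)
```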
Regarding the memory concerns, these arose with only a few datasets. To manage them, I used memory-saving techniques such as `memmap` or a smaller batch size.
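As one way to apply the `memmap` idea, here is a minimal sketch assuming the dataset has been exported to a raw float32 file; the file name and shape are illustrative and not part of the repo:

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, Dataset


class MemmapSeries(Dataset):
    def __init__(self, path: str, shape: tuple):
        # np.memmap keeps the array on disk and pages samples in lazily,
        # so host RAM only ever holds the batches currently in use.
        self.data = np.memmap(path, dtype=np.float32, mode="r", shape=shape)

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        # Copy the lazily-loaded slice into a regular (writable) array
        # before handing it to torch.
        return torch.from_numpy(np.array(self.data[idx]))


# Example usage (path and shape are illustrative); a smaller batch_size
# also shrinks the peak activation footprint on the GPU:
# loader = DataLoader(MemmapSeries("train_x.dat", (n_samples, n_channels, seq_len)),
#                     batch_size=4, shuffle=True)
```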
Hi, the tAPE and eRPE position embedding methods proposed in ConvTran are great and novel, and I have run several experiments with them, but I met a problem during reproduction. The details are as follows:

Hardware env: i7-12700K, RTX 3060 12G
Software env: Python 3.8, PyTorch 2.0.1 + CUDA 11.7; other libraries satisfy requirements.txt
I used the code with no modification to produce the results on the UEA datasets; the comparison with the results reported in the paper is shown below:

(1) According to the figure above, there are gaps on some datasets, which are marked in red ("OOM" means out of CUDA memory and can be ignored here);
(2) I noticed that the embedding size was set to 64 in the paper rather than the 16 used in the code;
(3) Referring to an earlier issue (https://github.com/Navidfoumani/ConvTran/issues/5), if the default setting in the code is used, about 190 GB of CUDA memory would be needed, which far exceeds the capacity of the single A5000 (24 GB) used in the paper.
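As a rough sanity check on a figure of that magnitude, self-attention materialises O(B · H · L · L) score entries, so long-sequence UEA datasets dominate memory. The batch size and head count below are assumptions for illustration, not the repo's defaults:

```python
def attention_scores_gib(batch: int, heads: int, seq_len: int,
                         bytes_per_el: int = 4) -> float:
    # One (L x L) float32 score matrix per head per sample.
    return batch * heads * seq_len * seq_len * bytes_per_el / 1024**3


# e.g. a long UEA series such as EigenWorms (length 17,984)
# at an assumed batch of 16 with 8 heads:
print(attention_scores_gib(16, 8, 17984))  # ~154 GiB for the scores alone
```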
So, I wonder whether the reproduction gap is because each UEA dataset has its own experimental hyperparameter setting in practice, rather than all datasets sharing the default setting in the code, or whether there are other reasons. Looking forward to your reply!