Closed by bunnyveil 5 months ago
Thank you for your interest in our work!
The code currently open-sourced in the repository matches the current version of the paper. Our models are all trained and tested on a single RTX 4090, so results may differ on other hardware platforms. Please also refer to this issue and the explanation here.
Feel free to evaluate our open-sourced weights for ModelNet40.
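As a side note on why numbers can shift between hardware platforms: one plausible contributor (an assumption here, not something confirmed for this repository) is that floating-point addition is not associative, so a different reduction order on a different GPU can change results slightly. A minimal pure-Python sketch of the effect, with the helper name `sum_in_order` made up for illustration:

```python
def sum_in_order(values):
    """Accumulate floats strictly left to right."""
    total = 0.0
    for v in values:
        total += v
    return total

values = [0.1, 0.2, 0.3]

# Same numbers, two accumulation orders.
forward = sum_in_order(values)
backward = sum_in_order(list(reversed(values)))

# The two sums agree to many digits but are not bit-identical.
print(forward == backward)
print(abs(forward - backward))
```

Differences of this size are tiny per operation, but they can accumulate over many training steps and may end up as small gaps in final accuracy.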
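May I ask whether the weights you provide can be tested directly? The test accuracy I get in PointMamba is only 7%~8%, while the weight files provided for Point-MAE, tested the same way in Point-MAE, reach about 93.5%.
Also, when I train, I get NaN. With the same data, training works normally in MAE. Can the hyperparameters in the code be used directly, or do they need to be adjusted specifically?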
May I ask whether the weights you provide can be tested directly? The test accuracy I get in PointMamba is only 7%~8%, while the weight files provided for Point-MAE, tested the same way in Point-MAE, reach about 93.5%.
I just re-cloned the repository, downloaded the open-sourced checkpoint, and achieved the exact same results as reported. The test command should be:
CUDA_VISIBLE_DEVICES=0 python main.py --test --config cfgs/finetune_modelnet.yaml --ckpts modelnet_scratch.pth
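Also, when I train, I get NaN. With the same data, training works normally in MAE. Can the hyperparameters in the code be used directly, or do they need to be adjusted specifically?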
I'm sorry, we never encountered an issue with NaN.
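For anyone else chasing a NaN during training, a first step is usually to find the exact step where the loss stops being finite. The sketch below is a generic, framework-free illustration and not code from this repository; the function name `first_non_finite` and the loss values are hypothetical.

```python
import math

def first_non_finite(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, value in enumerate(losses):
        if not math.isfinite(value):
            return step
    return None

# Hypothetical per-step loss values, for illustration only.
example_losses = [2.31, 1.87, float("nan"), 1.50]

bad_step = first_non_finite(example_losses)
if bad_step is not None:
    print(f"loss became non-finite at step {bad_step}")
```

In a PyTorch training loop the values would typically come from `loss.item()`. Once the failing step is known, `torch.autograd.set_detect_anomaly(True)` can help locate the operation that produced the NaN, at the cost of slower training.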
Thank you very much for your answer~
I am closing this issue. Please feel free to reopen it if necessary.
@formerlya Hello, may I ask how you solved the NaN problem?
I first ran the ModelNet40 training-from-scratch classification task in PointMamba on four 3080 Ti GPUs, then used your pretrain.pth file to run ModelNet40 classification from the pre-trained weights. Everything followed the parameters and steps described in the paper, but the final classification accuracy is only 93.0713%, while the best classification accuracy mentioned in the paper is 93.6%. We ran it many times (pic 1 with voting; pics 2, 3, and 4 without voting). I have also tested other classification and segmentation tasks, and the results obtained with the paper's parameters are about 1-4 percentage points lower than those in the paper. I wonder whether this indicates a problem, or whether the model parameters in the paper have not been kept up to date?
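To put the ModelNet40 gap in concrete terms, the two percentages can be converted into counts of correctly classified test shapes. The sketch below assumes the standard ModelNet40 test split of 2,468 shapes, which is not stated in this thread; the helper name `correct_count` is made up for illustration.

```python
# Assumption: the standard ModelNet40 test split contains 2,468 shapes.
TEST_SET_SIZE = 2468

def correct_count(accuracy_percent, n=TEST_SET_SIZE):
    """Convert an accuracy percentage into the nearest whole number of correct samples."""
    return round(accuracy_percent / 100 * n)

observed = correct_count(93.0713)  # accuracy reported in this comment
reported = correct_count(93.6)     # best accuracy quoted from the paper

print(observed, reported, reported - observed)  # 2297 2310 13
```

Under that assumption, the difference between 93.0713% and 93.6% is about 13 test shapes, which gives a sense of scale when comparing against run-to-run or hardware variation.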