jertubiana / ScanNet

Apache License 2.0
115 stars · 28 forks

Unable to reproduce the reported AUC score #9

Closed mooerccx closed 4 months ago

mooerccx commented 5 months ago

Dear authors,

First, I would like to express my appreciation for your work and for making the code publicly available on GitHub. However, I have encountered an issue while trying to reproduce the results reported in your paper. Following the provided instructions, I deployed ScanNet with pip in a fresh conda environment, did not use MSA, and kept the default settings. Despite my efforts, running `python train.py` only achieved an AUC of 0.52, significantly lower than the 0.639 reported in your paper. I am struggling to understand the reason behind this large discrepancy in performance, and I would greatly appreciate guidance on the following:

- Are there any additional steps or configurations required to achieve the reported AUC score?
- Could you please clarify whether the reported AUC of 0.639 was obtained using MSA or any other specific settings?

I would be grateful for any insights or suggestions you could offer to help me reproduce the results and improve my understanding of your work. Thank you in advance for your time and assistance. Best regards,
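(For anyone comparing AUC numbers across environments: it can help to rule out the metric computation itself as a source of discrepancy. Below is a minimal, dependency-free sketch of rank-based ROC AUC; the function name and inputs are illustrative and not part of ScanNet's code, which may compute AUC differently.)

```python
def roc_auc(labels, scores):
    """Rank-based ROC AUC (Mann-Whitney U statistic), stdlib only.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of predicted scores (higher = more positive).
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the tie group [i, j) of equal scores and average its ranks.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of 1-based ranks i+1 .. j
        for k in range(i, j):
            if pairs[k][1] == 1:
                rank_sum_pos += avg_rank
        i = j
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```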

jertubiana commented 5 months ago

Thank you for your interest in our research. A quick verification: did you set the variable `check` to False in the train.py script?
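(For context: a flag of this kind typically gates a fast sanity run on a small data subset rather than full training. The sketch below illustrates the general pattern only; the names, sizes, and behavior are assumptions, not ScanNet's actual `train.py` implementation.)

```python
# Hypothetical illustration of a debug-mode flag like the "check"
# variable mentioned above. Nothing here is taken from ScanNet's code.
check = False  # True = quick sanity run on a tiny subset; False = full training


def load_dataset(check=False, subset_size=64):
    """Return the full dataset, or a small subset when check=True."""
    data = list(range(1000))  # stand-in for the real training examples
    return data[:subset_size] if check else data


train_data = load_dataset(check=check)
print(len(train_data))  # prints 1000 (full dataset, since check is False)
```

A model trained with such a flag left at True would see only a tiny fraction of the data, which would explain an AUC near chance level.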


mooerccx commented 5 months ago


Thank you very much for your guidance. I was able to achieve an AUC of 0.673 by setting train=False in train.py. However, I have found that I cannot reproduce this result by retraining with your code, no matter how I adjust the hyperparameters. This is very confusing to me. Could you please tell me which parameters need to be modified?

mooerccx commented 4 months ago

Over the past few weeks, I have continued to experiment with numerous parameter combinations, but I have been unable to achieve the level of performance reported for your model, only managing to obtain an AUC of 0.5. Additionally, I have observed that other studies, such as the one on GPSite, report significantly lower performance for ScanNet compared to what you described.

To ensure academic integrity, I hope you can provide proof that your final model was not trained using the test set data. Any additional details on the training and testing processes or other supporting materials would be greatly appreciated.

Thank you for your understanding and cooperation.

jertubiana commented 4 months ago

mooerccx, accusing someone of lacking scientific integrity, anonymously, is not the right way to ask for help. Comparing exact AUCPR values with those reported in other articles is meaningless, since the datasets are different. The results were reproduced multiple times using this exact code, with the Python distribution and package versions described in the README. I am sorry to hear that you cannot reproduce them locally, but if you do not even bother sending me your exact local package versions before formulating these accusations, you are not acting in good faith.
