QIFEIDKN / STAGATE

Adaptive Graph Attention Auto-encoder for Spatial Domain Identification of Spatial Transcriptomics
MIT License

The `151676` cluster result in the paper cannot be reproduced! My reproduction result is `0.51` with the tf version (after tuning) and `0.29` with the pytorch version. #10

Closed forechoandlook closed 2 years ago

forechoandlook commented 2 years ago

The reproduction notebook is https://colab.research.google.com/drive/1BmH34x0e8hGo5SD57xet0tJLxfZLL1S-. We used the repo code and tried our best to get a good result, but we still cannot reproduce the paper's ARI of 0.60 on 151676. Is there any problem with the repo? Also, the gap between the tf repo and the pyG repo results is too large. Is there an error in the pyG repo?

QIFEIDKN commented 2 years ago

Thanks for your interest in our approach. For the first question: limited by the TensorFlow 1 framework, the performance of STAGATE is affected by randomness. We discuss the effect of randomness on clustering accuracy in Fig. S2e of the paper.

Briefly, for the 151676 sample, the ARI score may be 0.6, 0.58, 0.51, or 0.42 depending on the random seed, and results can differ between runs even when using the same random seed.
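For readers trying to pin down this variability, a minimal sketch of seed fixing (illustrative only, not the repo's code; as noted above, TF1 GPU ops can remain non-deterministic even with fixed seeds):

```python
# Illustrative seed pinning before a run. Note that with TF1 on GPU,
# some ops stay non-deterministic, so results can still vary run to run.
import random
import numpy as np

def set_seeds(seed=0):
    random.seed(seed)
    np.random.seed(seed)

set_seeds(0)
first = np.random.rand(3)   # same values every time set_seeds(0) is called
set_seeds(0)
second = np.random.rand(3)  # identical to `first`
```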


QIFEIDKN commented 2 years ago

For the second question, the main difference between STAGATE_tf and STAGATE_pyG is the non-linear function used in the attention weight calculation: a sigmoid function in the TensorFlow version and a LeakyReLU function in the pyG version. We found that, in most cases, the performance of STAGATE_pyG is similar to (and sometimes even better than) the STAGATE_tf version. For the DLPFC dataset, the clustering accuracy of STAGATE_pyG is slightly better than STAGATE_tf on average across the 12 sections.
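To make the difference concrete, here is a minimal NumPy sketch of an attention-weight computation with either non-linearity applied to the raw logits before softmax normalization (variable and function names are illustrative, not taken from either repo):

```python
# Sketch of the activation difference: sigmoid (tf-style) vs
# LeakyReLU (pyG-style) applied to attention logits before softmax.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def leaky_relu(x, negative_slope=0.2):
    return np.where(x > 0, x, negative_slope * x)

def attention_weights(logits, nonlinearity):
    e = nonlinearity(logits)   # element-wise non-linearity
    e = e - e.max()            # numerical stability for softmax
    w = np.exp(e)
    return w / w.sum()         # softmax over a node's neighbors

logits = np.array([1.0, -0.5, 2.0])   # toy logits for one node's neighbors
w_tf = attention_weights(logits, sigmoid)      # sigmoid squashes to (0, 1)
w_pyg = attention_weights(logits, leaky_relu)  # LeakyReLU keeps magnitudes
```

Because the sigmoid compresses all logits into (0, 1), the resulting weights are flatter than with LeakyReLU, which can change how sharply the model attends to individual neighbors.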

However, we did find that the performance of STAGATE_pyG is poor on the 151676 section, so we provide a sigmoid version of STAGATE_pyG (https://github.com/QIFEIDKN/STAGATE_pyG-Sigmoid). With this version we get ARI = 0.6.

forechoandlook commented 2 years ago

@QIFEIDKN Thank you for your reply! We tested the new repo and got the expected performance (0.60 on 151676). We also tested STAGATE on other datasets and got good performance in all cases. For convenience, we are sharing the reproduction Jupyter notebook on Colab for other users: https://colab.research.google.com/drive/1YjiPag6P1Hf9-_Vd421if5dshkUrdkEG?usp=sharing

threebanli commented 1 year ago

> @QIFEIDKN Thank you for your reply! We tested the new repo and got the expected performance (0.60 on 151676). We also tested STAGATE on other datasets and got good performance in all cases. For convenience, we are sharing the reproduction Jupyter notebook on Colab for other users: https://colab.research.google.com/drive/1YjiPag6P1Hf9-_Vd421if5dshkUrdkEG?usp=sharing

Hello, I ran the code you shared on Colab, but even running with a GPU under a Colab Pro subscription, I can't complete the full 1000 training iterations (CUDA memory overflow at around 800 iterations). Did you modify the training file, and in which running mode (CPU/GPU/TPU) did you get your results? Thank you for reading this question. @forechoandlook
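Not speaking for the notebook authors, but one common workaround when the GPU runs out of memory partway through training is to catch the OOM error and fall back to CPU. A rough PyTorch sketch (`train_one_epoch` is a hypothetical placeholder for whatever the notebook does each epoch, not code from the repo):

```python
# Rough sketch, assuming PyTorch. `train_one_epoch` is a hypothetical
# stand-in for the notebook's per-epoch training step.
import torch

def train_one_epoch(model, data):
    # placeholder: forward/backward/optimizer step would go here
    model(data)

def train_with_cpu_fallback(model, data, n_epochs=1000):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, data = model.to(device), data.to(device)
    for _ in range(n_epochs):
        try:
            train_one_epoch(model, data)
        except RuntimeError as err:
            if "out of memory" not in str(err):
                raise
            # free cached GPU blocks, then retry this epoch on CPU
            torch.cuda.empty_cache()
            model, data = model.to("cpu"), data.to("cpu")
            train_one_epoch(model, data)
    return model
```

Reducing the number of epochs or the model's hidden dimensions is the other obvious lever if CPU training turns out to be too slow.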