liaoherui / GDmicro

GDmicro - Use GCN and Deep adaptation network to classify host disease status based on human gut microbiome data.
https://academic.oup.com/bioinformatics/article/39/12/btad747/7470738
MIT License

Evaluate with test sample #4

Open HyunSBong opened 2 weeks ago

HyunSBong commented 2 weeks ago

Hi. GDmicro is really impressive; thank you for your remarkable work. I want to check its performance on my final test data. Is it correct to set the class column of my final test data to "test" and run the example described in section 1.1 of the README? This is my first time working with domain adaptation, so I want to be careful: could this cause data leakage?

liaoherui commented 2 weeks ago

Hi, thanks for using GDmicro!

Yes, you can use the tool like that. Note that we have recently changed some code and functions to make the tool more useful for identifying microbial biomarkers than for classification. As a result, classification performance may be affected (it may decrease), but the biomarker discovery function should work better. Domain adaptation uses the test data only to learn the latent features (no label info is required), so it will not cause data leakage.
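To illustrate why unlabeled test data does not leak labels, here is a minimal sketch. It is not GDmicro's actual code. It assumes a simple linear-kernel MMD (the squared distance between domain mean feature vectors) as a stand-in for the discrepancy term used in deep adaptation networks. The names `mean_vector`, `linear_mmd2`, `train_features`, `train_labels`, and `test_features` are hypothetical, and the numbers are toy values.

```python
from typing import List, Sequence


def mean_vector(rows: Sequence[Sequence[float]]) -> List[float]:
    """Column-wise mean of a non-empty list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def linear_mmd2(source: Sequence[Sequence[float]],
                target: Sequence[Sequence[float]]) -> float:
    """Squared distance between the mean feature vectors of two domains.

    Only feature values are used; no labels enter this computation.
    """
    mu_s = mean_vector(source)
    mu_t = mean_vector(target)
    return sum((a - b) ** 2 for a, b in zip(mu_s, mu_t))


# Toy latent features (hypothetical values, two features per sample).
train_features = [[0.0, 1.0], [2.0, 3.0]]
train_labels = [0, 1]  # would be used only by a classification loss, not here
test_features = [[1.0, 1.0], [3.0, 5.0]]  # test labels are never supplied

# The alignment term depends on the train and test features alone.
discrepancy = linear_mmd2(train_features, test_features)
print(discrepancy)
```

In this sketch the test samples influence training only through their feature distribution, which is the sense in which marking them as "test" without labels avoids label leakage.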

HyunSBong commented 2 weeks ago

Thanks!