OSU-BMBL / scDEAL

Deep Transfer Learning of Drug Sensitivity by Integrating Bulk and Single-cell RNA-seq data
Apache License 2.0

Retesting the paper's data still gives poor results #9

Open SZ-qing opened 1 year ago

SZ-qing commented 1 year ago

Hello. Based on your latest code, data, and the parameters you provided, I re-tested the results from the paper in the environment you provided, and the results are still very poor. For Data6 (GSE110894), the AUC is 0.91, which is fine. For Data5 (GSE149383), the AUC and F1 score are shown in the attached screenshots. For Data3 (GSE112274), the AUC and F1 score are shown in the attached screenshots.

In your paper, the model's performance on the datasets above is very strong ([screenshot]).

This makes it hard for me to believe the model is robust.

SZ-qing commented 1 year ago

@juychen @OSU-BMBL-admin

SZ-qing commented 1 year ago

The resulting adata file name is shown in the attached screenshot.

SZ-qing commented 1 year ago

For Data4 (GSE140440), the AUC and F1 score are shown in the attached screenshots.

juychen commented 1 year ago

Did you try removing the previous folder entirely, downloading the latest scDEAL, and running it again?

SZ-qing commented 1 year ago

Yes, I created a new directory and did everything according to the latest version.

SZ-qing commented 1 year ago

By the way, did you evaluate the results under the latest version?

juychen commented 1 year ago

We have tested the code on two different computing clusters. What are the results of the other two datasets?
SZ-qing commented 1 year ago

I have tested four of the six datasets; the remaining two have not been tested yet. Can you provide the performance of your test results, such as AUC and F1 scores?
juychen commented 1 year ago

Did you follow the instructions that begin with loading from the checkpoints? That mode generated all the scores we reported.

SZ-qing commented 1 year ago

Let me upload the shell scripts that I tested.

For Data4 (GSE140440):

```shell
python bulkmodel.py --drug "DOCETAXEL" --dimreduce "DAE" --encoder_h_dims "256,128" --predictor_h_dims "256,128" --bottleneck 512 --data_name "GSE140440" --sampling "upsampling" --dropout 0.1 --lr 0.01 --printgene "F" -mod "new" --checkpoint "False"
python scmodel.py --sc_data "GSE140440" --dimreduce "DAE" --drug "DOCETAXEL" --bulk_h_dims "256,128" --bottleneck 512 --predictor_h_dims "256,128" --dropout 0.1 --printgene "F" -mod "new" --lr 0.01 --sampling "upsampling" --printgene "F" -mod "new" --checkpoint "False"
```

For Data5 (GSE149383):

```shell
python bulkmodel.py --drug "ERLOTINIB" --dimreduce "DAE" --encoder_h_dims "512,256" --predictor_h_dims "256,128" --bottleneck 64 --data_name "GSE149383" --sampling "upsampling" --dropout 0.3 --lr 0.01 --printgene "F" -mod "new" --checkpoint "False"
python scmodel.py --sc_data "GSE149383" --dimreduce "DAE" --drug "ERLOTINIB" --bulk_h_dims "512,256" --bottleneck 64 --predictor_h_dims "256,128" --dropout 0.3 --printgene "F" -mod "new" --lr 0.01 --sampling "upsampling" --printgene "F" -mod "new" --checkpoint "False"
```

For Data3 (GSE112274):

```shell
python bulkmodel.py --drug "GEFITINIB" --dimreduce "DAE" --encoder_h_dims "512,256" --predictor_h_dims "256,128" --bottleneck 256 --data_name "GSE112274" --sampling "no" --dropout 0.1 --lr 0.5 --printgene "F" -mod "new" --checkpoint "False"
python scmodel.py --sc_data "GSE112274" --dimreduce "DAE" --drug "GEFITINIB" --bulk_h_dims "512,256" --bottleneck 256 --predictor_h_dims "256,128" --dropout 0.1 --printgene "F" -mod "new" --lr 0.5 --sampling "no" --printgene "F" -mod "new" --checkpoint "False"
```

These parameter settings follow the trained parameters you provided. I don't understand why the same code, data, and parameters produce different results. On the other hand, I do get the same results as you when loading from the checkpoint, so why can't I reproduce them by training with the same parameters?
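One plausible source of run-to-run variation when retraining from scratch is unseeded randomness (weight initialization, dropout masks, upsampling order). Whether scDEAL fixes its random seeds is an assumption I have not verified against its code; the NumPy sketch below only illustrates the general principle, with the usual PyTorch seeding knobs noted in comments:

```python
import numpy as np

def pseudo_training_run(seed=None):
    """Stand-in for a stochastic training loop: random init plus noisy updates."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=5)             # random weight initialization
    for _ in range(10):
        weights -= 0.1 * rng.normal(size=5)  # noisy "gradient" updates
    return weights.sum()

# Two unseeded runs virtually always disagree; two runs with the same seed match.
unseeded_differ = pseudo_training_run() != pseudo_training_run()
seeded_match = pseudo_training_run(seed=42) == pseudo_training_run(seed=42)

# In a PyTorch pipeline the equivalent knobs would be (an assumption, not
# verified against scDEAL's scripts):
#   torch.manual_seed(42); np.random.seed(42); random.seed(42)
#   torch.backends.cudnn.deterministic = True
```

If the training scripts do not fix these seeds, identical commands can legitimately yield different AUC/F1 scores across runs, though large swings would still point at training instability rather than seeding alone.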

juychen commented 1 year ago

Hi, did you install and activate the conda environment, e.g. `source scDEALenv/bin/activate`?

SZ-qing commented 1 year ago

Of course. I used the environment you provided, via scDEALenv/bin/activate.

SZ-qing commented 1 year ago

You can see there is no problem with my parameter settings above. I just don't understand why the same parameters don't give the same results. You are the expert in this area, and I hope to get some answers.

SZ-qing commented 1 year ago

@juychen With all the current data, can you reproduce a good result using the parameter settings from the paper, instead of using the trained model from the old environment?

juychen commented 1 year ago

We have tested the current parameters listed on the GitHub page, with the listed code and conda environment, and reproduced the results on two different Linux-based computing clusters. Checkpoints from those tests are provided. To investigate your issue further I may need to discuss it with my colleagues.

SZ-qing commented 1 year ago

Looking forward to your reply. The same parameter settings, data, code, and environment do not produce consistent results, which is very troubling, and I doubt I am the only one with these problems.

juychen commented 1 year ago

Sorry for not being able to resolve your issue at the moment. We have made intensive efforts to present the results we generated.

SZ-qing commented 1 year ago

Kudos for your problem-solving efforts. I followed your replication process exactly as provided, just without using your trained model.

SZ-qing commented 1 year ago

Hello, can you email me the code and data from your local tests? I can't replicate your results using the GitHub resources, and I can't get good results on the single-cell data with either your recommended parameters or my own combinations of training parameters. @juychen

juychen commented 1 year ago

What is your torch version? What other code and data do you need? They should be the same as on GitHub. It may take a while to retrieve the shell scripts because I am busy at the moment.

SZ-qing commented 1 year ago

[screenshot of the torch version]

SZ-qing commented 1 year ago

I need all the code for the model you tested on your computing cluster. Please send the code and the parameter-tuning shell script, when you have free time, to the email address findbugs2023@gmail.com.

SZ-qing commented 1 year ago

With the same parameter settings and the same number of CPUs, I found that the new version of the program runs much slower than the old version, tested in the latest environment provided. What has changed in the new version? Please forgive the interruption; there are indeed many inconsistencies in reproducing the results of your article. The biological logic of your published scAD article is great.

SZ-qing commented 1 year ago

@juychen

SZ-qing commented 1 year ago

If you re-tuned the parameters in the new environment, your shell script should already be ready, so why has the parameter script not been shared for so long? I ran two independent tests with your parameter combinations, and the model's performance makes me suspect your results are overfitted ([screenshot]). @juychen In your latest article, the scAD model uses a similar deep transfer learning approach, yet its performance is far inferior to your scDEAL model. Why don't you use the good model in the new article? In the spirit of academic rigor I have to voice my skepticism; please forgive me.

SZ-qing commented 1 year ago

Maybe there are some deviations in my understanding of your articles and models.

LenisLin commented 1 year ago

Hi @SZ-qing, thank you for the detailed testing! I'm interested in this research area and would like to benchmark this method. Could you please share the final testing results for scDEAL? Is it reproducible? Thanks in advance!

SZ-qing commented 1 year ago

Hi, if you use the checkpoint the results are OK, but if you train the model yourself with the same parameters the results are not very good.

LenisLin commented 1 year ago

Thank you for your valuable insights! For benchmarking I tend to retrain the model, particularly given the size of the dataset used here; it should not be too difficult to train. However, your results have inspired me to approach this with greater rigor.

Git-zhaohui commented 7 months ago

I've encountered the same issue and am unable to replicate the results presented in the paper.

Git-zhaohui commented 7 months ago

Have you managed to solve it?

Git-zhaohui commented 7 months ago

@SZ-qing

SZ-qing commented 7 months ago

I've given up on that.

LCGaoZzz commented 5 months ago

@Git-zhaohui Having the same problem; the results in the article are confusing... 😶

Git-zhaohui commented 5 months ago

I've given up on that. @LCGaoZzz

seekning commented 4 months ago

Do you have WeChat or QQ? I want to talk to you about this model; it is so confusing.

juychen commented 4 months ago

We have tried our best to ensure the reproducibility of the results by providing the model checkpoints. You may try loading the checkpoints and fine-tuning them instead of training from scratch.
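In practice this presumably means rerunning the same commands with the checkpoint flag enabled rather than disabled; the value "True" is an assumption (the thread only ever shows `--checkpoint "False"`). For example, for Data4:

```shell
# Hypothetical checkpoint-mode run; --checkpoint "True" is assumed, not confirmed
python bulkmodel.py --drug "DOCETAXEL" --dimreduce "DAE" --encoder_h_dims "256,128" --predictor_h_dims "256,128" --bottleneck 512 --data_name "GSE140440" --sampling "upsampling" --dropout 0.1 --lr 0.01 --printgene "F" -mod "new" --checkpoint "True"
python scmodel.py --sc_data "GSE140440" --dimreduce "DAE" --drug "DOCETAXEL" --bulk_h_dims "256,128" --bottleneck 512 --predictor_h_dims "256,128" --dropout 0.1 --printgene "F" -mod "new" --lr 0.01 --sampling "upsampling" --checkpoint "True"
```

These commands require the scDEAL repository and data to be in place, so they are a configuration sketch rather than a standalone script.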

yulijia commented 2 months ago

For Data4 (GSE140440):

```shell
python bulkmodel.py --drug "DOCETAXEL" --dimreduce "DAE" --encoder_h_dims "256,128" --predictor_h_dims "256,128" --bottleneck 512 --data_name "GSE140440" --sampling "upsampling" --dropout 0.1 --lr 0.01 --printgene "F" -mod "new" --checkpoint "False"
python scmodel.py --sc_data "GSE140440" --dimreduce "DAE" --drug "DOCETAXEL" --bulk_h_dims "256,128" --bottleneck 512 --predictor_h_dims "256,128" --dropout 0.1 --printgene "F" -mod "new" --lr 0.01 --sampling "upsampling" --printgene "F" -mod "new" --checkpoint "False"
```

For Data5 (GSE149383):

```shell
python bulkmodel.py --drug "ERLOTINIB" --dimreduce "DAE" --encoder_h_dims "512,256" --predictor_h_dims "256,128" --bottleneck 64 --data_name "GSE149383" --sampling "upsampling" --dropout 0.3 --lr 0.01 --printgene "F" -mod "new" --checkpoint "False"
python scmodel.py --sc_data "GSE149383" --dimreduce "DAE" --drug "ERLOTINIB" --bulk_h_dims "512,256" --bottleneck 64 --predictor_h_dims "256,128" --dropout 0.3 --printgene "F" -mod "new" --lr 0.01 --sampling "upsampling" --printgene "F" -mod "new" --checkpoint "False"
```

For Data3 (GSE112274):

```shell
python bulkmodel.py --drug "GEFITINIB" --dimreduce "DAE" --encoder_h_dims "512,256" --predictor_h_dims "256,128" --bottleneck 256 --data_name "GSE112274" --sampling "no" --dropout 0.1 --lr 0.5 --printgene "F" -mod "new" --checkpoint "False"
python scmodel.py --sc_data "GSE112274" --dimreduce "DAE" --drug "GEFITINIB" --bulk_h_dims "512,256" --bottleneck 256 --predictor_h_dims "256,128" --dropout 0.1 --printgene "F" -mod "new" --lr 0.5 --sampling "no" --printgene "F" -mod "new" --checkpoint "False"
```

I'm not sure if my reproduction procedure is correct. I activated the downloaded environment and deleted all files in the save folder. Then I recreated the subfolders and ran Data 3, 4, and 5 using the command lines that @SZ-qing used. After that, I loaded the adata file into R and calculated the AUC using the code below.

Data 5 AUC is 0.7887.

```r
ad <- anndata::read_h5ad('GSE149383integrate_data_GSE149383_drug_ERLOTINIB_bottle_64_edim_512,256_pdim_256,128_model_DAE_dropout_0.3_gene_F_lr_0.01_mod_new_sam_upsampling.h5ad')
gt <- ad$obs[, "sensitive"]
sens_label <- ad$obs[, "sens_label"]
roc_object <- pROC::roc(gt, as.numeric(sens_label), levels = c(0, 1), direction = "<")
roc_object$auc
# Area under the curve: 0.7887
```

Data 4 AUC is 0.858.

```r
ad <- anndata::read_h5ad('GSE140440integrate_data_GSE140440_drug_DOCETAXEL_bottle_512_edim_256,128_pdim_256,128_model_DAE_dropout_0.1_gene_F_lr_0.01_mod_new_sam_upsampling.h5ad')
gt <- ad$obs[, "sensitive"]
sens_label <- ad$obs[, "sens_label"]
roc_object <- pROC::roc(gt, as.numeric(sens_label), levels = c(0, 1), direction = "<")
roc_object$auc
# Area under the curve: 0.858
```

Data 3 AUC is 0.4968.

```r
ad <- anndata::read_h5ad('GSE112274integrate_data_GSE112274_drug_GEFITINIB_bottle_256_edim_512,256_pdim_256,128_model_DAE_dropout_0.1_gene_F_lr_0.5_mod_new_sam_no.h5ad')
gt <- ad$obs[, "sensitive"]
sens_label <- ad$obs[, "sens_label"]
roc_object <- pROC::roc(gt, as.numeric(sens_label), levels = c(0, 1), direction = "<")
roc_object$auc
# Area under the curve: 0.4968
```
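For anyone cross-checking in Python rather than R, roughly the same computation can be done with scikit-learn. The helper below is a sketch; loading the actual `.h5ad` file (shown in comments) assumes the `anndata` package and the column names used in the R snippets above:

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

def score_predictions(ground_truth, predicted_labels):
    """AUC and F1 from binary ground-truth labels and binary predicted labels."""
    gt = np.asarray(ground_truth, dtype=int)
    pred = np.asarray(predicted_labels, dtype=float)
    return roc_auc_score(gt, pred), f1_score(gt, pred.round().astype(int))

# With a real scDEAL output file (requires anndata and the file on disk):
#   import anndata
#   ad = anndata.read_h5ad("GSE140440integrate_data_GSE140440_drug_DOCETAXEL_...h5ad")
#   auc, f1 = score_predictions(ad.obs["sensitive"], ad.obs["sens_label"])

# Toy sanity check with made-up labels:
auc, f1 = score_predictions([0, 0, 1, 1], [0, 1, 1, 1])  # auc=0.75, f1=0.8
```

Note that computing AUC from hard 0/1 labels (as in the R code) gives a coarser estimate than using the model's continuous sensitivity scores, if those are stored in the output file.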

If anyone has run the same tests using the environment provided in the repo, please post the results here. I think it could help the authors and other users figure out the reason for the differences seen when reproducing the results by training the model from scratch.