JackieHanLab / SenCID

Senescent Cell Identification
MIT License

Inconsistent AUC results for demo #13

Open Hira8023 opened 1 month ago

Hira8023 commented 1 month ago

Hello, very useful tool! I learned SenCID by following demo/SenCID_tutorial.ipynb. When I run it more than once, I can't get consistent AUC values. I set adata.obs['condition'] as the ground-truth label and calculated the AUC with the function roc_auc_score().

from sklearn.metrics import roc_auc_score

Y_test = adata.obs['condition'].astype(int)
pb_results = adata.obs['SID_Score']
auroc_score = roc_auc_score(Y_test, pb_results)

Depending on the run, auroc_score is 0.972083 or 0.982708.

Is there a random step inside the model? How can I set it up to get consistent results?

SherryMiyano commented 3 weeks ago

Hi, thanks for supporting SenCID!

The randomness comes from the imputation of the single-cell data, which uses the autoencoder model DCA. Generally, it does not cause a large difference between two runs on the same data. I found no setting in that dependency that keeps its result constant (I tried setting a seed for it myself, but failed), so SenCID cannot make this step deterministic either. However, you can keep the intermediate result from DCA by setting the parameter savetmp = True, which saves the imputed matrix so it can be reused whenever you need to reproduce a result.

Another thing to note: for bulk RNA-seq data there is no need to impute the matrix, so the results for bulk RNA-seq are always consistent once DCA imputation is turned off (by setting the parameter denoising = False).
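
To illustrate why a stochastic imputation step moves the AUC by a small amount, and why reusing a saved intermediate keeps it fixed, here is a minimal sketch in plain Python. It does not call SenCID or DCA. The base scores, the Gaussian jitter, and the helper names (pairwise_auc, noisy_scores) are all hypothetical stand-ins, and the AUC is computed as the fraction of correctly ranked (positive, negative) pairs, which is the quantity roc_auc_score() reports for binary labels.

```python
import random


def pairwise_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def noisy_scores(base, rng, scale=0.15):
    """Hypothetical stand-in for a stochastic step: add Gaussian jitter."""
    return [b + rng.gauss(0.0, scale) for b in base]


# Toy data (assumption): 20 negative and 20 positive cells.
labels = [0] * 20 + [1] * 20
base = [0.3] * 20 + [0.7] * 20

# Two "runs" with different random states give slightly different rankings.
run_a = noisy_scores(base, random.Random(1))
run_b = noisy_scores(base, random.Random(2))
auc_a = pairwise_auc(labels, run_a)
auc_b = pairwise_auc(labels, run_b)

# Reusing the saved scores from one run always reproduces the same AUC.
auc_a_again = pairwise_auc(labels, run_a)

print(auc_a, auc_b, auc_a_again)
```

Because the AUC depends only on the ranking of the scores, a few pairs swapping order between runs is enough to change it in the second or third decimal place, while scoring the same saved values again gives an identical number.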