doraadong / UNIFAN

Unsupervised cell functional annotation for single-cell RNA-Seq
MIT License

Inconsistent results with example data #2

Closed: auesro closed this issue 2 years ago

auesro commented 2 years ago

Dear authors, I have been testing UNIFAN with the example data. The pipeline runs without problems; however, the end result differs from the one shown in the notebook: the skeletal muscle satellite cell population is split into two groups, as you can see below.

[Figure 2022-07-05 093936] [Figure 2022-07-05 093943]

Is this relevant?

Also, it would be really helpful if you could provide the code needed to obtain the information shown in Figures 2D and 2E of the Genome Research article.

Thanks!

doraadong commented 2 years ago

Hi, thanks for the questions! Regarding your first question, results vary somewhat between runs. Also, as we noted in the tutorial, the number of training epochs used there is much smaller than the default. I would recommend trying again with the default number of epochs.

As for your second question, we are working on tutorials on model interpretation and visualization. Briefly, we base our interpretations on the coefficients learned by the annotator (classifier), which are stored in the state dictionary named classifier_state_dict under the field name "decoder.predictor.0.weight". For each class (cluster), each coefficient value corresponds to one feature (a gene set or a gene). The feature names can be found in the saved file input_r_names_path.

Let us know if you have any other questions.

auesro commented 2 years ago

Hi, thanks for your reply. I will repeat the analysis to account for the variation, but I expected to get "similar" results to the ones in the notebook (running with the same number of epochs).

Regarding my second question, I have found the file input_r_names_path, but I don't have any variable named classifier_state_dict or similar after running the notebook in Spyder: [image]

Any ideas?

doraadong commented 2 years ago

Hi, sorry for the confusion! The actual file name is indeed not "classifier_state_dict". Because we trained the clustering part and the annotator (classifier) together, we saved their state dicts in a single file named: annoclustermodel{number of epochs - 1}.pickle. Within this pickle file, the state dict of the annotator / classifier is stored under the field name 'state_dict_2', and the coefficients are saved under the name "decoder.predictor.0.weight".
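For readers following along, here is a minimal sketch of how the coefficients described above could be turned into a ranked feature list per cluster. It is not the authors' code. The helper name top_features_per_cluster, the toy weights, and the toy feature names are all hypothetical, and the sketch assumes the weight matrix has shape (n_clusters, n_features), which is the usual layout for a PyTorch linear layer. How the pickle file is loaded and how the feature names file is read are left as comments, because the thread does not specify them.

```python
import numpy as np


def top_features_per_cluster(weights, feature_names, k=5):
    """Return the k highest-weighted (name, coefficient) pairs for each cluster.

    `weights` is assumed to have shape (n_clusters, n_features), one row of
    coefficients per cluster. This layout is an assumption, not confirmed
    by the thread.
    """
    w = np.asarray(weights, dtype=float)
    if w.ndim != 2 or w.shape[1] != len(feature_names):
        raise ValueError("expected weights of shape (n_clusters, n_features)")
    top = {}
    for cluster, row in enumerate(w):
        # Indices of the k largest coefficients, in descending order.
        order = np.argsort(row)[::-1][:k]
        top[cluster] = [(feature_names[i], float(row[i])) for i in order]
    return top


# Toy stand-ins (hypothetical). With a real run, `toy_weights` would be
# replaced by the entry "decoder.predictor.0.weight" found under
# 'state_dict_2' in the saved model file, and `toy_names` by the feature
# names stored in the file referred to as input_r_names_path.
toy_weights = [[0.1, 2.0, -0.5],
               [1.5, -1.0, 0.3]]
toy_names = ["GENE_SET_A", "GENE_SET_B", "GENE_C"]

result = top_features_per_cluster(toy_weights, toy_names, k=2)
for cluster, pairs in result.items():
    print(cluster, pairs)
```

Ranking by the raw coefficient value is only one possible choice; ranking by absolute value would also surface strongly negative features.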

Again, thank you for pointing this out! We will keep you posted about the tutorial on model interpretation.

auesro commented 2 years ago

Thank you @doraadong! I found both datasets. Let's see if I can replicate some of your plots. Looking forward to the tutorial!

doraadong commented 2 years ago

Hi @auesro, the tutorial on cluster annotation has just been uploaded. Sorry for the delay! Let me know if you have any questions.