Closed auesro closed 2 years ago
Hi, thanks for the questions! Regarding your first question, results vary somewhat between runs. Also, as we noted in the tutorial, the tutorial trains for far fewer epochs than the default. I recommend trying again with the default number of epochs.
As for your second question, we are working on tutorials for model interpretation and visualization. Briefly, we base our interpretations on the coefficients learned by the annotator (classifier), which are stored in the state dictionary named classifier_state_dict under the field name "decoder.predictor.0.weight". For each class (cluster), each coefficient corresponds to one gene set or gene. The feature names (gene sets / genes) can be found in the saved file input_r_names_path.
Let us know if you have any other questions.
Hi, thanks for your reply. I will repeat the analysis to account for the variation, but I expected to get "similar" results to the ones in the notebook (running with the same number of epochs).
Regarding my second question, I have found the file input_r_names_path, but I don't have any variable named classifier_state_dict or similar after running the notebook in Spyder.
Any ideas?
Hi, sorry for the confusion! The actual file name is indeed not "classifier_state_dict". Because we trained the clustering part and the annotator (classifier) together, we saved their state dicts in a single file named annoclustermodel{number of epochs - 1}.pickle. Within this pickle file, the state dict of the annotator / classifier is stored under the field name 'state_dict_2', and the coefficients are saved under the name "decoder.predictor.0.weight".
Again, thank you for pointing this out! We will keep you posted about the tutorial on model interpretation.
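For readers following along, the lookup described above might be sketched roughly as follows. This is only an illustration, not the authors' code. It assumes the pickle has already been loaded into a dict-like object (for example with torch.load, if the file was written with torch.save, which the thread does not confirm), that the coefficient matrix has one row per cluster and one column per feature, and that the feature names from input_r_names_path have been read into a list in the same column order. The helper names and the toy data are hypothetical; only the keys 'state_dict_2' and "decoder.predictor.0.weight" come from the thread.

```python
import numpy as np

STATE_KEY = "state_dict_2"                 # annotator / classifier state dict (named in the thread)
WEIGHT_KEY = "decoder.predictor.0.weight"  # coefficient matrix (named in the thread)


def annotator_coefficients(checkpoint):
    """Return the annotator coefficients as a 2-D NumPy array (hypothetical helper)."""
    weights = checkpoint[STATE_KEY][WEIGHT_KEY]
    # PyTorch tensors must be detached and moved to the CPU before conversion.
    if hasattr(weights, "detach"):
        weights = weights.detach().cpu().numpy()
    return np.asarray(weights)


def top_features_per_cluster(weights, feature_names, k=5):
    """Map each cluster index to its k features with the largest coefficients.

    Assumes weights has shape (n_clusters, n_features).
    """
    if weights.shape[1] != len(feature_names):
        raise ValueError("expected one column per feature name")
    top = {}
    for cluster, row in enumerate(weights):
        order = np.argsort(row)[::-1][:k]  # indices sorted by descending coefficient
        top[cluster] = [(feature_names[i], float(row[i])) for i in order]
    return top


# Toy stand-in for the loaded pickle: 2 clusters x 3 features (made-up values).
toy_checkpoint = {STATE_KEY: {WEIGHT_KEY: np.array([[0.1, 0.9, -0.3],
                                                    [0.7, 0.0, 0.2]])}}
toy_names = ["gene_set_A", "gene_set_B", "gene_C"]
toy_top = top_features_per_cluster(annotator_coefficients(toy_checkpoint), toy_names, k=2)
```

With a real run, toy_checkpoint would be replaced by the loaded annoclustermodel pickle and toy_names by the contents of input_r_names_path. If the matrix turns out to be stored as features by clusters, transpose it before ranking.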
Thank you @doraadong! I found both datasets. Let's see if I can replicate some of your plots. Looking forward to the tutorial!
Hi @auesro, the tutorial on cluster annotation has just been uploaded. Sorry for the delay! Let me know if you have questions.
Dear authors, I have been testing UNIFAN with the example data. The pipeline runs without problems; however, the end result differs from the one shown in the notebook: the skeletal muscle satellite cell population is split into two groups, as you can see below.
Is this relevant?
Also, it would be really helpful if you could provide the code needed to reproduce Figures 2D and 2E of the Genome Research article.
Thanks!