pluto-the-lost / unicoord

BSD 3-Clause "New" or "Revised" License

Some queries regarding the paper #2

Open shobhitagrawal1 opened 1 week ago

shobhitagrawal1 commented 1 week ago

Congratulations on the paper, and thank you for this exciting tool. I would be highly obliged if you could answer some queries. I would like to use the program on my own data, but first I want to replicate some of the workflows from the paper:

  1. Model: The link to the trained model appears to be dysfunctional; no data is found when following it.
  2. Which cells/data are included in the model: Could you provide the xls table for the data included, so that I can find out whether the model might work for my data?
  3. Reprex: Is there a reproducibility repository, or do you have some notebooks for the workflows you used in the paper? I am specifically interested in Fig3 and Fig4 (Interpolating timestamps or spatial coordinates to fill gaps, and Pre-training UniCoord model with cell atlas data for analyzing disease-related cells), which I would like to follow for my own data.
  4. Model parameter explanation: Some help regarding the model parameters, e.g. genes_used, n_cont, n_diff, n_clus, min_obs.

Thank you once again, shobhit

pluto-the-lost commented 5 days ago

Dear Shobhit,

Thank you for your kind words and for showing interest in our work! I’m happy to address your queries:

  1. Dysfunctional Model Link: I’ve checked the original link and found it to be non-functional. To resolve this, I have uploaded all the models, along with the data used for training, to Figshare. You can access them here: https://figshare.com/account/projects/229041/articles/27896874. This should also address your second query.

  2. Reproducibility Repository: All the code required to reproduce the results from the paper is available in the GitHub repository. Please let me know if you encounter any difficulties while navigating or using the code.

  3. Using Gene x Cells Matrix: We have experimented with both highly variable genes (HVG) and the entire gene matrix. Based on our findings, we recommend using HVG for better performance. In either case, make sure your data is log-transformed before proceeding.
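
For readers unsure what the preprocessing in point 3 involves, here is a minimal, dependency-free sketch of log-transforming a cells x genes count matrix and keeping the most variable genes. It is not UniCoord's API. The function names, the `target_sum` of 1e4, and the plain variance ranking are all assumptions made for illustration, and the variance ranking is a simplified stand-in for the dispersion-based HVG selection offered by single-cell toolkits.

```python
import math
from statistics import pvariance


def log_normalize(counts, target_sum=1e4):
    """Scale each cell (row) to `target_sum` total counts, then apply log1p.

    `target_sum=1e4` is a common convention, assumed here for illustration.
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(value * scale) for value in cell])
    return normalized


def top_variable_genes(log_matrix, gene_names, n_top):
    """Rank genes by variance of their log values across cells.

    This is a simplified stand-in for real HVG selection methods.
    """
    scored = []
    for j, name in enumerate(gene_names):
        column = [row[j] for row in log_matrix]
        scored.append((pvariance(column), name))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:n_top]]


# Toy data: 3 cells (rows) x 4 genes (columns); gene names are hypothetical.
counts = [
    [10, 0, 5, 1],
    [0, 12, 5, 1],
    [9, 1, 5, 1],
]
genes = ["geneA", "geneB", "geneC", "geneD"]

log_matrix = log_normalize(counts)
hvg = top_variable_genes(log_matrix, genes, n_top=2)
print(hvg)
```

On real data, an established single-cell library would normally handle both steps; the sketch only shows the order of operations (normalize, log-transform, then select genes).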

I hope this helps! Feel free to reach out if you have additional questions or need further assistance with the workflows. Wishing you all the best with your research!

Best regards, Haoxiang