Open DiracZhu1998 opened 5 months ago
In addition, I checked but couldn't find any code or parameter settings for the frog-zebrafish integration in your paper. The Jupyter notebook you provided does not seem to be the version you used for the paper: the graph and integration in your paper look great, but frog-zebrafish with default parameters is not that good.
The Jupyter notebook is the version used in the paper (same hyperparameters; the random seed will be slightly different, but this shouldn't make a huge difference). How were the results different?
How are you judging how well the species are integrated? You should try transferring labels between species and measuring accuracy.
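A minimal sketch of what such a label-transfer check could look like, assuming you have already exported the integrated embeddings and cell type labels for each species as NumPy arrays. The function names and the choice of a k-nearest-neighbour majority vote are illustrative assumptions, not part of the toolkit itself, and the data below is synthetic.

```python
import numpy as np

def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between rows of a and rows of b."""
    a2 = (a ** 2).sum(axis=1)[:, None]
    b2 = (b ** 2).sum(axis=1)[None, :]
    # Clip tiny negative values caused by floating-point error.
    return np.maximum(a2 + b2 - 2.0 * a @ b.T, 0.0)

def transfer_labels(ref_emb, ref_labels, query_emb, k=15):
    """Give each query cell the majority label of its k nearest reference cells."""
    ref_labels = np.asarray(ref_labels)
    k = min(k, ref_emb.shape[0])
    d2 = pairwise_sq_dists(query_emb, ref_emb)
    # Indices of the k closest reference cells for every query cell.
    nn_idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    predicted = []
    for neighbours in ref_labels[nn_idx]:
        values, counts = np.unique(neighbours, return_counts=True)
        predicted.append(values[np.argmax(counts)])
    return np.array(predicted)

def transfer_accuracy(ref_emb, ref_labels, query_emb, query_labels, k=15):
    """Fraction of query cells whose transferred label matches their own label."""
    predicted = transfer_labels(ref_emb, ref_labels, query_emb, k=k)
    return float((predicted == np.asarray(query_labels)).mean())

# Synthetic stand-in for two species sharing two well-separated cell types.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
ref_labels = np.repeat(["neuron", "muscle"], 50)
ref_emb = centers[np.repeat([0, 1], 50)] + rng.normal(size=(100, 2))
query_labels = np.repeat(["neuron", "muscle"], 30)
query_emb = centers[np.repeat([0, 1], 30)] + rng.normal(size=(60, 2))

acc = transfer_accuracy(ref_emb, ref_labels, query_emb, query_labels)
print(f"label transfer accuracy: {acc:.3f}")
```

Running the transfer in both directions (species A to B and B to A) gives a number you can compare across hyperparameter settings, which is more reliable than judging a UMAP by eye. The dense distance matrix is only practical for modest cell counts; for a full atlas a nearest-neighbour index would be the usual substitute.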
Hi Yanay, thank you for your quick response! You are probably right. I only compared them by eye, so it is not that accurate, but the result looks quite different from your paper. I assumed that the same major clusters (cell types) from different species should sit close to each other rather than apart. I also tested the human and mouse whole-brain atlases, and they don't integrate well either.
I checked the distances between your protein embeddings and the ones I generated: corresponding genes had the lowest distance, so the protein embedding step is not the problem. The problem seems to be related to mixing scRNA and snRNA datasets. Once I removed the human snRNA dataset and integrated only mouse and lizard (both scRNA datasets), they integrated much better than before. Do you have any recommendations for putting more "force" on the integration so that the human snRNA data integrates better with the scRNA datasets, for example by increasing the amount of pretraining? Many thanks!
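For reference, the embedding check described above can be sketched roughly as follows, assuming both sets of protein embeddings are loaded as NumPy arrays with one row per gene in the same gene order. The function name and the synthetic arrays are hypothetical and only illustrate the comparison.

```python
import numpy as np

def fraction_self_nearest(emb_a, emb_b):
    """Fraction of rows in emb_a whose nearest row in emb_b has the same index."""
    a2 = (emb_a ** 2).sum(axis=1)[:, None]
    b2 = (emb_b ** 2).sum(axis=1)[None, :]
    d2 = a2 + b2 - 2.0 * emb_a @ emb_b.T  # squared Euclidean distances
    nearest = d2.argmin(axis=1)
    return float((nearest == np.arange(emb_a.shape[0])).mean())

# Synthetic stand-in: "theirs" is "mine" plus small numerical noise.
rng = np.random.default_rng(0)
mine = rng.normal(size=(50, 16))
theirs = mine + 1e-3 * rng.normal(size=(50, 16))

frac = fraction_self_nearest(mine, theirs)
# Misaligned gene order should break the match completely.
frac_shifted = fraction_self_nearest(mine, np.roll(theirs, 1, axis=0))
print(frac, frac_shifted)
```

A fraction close to 1 means each gene's own embedding is its nearest neighbour in the other set, which is what rules out the protein embedding step as the source of the difference.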
The UMAP for frog and zebrafish looks pretty similar. The actual UMAP will not be exactly the same because of the random seed and different hardware/versions. Another difference is that in the UMAP you generated, the points look smaller and appear to be drawn in a random order. This makes the clusters look more mixed and harder to tell apart.
Dear authors,
Thank you for giving us such a wonderful toolkit! If I want to build an evolutionary distance atlas, should I use major cell type level annotations as the "cell_type" label? Are there any other parameters you would recommend tuning to improve the atlas, since the default results don't integrate well?
Thank you for your help!
Best wishes, Yuanzhen