hylasD / tSpace


issue about calculation in tSPACE #7

Open Ruismart opened 5 months ago

Ruismart commented 5 months ago

Hi Denis,

Thanks for the beautiful tool tSPACE. I have been trying to use it to build reasonable trajectories for several developmental single-cell datasets, and I have run into some issues:

  I first ran a small dataset with about 3k cells and 1.5k variable genes. It took 15 h to run on a 64 GB local PC, and the trajectory output looks pretty good.

  Then I wanted to run a bigger one, with tens of thousands of cells and the same parameters, but I terminated it after 100 h because it still had not finished.

  Then I chose to use the top PCs as input. Although that run completes in just a few hours, the tSPACE output becomes very similar to my old UMAP calculated from the same PCs. It seems the existing PCs are already largely determined by the custom pre-normalization/integration. Additionally, if several datasets have to be run individually, it might be hard to keep the results consistent.

So my question is: is there a way to extract the tPC formula, like getting the PCA coefficients from seur.obj@reductions$PCA@feature.loadings? Then I could first run tSPACE on a standard, relatively small dataset, extract the formula for each tPC, and apply those pre-built tPC formulas to any new, bigger dataset with similar cell types and the same pre-normalization.

Kind Wishes,

Shaorui
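
For reference, the PCA analogy in the question can be sketched as follows: fit the loadings on a small reference matrix, then reuse the stored gene means and loadings to score a larger dataset. This is a minimal NumPy sketch of ordinary PCA only, not of tSPACE; the function names `fit_pca` and `project` and the random matrices are hypothetical placeholders, and it assumes both datasets share the same genes in the same order and the same normalization.

```python
import numpy as np

def fit_pca(X_ref, n_components):
    """Fit PCA on a reference matrix (cells x genes).

    Returns the per-gene means and the loadings (genes x components).
    """
    mean = X_ref.mean(axis=0)
    # SVD of the centered matrix; the rows of Vt are the principal axes.
    _, _, Vt = np.linalg.svd(X_ref - mean, full_matrices=False)
    loadings = Vt[:n_components].T
    return mean, loadings

def project(X_new, mean, loadings):
    """Score new cells with the stored reference means and loadings."""
    return (X_new - mean) @ loadings

# Placeholder data: a small reference set and a larger new set, same 50 genes.
rng = np.random.default_rng(0)
X_ref = rng.normal(size=(300, 50))
X_new = rng.normal(size=(1000, 50))

mean, loadings = fit_pca(X_ref, n_components=10)
ref_scores = project(X_ref, mean, loadings)  # scores of the reference cells
new_scores = project(X_new, mean, loadings)  # scores of the new cells
```

This works for PCA because each PC is a fixed linear combination of genes, so the fitted "formula" is just the pair (`mean`, `loadings`).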

Ruismart commented 5 months ago

After getting more familiar with the method/code, I think it might not be easy (e.g., not linear) to map the source genes/PCs back onto the final tPCs through the distance matrix. I have been trying to think of another way to do the calculation, considering that feature pre-selection can contribute hugely to the final trajectories.
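
Since the tPCs are not a linear function of the input features, one possible workaround is to place new cells by their nearest reference cells rather than by a formula. The sketch below shows that idea under loudly stated assumptions: it is a generic k-nearest-neighbor averaging step, not part of tSPACE, the name `knn_transfer` is hypothetical, and it assumes the reference and new cells have already been projected into the same PC space. The arrays here are random placeholders standing in for real PC scores and tSPACE output.

```python
import numpy as np

def knn_transfer(ref_pcs, ref_tpcs, new_pcs, k=10):
    """Assign each new cell the mean tPC coordinates of its k nearest
    reference cells, with neighbors found in the shared PC space.

    Hypothetical helper; requires k <= number of reference cells.
    """
    # Squared Euclidean distance between every new cell and every reference cell.
    d2 = (
        (new_pcs ** 2).sum(axis=1)[:, None]
        - 2.0 * new_pcs @ ref_pcs.T
        + (ref_pcs ** 2).sum(axis=1)[None, :]
    )
    # Indices of the k closest reference cells per new cell (unordered within the k).
    idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    # Average the reference tPC coordinates over those neighbors.
    return ref_tpcs[idx].mean(axis=1)

# Placeholder data: 200 reference cells with 5 PCs and 3 tPCs each.
rng = np.random.default_rng(1)
ref_pcs = rng.normal(size=(200, 5))
ref_tpcs = rng.normal(size=(200, 3))   # stands in for real tSPACE output
new_pcs = rng.normal(size=(1000, 5))

new_tpcs = knn_transfer(ref_pcs, ref_tpcs, new_pcs, k=10)
```

The distance matrix is dense (new cells x reference cells), so for very large datasets the new cells would need to be processed in chunks. Whether averaged coordinates preserve trajectory structure well enough is an open question that would need checking on real data.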