raphael-group / GASTON

BSD 3-Clause "New" or "Revised" License

Could GASTON calculate a gradient along an assigned direction? #7

Closed jkfo002 closed 3 months ago

jkfo002 commented 3 months ago

Hi,

Thank you for developing this amazing tool.

When I used the tool, I first computed the GLM-PCs, and the results showed a gradient in the direction I am interested in (though in the fourth PC, not the first). However, after training the models and selecting the best one, the streamlines did not show the gradient I was looking for.

Is it possible to obtain the gradient from the PCs I am interested in, or to identify the genes that fit the assigned direction?

uthsavc commented 3 months ago

Thanks for your interest in our work!

One limitation of GASTON is that we assume there is a single gradient field that can describe all genes/PCs. But this assumption may not hold for some tissues. In this case, we suggest training the model on a subset of the genes/PCs.

In your case, you may want to apply the model to only the 4th PC and a few other PCs. For example, if you only want to use PCs 4/5/6, then you could save the PCs as shown below (NumPy column indices are zero-based, so adjust the indices if you number your PCs from 1):

np.save('cerebellum_data/glmpca.npy', A[:, [4, 5, 6]])

It might also be helpful to restrict your tissue to only a smaller region in space that has the gradient you are interested in.
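The two suggestions above (keeping only some PCs and restricting the tissue to a smaller region) can be sketched together as follows. This is only an illustration, not part of GASTON: the helper `subset_pcs_and_region`, the bounding-box arguments, and the output file names are hypothetical, and it assumes `A` is an N x K array of GLM-PCs and `S` is an N x 2 array of spot coordinates in the same row order.

```python
import numpy as np

def subset_pcs_and_region(A, S, pc_cols, x_range=None, y_range=None):
    """Keep only the chosen PC columns and the spots inside a bounding box.

    A: (N, K) array of GLM-PCs; S: (N, 2) array of spot coordinates.
    pc_cols: zero-based column indices of the PCs to keep.
    x_range / y_range: optional (min, max) tuples; None keeps all spots.
    """
    A = np.asarray(A)
    S = np.asarray(S)
    keep = np.ones(A.shape[0], dtype=bool)
    if x_range is not None:
        keep &= (S[:, 0] >= x_range[0]) & (S[:, 0] <= x_range[1])
    if y_range is not None:
        keep &= (S[:, 1] >= y_range[0]) & (S[:, 1] <= y_range[1])
    # Filter rows (spots) first, then columns (PCs).
    return A[keep][:, pc_cols], S[keep]

# Toy data standing in for real GLM-PCs and coordinates.
rng = np.random.default_rng(0)
A_demo = rng.normal(size=(100, 8))
S_demo = rng.uniform(0, 10, size=(100, 2))

A_sub, S_sub = subset_pcs_and_region(A_demo, S_demo, [3, 4, 5], x_range=(2, 8))

# Hypothetical output paths; save both arrays so rows stay aligned.
# np.save('glmpca_subset.npy', A_sub)
# np.save('coords_subset.npy', S_sub)
```

The filtered coordinates are returned alongside the PCs because the model needs both inputs to refer to the same set of spots.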

jkfo002 commented 3 months ago

Thanks for your quick answer! I have another question. Does it make sense to fit gene expression against a specific PC with a test such as GWR (Geographically Weighted Regression) to find the genes that best align with the gradient I am interested in?

uthsavc commented 3 months ago

Sure. Once you have a coordinate d (e.g. PC4 or the isodepth) that models your gradient, you can run any regression to find genes g that are correlated with d. I think ordinary linear regression would work fine; I am not sure you specifically need geographically weighted regression, since the coordinate d already varies across space.
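A minimal sketch of that per-gene linear regression, using NumPy only. The function name `rank_genes_by_coordinate` and the toy data are hypothetical, and the sketch assumes `X` is an N x G matrix of (for example, log-normalized) expression values and `d` is a non-constant length-N coordinate; for raw counts a different model may be more appropriate.

```python
import numpy as np

def rank_genes_by_coordinate(X, d, gene_names):
    """Fit expression ~ a + b*d per gene with ordinary least squares.

    X: (N, G) expression matrix; d: (N,) coordinate (e.g. PC4 or isodepth).
    Returns a list of (gene, slope, pearson_r) sorted by |r|, largest first.
    """
    X = np.asarray(X, dtype=float)
    d = np.asarray(d, dtype=float)
    dc = d - d.mean()            # centered coordinate
    Xc = X - X.mean(axis=0)      # centered expression, per gene
    cov = dc @ Xc                # unnormalized covariance with d, shape (G,)
    slopes = cov / (dc @ dc)     # OLS slope for each gene
    denom = np.sqrt((dc @ dc) * (Xc ** 2).sum(axis=0))
    # Genes with zero variance get r = 0 instead of a division error.
    r = np.divide(cov, denom, out=np.zeros(X.shape[1]), where=denom > 0)
    order = np.argsort(-np.abs(r))
    return [(gene_names[i], slopes[i], r[i]) for i in order]

# Toy example: one increasing gene, one flat gene, one decreasing gene.
rng = np.random.default_rng(0)
d_demo = np.linspace(0.0, 1.0, 200)
noise = rng.normal(scale=0.1, size=(200, 3))
X_demo = np.column_stack([2 * d_demo, 0 * d_demo, -3 * d_demo]) + noise
ranked = rank_genes_by_coordinate(X_demo, d_demo, ["up", "flat", "down"])
for gene, slope, r in ranked:
    print(f"{gene}: slope={slope:.2f}, r={r:.2f}")
```

Sorting by the absolute correlation puts both increasing and decreasing genes at the top; the sign of the slope tells the two apart.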

That said, we often find that PCs or other coordinates don't vary smoothly across space, compared to the GASTON isodepth, and may sometimes "hide" expression gradients. For example, in an analysis of a MERFISH tissue (see below; this figure will be in our manuscript once it is published in 1-2 months), we found that the "ENVI pseudo-depth", which computes a diffusion component (DC), does not show gradients in Acta2 or Chn2 while GASTON isodepth does.

kohei_merfish.pdf

jkfo002 commented 3 months ago

Awesome! That's really helpful, thanks again for your kind help!