willtownes / nsf-paper

Nonnegative spatial factorization for multivariate count data
GNU Lesser General Public License v3.0

Enquiry about the data in your repo #4

Closed by Yecats77 1 year ago

Yecats77 commented 1 year ago

Excuse me. May I ask where I can get the data used in your code? For example, the data other than the h5ad files, which you indicated is provided by email. Could I get a copy of it?

willtownes commented 1 year ago

Please refer to the information here: https://www.nature.com/articles/s41592-022-01687-w#data-availability

For anything that isn't linked directly, you would need to contact the authors of the original study to request the data.

Yecats77 commented 1 year ago

Thanks for your reply!

Sorry, I think I misunderstood earlier. I can now obtain the data using your data_loading scripts.

I also have a further question. Running your data_loading scripts produces two h5ad files, for example sshippo.h5ad and sshippo_J2000.h5ad. I then use them as input to demo.ipynb to get the spatial factors. Is this the correct usage of your demo?

willtownes commented 1 year ago

The data_loading scripts are for the real data, whereas demo.ipynb uses simulated toy data. To further analyze the sshippo data, refer to the 03_benchmark.ipy and 04_exploratory.ipy scripts. This is a large dataset, so you might want to try the smaller Visium data instead, as it is more amenable to exploration. The sshippo.h5ad file has all genes, whereas the sshippo_J2000.h5ad file has only the top 2,000 most variable genes according to the Poisson deviance criterion.