Closed hiraksarkar closed 2 years ago
Hi @hiraksarkar,
thanks for mentioning this! Please find the directory for data preparation here.
I added the respective notebook for the MERFISH - brain dataset.
We moved data preparation to notebooks, as this is currently only required for the MERFISH - brain dataset. For the remaining datasets no preprocessing was applied. Happy to link you to the original datasets or assist further if needed.
I left a comment about the MERFISH dataset in the tutorial issue (it seems that without the CSV we cannot run NCEM, unfortunately). For the other datasets, it's not entirely clear to me which files should be downloaded and how to provide them to NCEM. Apologies if there is already a tutorial that I have missed.
For example, in the case of the "CODEX cancer" data, the actual paper leads to a dataset directory that contains around 2 TB of processed images. Should I download that? Given that directory, I am not sure how to call NCEM on it. The MERFISH data should be more straightforward, given that I can obtain the metadata.csv file.
Thanks
Hi @hiraksarkar, I added more detailed instructions to the README of ncem_tutorials and ncem_benchmarks on how to access the public datasets.
For the CODEX cancer dataset, only the single-cell data is required to test ncem. It is stored in a different directory and is only 213 MB, so there is no need to download the 2 TB of images.
To run tutorials or data exploration, simply store the files in a directory of your choice and adjust the datadir in the respective notebooks. We stored datasets in folders named by first author. If you follow a different convention, also adjust data_path whenever ncem's get_data function is called.
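To illustrate the folder convention described above, here is a minimal sketch of how the dataset path could be built and checked before running a notebook. The helper names (dataset_dir, list_dataset_files) and the example folder names are hypothetical and not part of ncem; only datadir and data_path are taken from the notebooks.

```python
from pathlib import Path


def dataset_dir(datadir, first_author):
    """Build the folder path for one dataset stored under datadir by first-author name."""
    return Path(datadir).expanduser() / first_author


def list_dataset_files(path):
    """Return the sorted file names in a dataset folder, or fail early if it is missing."""
    path = Path(path)
    if not path.is_dir():
        raise FileNotFoundError(f"Dataset folder not found: {path}")
    return sorted(p.name for p in path.iterdir())


if __name__ == "__main__":
    # Hypothetical location and folder name; adjust both to your own setup.
    data_path = dataset_dir("~/ncem_data", "first_author")
    print(data_path)
```

The resulting path is the value you would assign to data_path in the notebook. Checking the folder up front gives a clearer error than a failure deep inside the data loader.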
Thanks for your comments. We are still enhancing the usability of ncem, so any feedback is highly appreciated.
Hi @AnnaChristina, amazing, I really appreciate the help and am trying this now. I have some questions about the manuscript. Should I just email them, if that's possible? Again, thanks a lot for helping.
Sure! Please feel free to raise additional issues whenever needed. Yes, just email me and @davidsebfischer. Happy to discuss and answer!
Hi @AnnaChristina,
Just wanna mention that the dataloader assumes scMEP_MIBI_singlecell.csv is inside a directory named scMEP_MIBI_singlecell, and not inside the zipped file.
Other than that, it is working fine, although I am using the same variable values as mentioned for the Zheng tutorial.
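For anyone hitting the same layout issue, a minimal sketch of unpacking the archive into a directory named after it, so that the CSV sits directly inside that directory. The archive name scMEP_MIBI_singlecell.zip and the helper extract_into_named_dir are assumptions for illustration; check the actual name of the file you downloaded.

```python
import zipfile
from pathlib import Path


def extract_into_named_dir(zip_path):
    """Extract all files from zip_path into a sibling directory named after the archive."""
    zip_path = Path(zip_path)
    target = zip_path.with_suffix("")  # e.g. .../scMEP_MIBI_singlecell
    target.mkdir(exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.infolist():
            if member.is_dir():
                continue
            # Drop any folder prefix inside the archive so files land directly in target.
            name = Path(member.filename).name
            (target / name).write_bytes(archive.read(member))
    return target


if __name__ == "__main__":
    # Assumed archive name; only runs if the file is present in the working directory.
    archive_path = Path("scMEP_MIBI_singlecell.zip")
    if archive_path.exists():
        print(extract_into_named_dir(archive_path))
```

Note that this flattens the archive, so two files with the same name in different subfolders would overwrite each other. That is fine for an archive holding a single CSV, but not in general.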
Hi @hiraksarkar,
Yes. The dataloader expects certain folder structures. We will enhance this in the future.
Great. You can basically run the tutorial for each dataset with similar parameters. For more explanation on how to set specific parameters (e.g. radius), you can check the data exploration notebooks and the manuscript.
Hi,
I am not able to find the directory data_preparation as described in the README. It would be really helpful if it could be added. Thanks