massonix / HCATonsilData

Provide programmatic access to the tonsil cell atlas datasets

TODO (2022/03/23) #1

Closed by massonix 2 years ago

massonix commented 2 years ago
  1. Rename the files to upload: change "_" in cell type names (ILC_NK, CD8_T, CD4_T, NBCMBC) to "". This way, we can easily extract the cell type name from any file name.
  2. Ensure I can easily create a SingleCellExperiment from the independent slots. Check in particular the "processed" slot, which might be corrupted (epithelial).
  3. Confirm valid metadata. As described here: "When you are satisfied with the representation of your resources in your metadata.csv (or other aptly named csv file) the Bioconductor team member will add the metadata to the production database. Confirm the metadata csv files in inst/extdata/ are valid by running either ExperimentHubData::makeExperimentHubMetadata()". This can even become a unit test inside testthat.
  4. Send an email to hubs@bioconductor.org. Ask them to check that everything looks good. Ask for the SAS token, which we'll need to upload the data to ExperimentHub via AzureStor.
  5. Once the data lives in ExperimentHub, create functions (HCATonsilData(dataset, cell_type)) to access the data and retrieve a SingleCellExperiment object. Write vignettes to document how this is done, and include a description of all the data modalities.
  6. Generate iSEE instances for every cell type with pre-configured settings: (1) run iSEE(sce), (2) configure the panels to display the most relevant info, (3) click the "download" button to copy and paste the initial code, (4) save said code.
  7. Discuss with Will the best strategy to host the iSEE shiny apps on the web.
  8. Set up a meeting with @federicomarini to discuss next steps (iSEE, SLOcatoR)
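
To illustrate item 1: once cell type names contain no underscores, "_" can act purely as a field separator, so the cell type is simply the first field of the file name. A minimal sketch in base R, assuming a hypothetical `<cell_type>_<assay>_<slot>.rds` layout (the layout and the helper names `clean_cell_type()`, `make_file_name()` and `cell_type_from_file()` are illustrative, not the package's actual convention):

```r
# Cell type names as listed in the issue; everything else below is hypothetical.
cell_types <- c("ILC_NK", "CD8_T", "CD4_T", "NBCMBC")

# Strip underscores so that "_" is only ever a field separator in file names.
clean_cell_type <- function(x) gsub("_", "", x, fixed = TRUE)

# Hypothetical layout: <cell_type>_<assay>_<slot>.rds
make_file_name <- function(cell_type, assay, slot) {
  paste0(paste(clean_cell_type(cell_type), assay, slot, sep = "_"), ".rds")
}

# Recover the cell type as the first "_"-separated field of the base name.
cell_type_from_file <- function(file_name) {
  base <- tools::file_path_sans_ext(basename(file_name))
  unname(vapply(strsplit(base, "_", fixed = TRUE), `[`, character(1), 1))
}

files <- make_file_name(cell_types, assay = "RNA", slot = "counts")
print(files)
print(cell_type_from_file(files))
```

With the original names, "CD8_T_RNA_counts.rds" would split into four fields and the first field alone ("CD8") would no longer identify the cell type.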
federicomarini commented 2 years ago

Once the data lives in ExperimentHub, create functions (HCATonsilData(dataset, cell_type)) to access the data and retrieve a SingleCellExperiment object. Write vignettes to document how this is done, and include a description of all the data modalities.

This might deserve some extra thought in the long run about how the data should be retrieved through the API. For the time being, even if this currently includes only one omics type, it might be meaningful to keep an assay_type parameter, defaulting e.g. to "RNA", and let the function handle internally what needs to be done.
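
Roughly what I have in mind, as a sketch only: the function name `HCATonsilData_sketch()`, the placeholder loader `load_rna_placeholder()`, and the single allowed value "RNA" are all assumptions for illustration, and the placeholder returns a plain list where the real function would query ExperimentHub and build a SingleCellExperiment.

```r
# Placeholder standing in for the real ExperimentHub retrieval (hypothetical).
load_rna_placeholder <- function(dataset, cell_type) {
  list(dataset = dataset, cell_type = cell_type, assay_type = "RNA")
}

# Hypothetical accessor: assay_type defaults to "RNA" and is validated up front,
# so further omics types can be added later without changing the signature.
HCATonsilData_sketch <- function(dataset, cell_type, assay_type = c("RNA")) {
  assay_type <- match.arg(assay_type)
  switch(
    assay_type,
    RNA = load_rna_placeholder(dataset, cell_type)
  )
}

res <- HCATonsilData_sketch("tonsil_atlas", "CD4T")
print(res$assay_type)
```

Adding a new modality would then mean extending the `assay_type` choices and adding one branch to the `switch()`, while existing calls keep working unchanged.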

Generate iSEE instances for every cell type with pre-configured settings: (1) run iSEE(sce), (2) configure the panels to display the most relevant info, (3) click the "download" button to copy and paste the initial code, (4) save said code.

To get a slim startup, you can also use iSEEu's modeEmpty() function on the same sce object. That is pretty much the app running, but without any panels.
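
Something along these lines, as a minimal sketch: it assumes SingleCellExperiment, iSEE and iSEEu are installed from Bioconductor, and the toy `sce` below is made-up data standing in for one of the real per-cell-type objects.

```r
# Assumes these Bioconductor packages are installed.
library(SingleCellExperiment)
library(iSEEu)

# Toy object standing in for one cell type's SingleCellExperiment (made-up data).
set.seed(1)
counts <- matrix(
  rpois(200, lambda = 5), nrow = 20,
  dimnames = list(paste0("gene", 1:20), paste0("cell", 1:10))
)
sce <- SingleCellExperiment(assays = list(counts = counts))

# modeEmpty() builds an iSEE app that starts with no panels displayed.
app <- modeEmpty(sce)

# Launch only in an interactive session.
if (interactive()) shiny::runApp(app)
```

From the empty app you can then add just the panels you want and export the resulting configuration code.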

As for the setup of app folders, files, and all that, ping me again if you need inspiration from the setup I had for the covid_IT portal.