waldronlab / SingleCellMultiModal

Single Cell multimodal data scripts for downloading datasets
https://bioconductor.org/packages/SingleCellMultiModal

ontomap should include function_name and DataType #58

Closed LiNk-NY closed 1 year ago

LiNk-NY commented 1 year ago

@drighelli Hi Dario, I think the map should include terms that are also used in the package so that users can easily relate the code for a particular function to the onto-mapping. For example, it should have a function_name column (added in the current PR) and also a DataType column that corresponds to the DataType argument in each of the functions, if possible. Can you take a look at the map and separate dataset_name into those values? Can you also double-check that each function name corresponds to the right annotation (esp. for CITEseq; I may have gotten that one wrong)? You can read the map in with:

read.delim("inst/extdata/ontomap.tsv")

after checking out the branch:

git branch -d drighelli-ontomap
git checkout -b drighelli-ontomap origin/drighelli-ontomap

PS. I used the same branch name, so you may have to delete your local copy first (as above). Thanks!
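A minimal sketch of the split requested above, run on a toy data frame rather than the real `inst/extdata/ontomap.tsv`. The `dataset_name` values and their `function_DataType` layout are assumptions made for illustration; the actual column contents may be laid out differently and would need their own parsing rule.

```r
# Toy stand-in for the ontomap; the values and their "function_DataType"
# layout are hypothetical, not taken from the real file.
ontomap <- data.frame(
    dataset_name = c("scNMT_mouse_gastrulation", "CITEseq_cord_blood"),
    stringsAsFactors = FALSE
)

# Everything before the first underscore is taken as the function name ...
ontomap$function_name <- sub("_.*$", "", ontomap$dataset_name)

# ... and everything after the first underscore as the DataType value.
ontomap$DataType <- sub("^[^_]*_", "", ontomap$dataset_name)

ontomap
```

The original `dataset_name` column is left in place, so the new columns can be checked against it by eye.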

drighelli commented 1 year ago

Hi Marcel, @LiNk-NY

I think it's a good idea, I'm gonna reshape the map with DataType and function_name, but we have to point out clearly in the documentation/paper that the identifier of each dataset is given by the combination of these two columns together.
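To illustrate why both columns are needed, here is a small sketch on a toy table. The rows and the `cell_label` column are hypothetical placeholders, not the contents of the real ontomap.

```r
# Toy map: two datasets share the same function_name.
# cell_label is a hypothetical placeholder column.
ontomap <- data.frame(
    function_name = c("CITEseq", "CITEseq", "scNMT"),
    DataType = c("cord_blood", "peripheral_blood", "mouse_gastrulation"),
    cell_label = c("label_a", "label_b", "label_c"),
    stringsAsFactors = FALSE
)

# function_name alone is ambiguous: it matches rows from two datasets.
by_function <- ontomap[ontomap$function_name == "CITEseq", ]
nrow(by_function)

# Filtering on both columns together selects a single dataset.
cord <- ontomap[
    ontomap$function_name == "CITEseq" & ontomap$DataType == "cord_blood",
]
cord
```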

As for the CITEseq and G&Tseq datasets, unfortunately we don't have annotated cell types, so we have to leave the ontomap as it is for now.

Maybe, it could be a good idea to provide "instructions" or similar for extending the ontomap in the future.

LiNk-NY commented 1 year ago

> I think it's a good idea, I'm gonna reshape the map with DataType and function_name, but we have to point out clearly in the documentation/paper that the identifier of each dataset is given by the combination of these two columns together.

That's okay. We can keep that column in the data as well.

> Maybe, it could be a good idea to provide "instructions" or similar for extending the ontomap in the future.

This can be a note in the documentation or perhaps an example script saved in inst/scripts/. I have a template of that already.
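As a rough idea of what such an example script might contain (this is not the existing template), the sketch below appends rows to a tab-separated map and writes it back. It works on a temporary file so it can be run anywhere; the column names other than function_name and DataType, and all of the values, are hypothetical.

```r
# Stand-in for inst/extdata/ontomap.tsv so the sketch is self-contained.
map_file <- tempfile(fileext = ".tsv")
write.table(
    data.frame(
        function_name = "scNMT",
        DataType = "mouse_gastrulation",
        cell_label = "label_a",  # hypothetical column
        stringsAsFactors = FALSE
    ),
    map_file, sep = "\t", quote = FALSE, row.names = FALSE
)

# Read the current map.
current <- read.delim(map_file, stringsAsFactors = FALSE)

# Rows for a new dataset; "newDataset" and "new_type" are made-up names.
new_rows <- data.frame(
    function_name = "newDataset",
    DataType = "new_type",
    cell_label = "label_b",
    stringsAsFactors = FALSE
)

# Refuse to extend the map if the columns do not line up.
stopifnot(identical(names(new_rows), names(current)))

# Append and write back in the same tab-separated layout.
extended <- rbind(current, new_rows)
write.table(extended, map_file, sep = "\t", quote = FALSE, row.names = FALSE)
```

Checking the column names before `rbind` keeps a contributor from silently adding a row with a misspelled or missing column.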

drighelli commented 1 year ago

Thanks Marcel!