Something to identify the dataset - currently the h5ad link
Author cell type fields
In future:
author cell type field present: T/F (update SOP - there should be a row with blank author cell type field(s))
To deal with version changes, a CxG link is needed
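A minimal sketch of how the proposed T/F flag could be derived from a row of the curation sheet. The column names (`h5ad_link`, `author_cell_type_fields`) and the comma-separated encoding of multiple fields are assumptions for illustration, not the current sheet layout:

```python
def author_cell_type_field_present(row: dict) -> bool:
    """Return True if the row names at least one author cell type field.

    Assumes (hypothetically) that the fields are stored as a
    comma-separated string under the key 'author_cell_type_fields'.
    """
    raw = row.get("author_cell_type_fields") or ""
    return any(part.strip() for part in raw.split(","))


# Hypothetical rows: one with author fields, one deliberately left blank
rows = [
    {"h5ad_link": "https://example.org/a.h5ad",
     "author_cell_type_fields": "cell_type_original, subclass"},
    {"h5ad_link": "https://example.org/b.h5ad",
     "author_cell_type_fields": ""},
]

flags = [author_cell_type_field_present(r) for r in rows]
```

A row with a blank field then yields False, which matches the SOP change suggested above (a row is still recorded, but flagged as having no author cell type field).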
Dataset (individual datasets within larger group):
Description: The specific name of the dataset being curated within a larger dataset group.
Example: "Single cell transcriptional and chromatin accessibility profiling redefine cellular heterogeneity in the adult human kidney - ATACseq"
Full dataset name (top of page):
Description: The full descriptive name of the dataset that should be used for documentation and display.
Example: "Single cell transcriptional and chromatin accessibility profiling redefine cellular heterogeneity in the adult human kidney"
Curators need something readable to work with, so they know what they have curated
Ugur needs a specific key to look up. It may also be useful to have two keys, one human-readable and one not, to cross-check. Right now he is using only the h5ad link, which is sensitive to version and might change.
A record of what's been curated (although this can also be generated from reports)
Loading a dataset where there is no author cell type category but there is a CL annotation
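The two-key idea in the use cases above could look roughly like this: look the record up by the CxG link, then cross-check it against the human-readable dataset name. This is only a sketch; the `DatasetRecord` class, the `find_record` helper, and the example URLs are hypothetical and not part of any existing tooling:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DatasetRecord:
    dataset_name: str  # human-readable key, for curators
    cxg_link: str      # lookup key, used to deal with version changes
    h5ad_link: str     # version-sensitive, so not used as a key here


def find_record(records, cxg_link: str,
                expected_name: Optional[str] = None) -> DatasetRecord:
    """Look up a record by CxG link; optionally cross-check the name."""
    matches = [r for r in records if r.cxg_link == cxg_link]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one record for {cxg_link!r}, "
            f"found {len(matches)}"
        )
    record = matches[0]
    if expected_name is not None and record.dataset_name != expected_name:
        raise ValueError(
            f"name mismatch for {cxg_link!r}: "
            f"{record.dataset_name!r} != {expected_name!r}"
        )
    return record


# Hypothetical example data
records = [
    DatasetRecord(
        dataset_name="Adult human kidney - ATACseq",
        cxg_link="https://example.org/collections/abc/dataset-1",
        h5ad_link="https://example.org/files/dataset-1-v2.h5ad",
    ),
]

record = find_record(
    records,
    "https://example.org/collections/abc/dataset-1",
    expected_name="Adult human kidney - ATACseq",
)
```

Keeping the h5ad link as a plain attribute rather than a key means a new h5ad version changes only one value in the record, not the way it is looked up.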
Not needed:
DOI not needed as can get it from CxG link
study short name
CxG Dataset Collection <-- keep because helpful in debugging
h5ad should be the latest version which may not be the same as the CxG dataset link.
content
Suggestion:
Remove 'content' column and include only 'author cell type field' column. Having another column with only the entry 'cell type' would be pointless.
Ugur needs only:
In future: author cell type field present: T/F (update SOP - there should be a row with blank author cell type field(s))
To deal with version changes, a CxG link is needed
Editors need:
Details: https://github.com/Cellular-Semantics/CL_KG/blob/main/docs/dataset_curation_guidelines.md
1. DataSet identification:
We have 7 fields:
Do we need them all?
Use cases:
Not needed:
h5ad should be the latest version which may not be the same as the CxG dataset link.
content
Suggestion: