cellannotation / cell-annotation-schema

General, open-standard schema for cell annotations

Generate small anndata file for testing JSON --> Anndata scripts #24

Closed: dosumis closed this issue 8 months ago

dosumis commented 11 months ago

Options:

dosumis commented 11 months ago

Suggested test

Here is some very simple JSON to flatten:

{
  "author": "Kimberly Siletti",
  "labelset": [
    {
      "cellannotation_setname": "supercluster_term",
      "cell_label": "Vascular",
      "cell_fullname": "vascular cell",
      "cell_type": "endothelial cell of vascular tree",
      "cell_type_ontology_term_id": "CL:0002139",
      "marker_gene_evidence": ["CLDN5", "ACTA2"]
    },
    {
      "cellannotation_setname": "Subtype auto-annotation",
      "cell_label": "VENOUS",
      "cell_fullname": "Vein endothelial cell",
      "cell_type": "vein endothelial cell",
      "cell_type_ontology_term_id": "CL:0002543",
      "marker_gene_evidence": ["PECAM1", "ACKR1", "IL1R1"]
    }
  ]
}
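
The thread does not spell out what "flatten" should produce, so the following is only a sketch of one possible reading: one flat row per labelset entry, with top-level fields such as "author" repeated on each row. The function name flatten_labelset and the "|" separator for list values are assumptions made for illustration, not part of the schema or its scripts.

```python
import json

# The example JSON from this comment, embedded so the sketch is self-contained.
doc = json.loads("""
{
  "author": "Kimberly Siletti",
  "labelset": [
    {
      "cellannotation_setname": "supercluster_term",
      "cell_label": "Vascular",
      "cell_fullname": "vascular cell",
      "cell_type": "endothelial cell of vascular tree",
      "cell_type_ontology_term_id": "CL:0002139",
      "marker_gene_evidence": ["CLDN5", "ACTA2"]
    },
    {
      "cellannotation_setname": "Subtype auto-annotation",
      "cell_label": "VENOUS",
      "cell_fullname": "Vein endothelial cell",
      "cell_type": "vein endothelial cell",
      "cell_type_ontology_term_id": "CL:0002543",
      "marker_gene_evidence": ["PECAM1", "ACKR1", "IL1R1"]
    }
  ]
}
""")


def flatten_labelset(doc):
    """Return one flat dict per labelset entry (hypothetical helper)."""
    # Top-level scalar fields (here only "author") are copied onto every row.
    top = {k: v for k, v in doc.items() if k != "labelset"}
    rows = []
    for entry in doc.get("labelset", []):
        row = dict(top)
        for key, value in entry.items():
            # Assumed convention: join list values into a single string.
            row[key] = "|".join(value) if isinstance(value, list) else value
        rows.append(row)
    return rows


rows = flatten_labelset(doc)
for row in rows:
    print(row["cellannotation_setname"], row["cell_label"], row["marker_gene_evidence"])
```

Each resulting row could then be matched against a set of cell IDs and written into AnnData obs columns, though how the real scripts name those columns is not stated here.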

To generate an AnnData file:

cell IDs for obs["supercluster-term"] == "Vascular" => cell IDs for "cellannotation_setname": "supercluster_term", "cell_label": "Vascular"
cell IDs for obs["cluster_id"] == 17 => cell IDs for "cellannotation_setname": "Subtype auto-annotation", "cell_label": "VENOUS"

To make a usably small matrix, create a new AnnData file by slicing to select only cells with obs["supercluster-term"] == "Vascular".

evanbiederstedt commented 11 months ago

Note that Scanpy provides example datasets:

import scanpy as sc
adata = sc.datasets.pbmc68k_reduced() 

@ubyndr @hkir-dev