Generate small anndata file for testing JSON --> Anndata scripts

dosumis commented 11 months ago

Options:

Make a small slice of the Siletti file that we can use with the current test examples pulled from Siletti.
Generate a slice using CellxGene Census.

dosumis commented 11 months ago

Suggested test

Here is some very simple JSON to flatten:

{
  "author": "Kimberly Siletti",
  "labelset": [
    {
      "cellannotation_setname": "supercluster_term",
      "cell_label": "Vascular",
      "cell_fullname": "vascular cell",
      "cell_type": "endothelial cell of vascular tree",
      "cell_type_ontology_term_id": "CL:0002139",
      "marker_gene_evidence": ["CLDN5", "ACTA2"]
    },
    {
      "cellannotation_setname": "Subtype auto-annotation",
      "cell_label": "VENOUS",
      "cell_fullname": "Vein endothelial cell",
      "cell_type": "vein endothelial cell",
      "cell_type_ontology_term_id": "CL:0002543",
      "marker_gene_evidence": ["PECAM1", "ACKR1", "IL1R1"]
    }
  ]
}
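As a rough illustration of what "flatten" could mean here, the sketch below turns the JSON above into one flat record per labelset entry. The helper name `flatten_labelsets`, the choice to repeat top-level fields on every row, and the comma-joining of list values are all assumptions for illustration, not the behaviour of the actual JSON --> AnnData scripts.

```python
import json

# The JSON from the comment above, trimmed to a few fields for brevity.
doc = json.loads("""
{
  "author": "Kimberly Siletti",
  "labelset": [
    {
      "cellannotation_setname": "supercluster_term",
      "cell_label": "Vascular",
      "cell_type_ontology_term_id": "CL:0002139",
      "marker_gene_evidence": ["CLDN5", "ACTA2"]
    },
    {
      "cellannotation_setname": "Subtype auto-annotation",
      "cell_label": "VENOUS",
      "cell_type_ontology_term_id": "CL:0002543",
      "marker_gene_evidence": ["PECAM1", "ACKR1", "IL1R1"]
    }
  ]
}
""")


def flatten_labelsets(doc):
    """Hypothetical helper: one flat dict per labelset entry.

    Top-level fields (e.g. "author") are repeated on every row.
    """
    top_level = {k: v for k, v in doc.items() if k != "labelset"}
    rows = []
    for entry in doc["labelset"]:
        row = dict(top_level)
        for key, value in entry.items():
            # Join lists (e.g. marker genes) so every field holds a scalar.
            row[key] = ",".join(value) if isinstance(value, list) else value
        rows.append(row)
    return rows


rows = flatten_labelsets(doc)
for row in rows:
    print(row["cellannotation_setname"], row["cell_label"], row["marker_gene_evidence"])
```

Each resulting row could then be matched to cells and written into `obs` columns, which is the part a test AnnData file is needed for.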

To generate an AnnData file:

Download this file: https://cellxgene.cziscience.com/e/b165f033-9dec-468a-9248-802fc6902a74.cxg/

Cell IDs for obs["supercluster-term"] == "Vascular"
Cell IDs for obs["cluster_id"] == 17 => cell IDs for "cellannotation_setname": "Subtype auto-annotation", "cell_label": "VENOUS"

To make a usably small matrix, create a new AnnData file by slicing to select only cells with obs["supercluster-term"] == "Vascular".

evanbiederstedt commented 11 months ago

Note that Scanpy provides example datasets:

import scanpy as sc
adata = sc.datasets.pbmc68k_reduced()

@ubyndr @hkir-dev

cellannotation / cell-annotation-schema

Generate small anndata file for testing JSON --> Anndata scripts #24