Open ergonyc opened 4 months ago
We ought to compile the CellRanger QC metrics into tabular form and pass them along as part of the metadata. Potentially also some of the auxiliary CellBender metrics.
TODO: enumerate which metrics
I think we need to add a "pre-filtered" summary of the metadata and possibly an unfiltered "merged" adata object, i.e., the output of plot_qc_metrics.py. That will offer users an easy entry point for alternative processing / filtering. For example, a user who wanted to develop an "atlas" of cells / cell types for a specific brain region could harmonize a subset of the overall cohort.
We should also update scvi-tools to v1.15 and Python to 3.12 at some point. Maybe after the next release... maybe with the other updates / testing
There's a subtle bug in the cellassign code which estimates the library size parameter after subsetting to the marker genes. This should be calculated before subsetting.
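To make the ordering issue concrete, here is a minimal sketch on toy data, not the pipeline's actual code. It assumes the size factor is the per-cell total count divided by the mean total count across cells (a common convention); the function name `size_factors`, the toy matrix, and `marker_idx` are hypothetical.

```python
import numpy as np


def size_factors(counts: np.ndarray) -> np.ndarray:
    """Per-cell library size scaled so the mean across cells is 1 (assumed convention)."""
    lib_size = counts.sum(axis=1)
    return lib_size / lib_size.mean()


# Toy counts: 3 cells x 4 genes. Columns 0 and 1 stand in for the marker genes.
counts = np.array(
    [
        [10, 0, 30, 60],
        [5, 5, 5, 5],
        [0, 20, 40, 20],
    ],
    dtype=float,
)
marker_idx = [0, 1]

# Intended order: estimate the library size on all genes, then subset.
correct = size_factors(counts)
marker_counts = counts[:, marker_idx]

# Buggy order: subset first, so only marker-gene counts contribute.
buggy = size_factors(counts[:, marker_idx])

print("all genes:   ", correct)
print("markers only:", buggy)
```

With only marker genes contributing, a cell's size factor reflects how strongly it expresses the markers rather than its sequencing depth, which is why the estimate has to come from the full matrix.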
Looks like grouping the v2.0 release changes into these categories will be useful:
Final code changes:
samples
rather than batch_id
#70 also updated in opening comment.
@kfang4
Here's a summary of the artifacts that should be curated...
Added artifacts:
entry point 1
${cohort_id}.merged_adata_object.h5ad
feature metadata
scvi model
${cohort_id}_scvi_model.tar.gz
entry point 2
asap-dev-data-{cohort,team-xxyy}
├── cohort_analysis
│ ├── ${cohort_id}.sample_list.tsv
│ ├── ${cohort_id}.initial_metadata.csv
│ ├── ${cohort_id}.merged_adata_object.h5ad
│ ├── ${cohort_id}.doublet_score.violin.png
│ ├── ${cohort_id}.n_genes_by_counts.violin.png
│ ├── ${cohort_id}.pct_counts_mt.violin.png
│ ├── ${cohort_id}.pct_counts_rb.violin.png
│ ├── ${cohort_id}.total_counts.violin.png
│ ├── ${cohort_id}.final_validation_metrics.csv
│ ├── ${cohort_id}.all_genes.csv
│ ├── ${cohort_id}.hvg_genes.csv
│ ├── ${cohort_id}_scvi_model.tar.gz
│ ├── ${cohort_id}.cell_types.csv
│ ├── ${cohort_id}.final_adata.h5ad
│ ├── ${cohort_id}.final_metadata.csv
│ ├── ${cohort_id}.scib_report.csv
│ ├── ${cohort_id}.scib_results.svg
│ ├── ${cohort_id}.features.umap.png
│ ├── ${cohort_id}.groups.umap.png
│ └── MANIFEST.tsv
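A curated `cohort_analysis` directory could be checked against the tree above with something like the following. This is only a sketch: the helper names `expected_artifacts` and `missing_artifacts` are hypothetical, and the file list is copied from the tree rather than from any workflow definition.

```python
from pathlib import Path

# Suffixes appended to the cohort ID, taken from the tree above.
# Note that the scvi model uses "_" rather than "." as the separator.
ARTIFACT_SUFFIXES = [
    ".sample_list.tsv",
    ".initial_metadata.csv",
    ".merged_adata_object.h5ad",
    ".doublet_score.violin.png",
    ".n_genes_by_counts.violin.png",
    ".pct_counts_mt.violin.png",
    ".pct_counts_rb.violin.png",
    ".total_counts.violin.png",
    ".final_validation_metrics.csv",
    ".all_genes.csv",
    ".hvg_genes.csv",
    "_scvi_model.tar.gz",
    ".cell_types.csv",
    ".final_adata.h5ad",
    ".final_metadata.csv",
    ".scib_report.csv",
    ".scib_results.svg",
    ".features.umap.png",
    ".groups.umap.png",
]


def expected_artifacts(cohort_id: str) -> list[str]:
    """File names expected in cohort_analysis for one cohort, plus the manifest."""
    return [f"{cohort_id}{suffix}" for suffix in ARTIFACT_SUFFIXES] + ["MANIFEST.tsv"]


def missing_artifacts(directory: str, cohort_id: str) -> list[str]:
    """Return the expected file names that are not present in the directory."""
    base = Path(directory)
    return [name for name in expected_artifacts(cohort_id) if not (base / name).is_file()]
```

For example, `missing_artifacts("cohort_analysis", "my_cohort")` would return an empty list once every artifact is in place (`my_cohort` is a placeholder ID).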
For the August release of ASAP CRN datasets, several small updates have been made to the PMDBS workflows.
IMPLEMENTED
WIP
Bug fixes
Curation
Enhancements
Upgrades
Unresolved. Pushed to v3.0