ASAP-CRN / pmdbs-sc-rnaseq-wf

Repo for testing and developing a common postmortem-derived brain sequencing (PMDBS) workflow harmonized across ASAP
Apache License 2.0

v.2.0 dataset release - wf updates #57

Open ergonyc opened 4 months ago

ergonyc commented 4 months ago

For the August release of ASAP CRN datasets, several small updates have been made to the PMDBS workflows.

IMPLEMENTED

WIP

Bug fixes

Curation

Enhancements

Upgrades

Unresolved. Pushed to v3.0

curated artifacts summary

Added artifacts:

entry point 1

scvi model

asap-dev-data-{cohort,team-xxyy}
├── cohort_analysis
│   ├── ${cohort_id}.sample_list.tsv
│   ├── ${cohort_id}.initial_metadata.csv
│   ├── ${cohort_id}.merged_adata_object.h5ad
│   ├── ${cohort_id}.doublet_score.violin.png
│   ├── ${cohort_id}.n_genes_by_counts.violin.png
│   ├── ${cohort_id}.pct_counts_mt.violin.png
│   ├── ${cohort_id}.pct_counts_rb.violin.png
│   ├── ${cohort_id}.total_counts.violin.png
│   ├── ${cohort_id}.final_validation_metrics.csv
│   ├── ${cohort_id}.all_genes.csv
│   ├── ${cohort_id}.hvg_genes.csv
│   ├── ${cohort_id}_scvi_model.tar.gz
│   ├── ${cohort_id}.cell_types.csv
│   ├── ${cohort_id}.final_adata.h5ad
│   ├── ${cohort_id}.final_metadata.csv
│   ├── ${cohort_id}.scib_report.csv
│   ├── ${cohort_id}.scib_results.svg
│   ├── ${cohort_id}.features.umap.png
│   ├── ${cohort_id}.groups.umap.png
│   └── MANIFEST.tsv

ergonyc commented 4 months ago

We ought to compile the CellRanger QC metrics into tabular form and pass them along as part of the metadata, potentially together with some of the auxiliary CellBender metrics.

TODO: enumerate which metrics
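A minimal sketch of the compilation step, assuming each sample has a one-row metrics CSV like the `metrics_summary.csv` that CellRanger writes. The function names, sample IDs, and toy values are hypothetical, and which metrics to keep is still the open TODO above, so this just carries every column through.

```python
import csv
import io


def read_metrics(sample_id, handle):
    """Read a one-row metrics CSV and tag the row with its sample ID."""
    row = next(csv.DictReader(handle))
    return {"sample_id": sample_id, **row}


def collect_columns(rows):
    """Union of all column names, in first-seen order, with sample_id first."""
    columns = ["sample_id"]
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    return columns


def write_tsv(columns, rows, handle):
    """Write one line per sample; metrics missing for a sample become NA."""
    writer = csv.DictWriter(
        handle, fieldnames=columns, delimiter="\t", restval="NA", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Toy inputs standing in for per-sample metrics files (values are made up).
sample_a = io.StringIO(
    'Estimated Number of Cells,Mean Reads per Cell\n"5,000","40,000"\n'
)
sample_b = io.StringIO(
    'Estimated Number of Cells,Mean Reads per Cell\n"3,200","55,000"\n'
)

rows = [read_metrics("sample_a", sample_a), read_metrics("sample_b", sample_b)]
columns = collect_columns(rows)

out = io.StringIO()
write_tsv(columns, rows, out)
table = out.getvalue()
```

The same shape would work for the CellBender metrics if they are exported per sample; the `restval="NA"` setting covers samples where a metric is absent.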

ergonyc commented 4 months ago

I think we need to add a "pre-filtered" summary of the metadata and possibly an unfiltered "merged" adata object, i.e., the output of plot_qc_metrics.py. That will offer users an easy entry point for alternative processing / filtering. For example, a user who wanted to develop an "atlas" of cells / cell types for a specific brain region could harmonize a subset of the overall cohort.
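As a rough illustration of that entry point, here is a sketch that picks out the cells for one brain region from a pre-filtered metadata table. The column names (`barcode`, `sample`, `brain_region`) and the toy rows are assumptions for illustration only, not the actual schema of the metadata file.

```python
import csv
import io


def select_cells(handle, column, value, id_column="barcode"):
    """Return the IDs of cells whose metadata `column` equals `value`."""
    return [
        row[id_column] for row in csv.DictReader(handle) if row[column] == value
    ]


# Toy stand-in for a pre-filtered metadata CSV (hypothetical columns and values).
toy_metadata = io.StringIO(
    "barcode,sample,brain_region\n"
    "AAAC-1,s1,region_a\n"
    "AAAG-1,s1,region_b\n"
    "AACT-1,s2,region_a\n"
)

region_a_cells = select_cells(toy_metadata, "brain_region", "region_a")
```

The resulting ID list could then be used to subset the unfiltered merged adata object before applying a user's own filtering and harmonization.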

ergonyc commented 4 months ago

We should also update scvi-tools to v1.15 and Python to 3.12 at some point. Maybe after the next release, together with the other updates / testing.

ergonyc commented 4 months ago

There's a subtle bug in the cellassign code: it estimates the library size parameter after subsetting to the marker genes. The library size should be calculated on the full gene set, before subsetting.
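To make the ordering issue concrete, here is a toy sketch in plain Python. It assumes the size factor is each cell's total count divided by the mean total across cells; the gene names, counts, and helper functions are made up for illustration and are not the workflow's actual code.

```python
from statistics import mean


def size_factors(counts):
    """Per-cell total count divided by the mean total across cells."""
    lib_sizes = [sum(row) for row in counts]
    avg = mean(lib_sizes)
    return [size / avg for size in lib_sizes]


def subset_genes(counts, gene_names, keep):
    """Keep only the columns whose gene name is in `keep`."""
    idx = [i for i, gene in enumerate(gene_names) if gene in keep]
    return [[row[i] for i in idx] for row in counts]


# Toy count matrix: rows are cells, columns are genes (values are made up).
genes = ["GFAP", "ACTB", "SNAP25", "MALAT1"]
counts = [
    [10, 50, 0, 40],    # cell 0, total 100
    [0, 150, 20, 130],  # cell 1, total 300
]
markers = {"GFAP", "SNAP25"}

# Intended order: estimate library size on all genes, then subset.
sf_full = size_factors(counts)
marker_counts = subset_genes(counts, genes, markers)

# Buggy order: the size factor only sees the marker genes.
sf_marker_only = size_factors(marker_counts)
```

In this toy case the two orderings give different size factors for the same cells, which is the discrepancy the fix removes.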

ergonyc commented 1 month ago

Looks like grouping the v2.0 release changes into these categories will be useful:

Bug fixes

Curation

Upgrades

ergonyc commented 3 weeks ago

Final code changes:

Also updated in the opening comment.

ergonyc commented 2 weeks ago

@kfang4

Here's a summary of the artifacts that should be curated...

curated artifacts summary

Added artifacts:

entry point 1

scvi model

asap-dev-data-{cohort,team-xxyy}
├── cohort_analysis
│   ├── ${cohort_id}.sample_list.tsv
│   ├── ${cohort_id}.initial_metadata.csv
│   ├── ${cohort_id}.merged_adata_object.h5ad
│   ├── ${cohort_id}.doublet_score.violin.png
│   ├── ${cohort_id}.n_genes_by_counts.violin.png
│   ├── ${cohort_id}.pct_counts_mt.violin.png
│   ├── ${cohort_id}.pct_counts_rb.violin.png
│   ├── ${cohort_id}.total_counts.violin.png
│   ├── ${cohort_id}.final_validation_metrics.csv
│   ├── ${cohort_id}.all_genes.csv
│   ├── ${cohort_id}.hvg_genes.csv
│   ├── ${cohort_id}_scvi_model.tar.gz
│   ├── ${cohort_id}.cell_types.csv
│   ├── ${cohort_id}.final_adata.h5ad
│   ├── ${cohort_id}.final_metadata.csv
│   ├── ${cohort_id}.scib_report.csv
│   ├── ${cohort_id}.scib_results.svg
│   ├── ${cohort_id}.features.umap.png
│   ├── ${cohort_id}.groups.umap.png
│   └── MANIFEST.tsv
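
For curation, a small check along these lines could confirm that a cohort's `cohort_analysis` folder holds every artifact in the listing above. This is only a sketch: the function names and the `demo_cohort` ID are hypothetical, and the file listing is passed in as plain names rather than read from a bucket.

```python
from string import Template

# File-name templates copied from the cohort_analysis listing.
ARTIFACT_TEMPLATES = [
    "${cohort_id}.sample_list.tsv",
    "${cohort_id}.initial_metadata.csv",
    "${cohort_id}.merged_adata_object.h5ad",
    "${cohort_id}.doublet_score.violin.png",
    "${cohort_id}.n_genes_by_counts.violin.png",
    "${cohort_id}.pct_counts_mt.violin.png",
    "${cohort_id}.pct_counts_rb.violin.png",
    "${cohort_id}.total_counts.violin.png",
    "${cohort_id}.final_validation_metrics.csv",
    "${cohort_id}.all_genes.csv",
    "${cohort_id}.hvg_genes.csv",
    "${cohort_id}_scvi_model.tar.gz",
    "${cohort_id}.cell_types.csv",
    "${cohort_id}.final_adata.h5ad",
    "${cohort_id}.final_metadata.csv",
    "${cohort_id}.scib_report.csv",
    "${cohort_id}.scib_results.svg",
    "${cohort_id}.features.umap.png",
    "${cohort_id}.groups.umap.png",
    "MANIFEST.tsv",
]


def expected_artifacts(cohort_id):
    """Expand every template for one cohort ID."""
    return [Template(t).substitute(cohort_id=cohort_id) for t in ARTIFACT_TEMPLATES]


def missing_artifacts(cohort_id, present):
    """Return expected file names that are absent from `present`."""
    present = set(present)
    return [name for name in expected_artifacts(cohort_id) if name not in present]


# Toy check: pretend everything except the manifest was uploaded.
names = expected_artifacts("demo_cohort")
missing = missing_artifacts("demo_cohort", names[:-1])
```

Keeping the templates in one list means the expected set only has to be edited in one place when artifacts are added in a later release.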