microbiomedata / nmdc-schema

National Microbiome Data Collaborative (NMDC) unified data model
https://microbiomedata.github.io/nmdc-schema/
Creative Commons Zero v1.0 Universal

mixed QC tools and targets #1948

Open turbomam opened 3 months ago

turbomam commented 3 months ago

Looks like I had probably written some yq to list the slots in a schema file:

wget https://raw.githubusercontent.com/GenomicsStandardsConsortium/mixs/v6.2.0/src/mixs/schema/mixs.yam
yq e '.slots | keys' mixs.yaml | sed 's/^- //' | sort > mixs.6.2.slots.txt

turbomam commented 3 months ago

Also did some analysis of the MIxS slots that are actually used. I might have obtained used-73-mixs-slots.csv from a SPARQL query?

awk -F',' 'NR>1 {print $2}' used-73-mixs-slots.csv | sort > used-73-mixs-slot-names.txt
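
A possible next step, sketched under assumptions: with both slot lists on disk, comm can report which slots appear in the schema list but not in the used list. slots_only_in_first is a hypothetical helper name, not something from the original notes. The sketch assumes column 2 of the CSV holds bare slot names that match the yq keys exactly (no quoting, no URI prefixes). It re-sorts both inputs under LC_ALL=C so that sort and comm agree on ordering.

```shell
# Hypothetical helper: print lines found in file $1 but not in file $2.
# Assumes one slot name per line in each file.
slots_only_in_first() {
  a=$(mktemp) && b=$(mktemp) || return 1
  # Re-sort and de-duplicate in the C locale so comm sees consistent ordering.
  LC_ALL=C sort -u "$1" > "$a"
  LC_ALL=C sort -u "$2" > "$b"
  # -2 drops lines unique to the second file, -3 drops lines common to both.
  LC_ALL=C comm -23 "$a" "$b"
  rm -f "$a" "$b"
}
```

With the files above, that would be called as `slots_only_in_first mixs.6.2.slots.txt used-73-mixs-slot-names.txt`. Swapping the arguments would instead list used names that are missing from the 6.2 schema.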
turbomam commented 3 months ago

and obtained the sizes of files like nmdc_mga0rre721_centrifuge_classification.tsv

from perlmutter.nersc.gov:/global/cfs/cdirs/m3408/results/nmdc:mga0rre721/ReadbasedAnalysis
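
The exact command used for the sizes is not recorded here. One way to collect them, as a rough sketch: file_sizes is a hypothetical helper, and the glob pattern in the usage note is an assumption based on the single file name mentioned above.

```shell
# Hypothetical helper: print "<bytes><TAB><path>" for each regular file
# under directory $1 whose name matches the glob pattern $2.
# Assumes paths do not contain newlines.
file_sizes() {
  find "$1" -type f -name "$2" | while IFS= read -r f; do
    # wc -c counts bytes; tr strips the padding some platforms add.
    printf '%s\t%s\n' "$(wc -c < "$f" | tr -d ' ')" "$f"
  done
}
```

Run on the host where the results directory is mounted, for example `file_sizes /global/cfs/cdirs/m3408/results '*_centrifuge_classification.tsv'`.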