turbomam / biosample-xmldb-sqldb

Tools for loading NCBI Biosample into an XML database and then transforming that into a SQL database
MIT License

add taxonomy data, minimally a list of metagenome ids #30

Closed turbomam closed 8 months ago

turbomam commented 8 months ago
runoak -i ncbitaxon.db info "metagenomes"

NCBITaxon:408169 ! metagenomes

turbomam commented 8 months ago

https://incatools.github.io/ontology-access-kit/cli.html

turbomam commented 8 months ago
runoak -i ncbitaxon.db descendants "metagenomes" -p i \
    | sed 's/NCBITaxon://' \
    | sed 's/ ! /\t/' > ncbi_metagenomes.tsv
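
For reference, a minimal sketch of what the two sed steps do to one line of runoak output, using the "NCBITaxon:408169 ! metagenomes" line from the info command above. The helper name to_tsv is hypothetical. It builds the tab with printf rather than relying on \t in the sed replacement, which is assumed here to be a GNU sed extension that other sed implementations may not honour.

```shell
# Hypothetical helper: turn "NCBITaxon:ID ! label" lines into "ID<TAB>label".
to_tsv() {
    tab=$(printf '\t')   # literal tab, portable across sed implementations
    sed -e 's/^NCBITaxon://' -e "s/ ! /${tab}/"
}

# Sample input taken from the `runoak ... info "metagenomes"` output above.
printf 'NCBITaxon:408169 ! metagenomes\n' | to_tsv
```

The result is a two-column line (numeric taxon id, tab, label), which matches the taxon_id and description columns of the ncbi_metagenomes table.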
turbomam commented 8 months ago
CREATE TABLE ncbi_metagenomes (
    taxon_id INTEGER PRIMARY KEY,
    description TEXT
);
turbomam commented 8 months ago

with credentials in a .pgpass file

psql -h localhost -p 15432 -d ncbi_biosamples_feb26 -U postgres

then: \copy ncbi_metagenomes(taxon_id, description) FROM 'ncbi_metagenomes.tsv' WITH (FORMAT csv, DELIMITER E'\t', HEADER false);
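
Because taxon_id is an INTEGER PRIMARY KEY, a non-numeric or repeated id in the TSV would make the \copy fail. A rough pre-load check, as a sketch only: the helper name check_tsv is hypothetical, and it assumes exactly two tab-separated columns per line with no header.

```shell
# Hypothetical helper: succeed only if every line is "<integer><TAB><text>"
# and no taxon id appears more than once.
check_tsv() {
    awk -F '\t' '
        NF != 2 || $1 !~ /^[0-9]+$/ { printf "line %d: malformed\n", NR; bad = 1 }
        seen[$1]++                   { printf "line %d: duplicate id %s\n", NR, $1; bad = 1 }
        END { exit bad }
    ' "$1"
}
```

Usage: check_tsv ncbi_metagenomes.tsv && echo "ok to load"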

turbomam commented 8 months ago

minimal list added