iobis / mgnify-extract


Process FASTA and Biom downloads #2

Closed pieterprovoost closed 1 year ago

pieterprovoost commented 2 years ago

Example analysis MGYA00199483, assuming these are the files required:

| group | name | format | url |
| --- | --- | --- | --- |
| Taxonomic analysis SSU rRNA | Reads encoding SSU rRNA | FASTA | https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_SSU.fasta.gz |
| Taxonomic analysis SSU rRNA | Reads encoding SSU rRNA | TSV | https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_SSU_OTU.tsv |
| Taxonomic analysis SSU rRNA | OTUs, counts and taxonomic assignments for SSU rRNA | JSON Biom | https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_SSU_OTU_TABLE_JSON.biom |
| Taxonomic analysis LSU rRNA | Reads encoding LSU rRNA | FASTA | https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_LSU.fasta.gz |
| Taxonomic analysis LSU rRNA | Reads encoding LSU rRNA | TSV | https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_LSU_OTU.tsv |
| Taxonomic analysis LSU rRNA | OTUs, counts and taxonomic assignments for LSU rRNA | JSON Biom | https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_LSU_OTU_TABLE_JSON.biom |
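
A minimal sketch of how these files could be picked out of the analysis downloads listing (`https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/downloads`). It assumes the payload has a `data` list whose entries carry `attributes` and `links.self`. The names `select_downloads`, `strip_query`, `WANTED_GROUPS` and `WANTED_FORMATS` are hypothetical and not part of this repository. Dropping `?format=json` from the `self` link is also an assumption, made so the result matches the URLs listed above.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the groups and formats of interest, taken from the list in this issue.
WANTED_GROUPS = {"Taxonomic analysis SSU rRNA", "Taxonomic analysis LSU rRNA"}
WANTED_FORMATS = {"FASTA", "TSV", "JSON Biom"}


def strip_query(url):
    """Return the URL without its query string (e.g. without ?format=json)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def select_downloads(payload, groups=WANTED_GROUPS, formats=WANTED_FORMATS):
    """Return one dict per download entry matching the wanted groups and formats."""
    selected = []
    for entry in payload.get("data", []):
        attrs = entry.get("attributes", {})
        file_format = attrs.get("file-format", {})
        group = attrs.get("group-type")
        fmt = file_format.get("name")
        if group in groups and fmt in formats:
            selected.append({
                "group": group,
                "name": attrs.get("description", {}).get("label"),
                "format": fmt,
                "compressed": bool(file_format.get("compression", False)),
                "url": strip_query(entry["links"]["self"]),
            })
    return selected
```

The `compressed` flag is carried along so that a later step can decide whether a file (such as the `.fasta.gz` downloads) needs to be decompressed before parsing.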

This is what the API response looks like (filtered to files above):

```json
{
  "links": {
    "first": "https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/downloads?format=json&page=1",
    "last": "https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/downloads?format=json&page=1",
    "next": null,
    "prev": null
  },
  "data": [
    {
      "type": "analysis-job-downloads",
      "id": "ERR654519_MERGED_FASTQ_SSU.fasta.gz",
      "attributes": {
        "alias": "ERR654519_MERGED_FASTQ_SSU.fasta.gz",
        "file-format": { "name": "FASTA", "extension": "fasta", "compression": true },
        "description": { "label": "Reads encoding SSU rRNA", "description": "All reads encoding SSU rRNA" },
        "group-type": "Taxonomic analysis SSU rRNA",
        "file-checksum": { "checksum": "", "checksum-algorithm": "" }
      },
      "relationships": {
        "pipeline": {
          "data": { "type": "pipelines", "id": "4.1" },
          "links": { "related": "https://www.ebi.ac.uk/metagenomics/api/v1/pipelines/4.1?format=json" }
        }
      },
      "links": { "self": "https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_SSU.fasta.gz?format=json" }
    },
    {
      "type": "analysis-job-downloads",
      "id": "ERR654519_MERGED_FASTQ_SSU_OTU.tsv",
      "attributes": {
        "alias": "ERR654519_MERGED_FASTQ_SSU_OTU.tsv",
        "file-format": { "name": "TSV", "extension": "tsv", "compression": false },
        "description": { "label": "Reads encoding SSU rRNA", "description": "All reads encoding SSU rRNA" },
        "group-type": "Taxonomic analysis SSU rRNA",
        "file-checksum": { "checksum": "", "checksum-algorithm": "" }
      },
      "relationships": {
        "pipeline": {
          "data": { "type": "pipelines", "id": "4.1" },
          "links": { "related": "https://www.ebi.ac.uk/metagenomics/api/v1/pipelines/4.1?format=json" }
        }
      },
      "links": { "self": "https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_SSU_OTU.tsv?format=json" }
    },
    {
      "type": "analysis-job-downloads",
      "id": "ERR654519_MERGED_FASTQ_SSU_OTU_TABLE_JSON.biom",
      "attributes": {
        "alias": "ERR654519_MERGED_FASTQ_SSU_OTU_TABLE_JSON.biom",
        "file-format": { "name": "JSON Biom", "extension": "biom", "compression": false },
        "description": { "label": "OTUs, counts and taxonomic assignments for SSU rRNA", "description": "OTUs and taxonomic assignments for SSU rRNA" },
        "group-type": "Taxonomic analysis SSU rRNA",
        "file-checksum": { "checksum": "", "checksum-algorithm": "" }
      },
      "relationships": {
        "pipeline": {
          "data": { "type": "pipelines", "id": "4.1" },
          "links": { "related": "https://www.ebi.ac.uk/metagenomics/api/v1/pipelines/4.1?format=json" }
        }
      },
      "links": { "self": "https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_SSU_OTU_TABLE_JSON.biom?format=json" }
    },
    {
      "type": "analysis-job-downloads",
      "id": "ERR654519_MERGED_FASTQ_LSU.fasta.gz",
      "attributes": {
        "alias": "ERR654519_MERGED_FASTQ_LSU.fasta.gz",
        "file-format": { "name": "FASTA", "extension": "fasta", "compression": true },
        "description": { "label": "Reads encoding LSU rRNA", "description": "All reads encoding LSU rRNA" },
        "group-type": "Taxonomic analysis LSU rRNA",
        "file-checksum": { "checksum": "", "checksum-algorithm": "" }
      },
      "relationships": {
        "pipeline": {
          "data": { "type": "pipelines", "id": "4.1" },
          "links": { "related": "https://www.ebi.ac.uk/metagenomics/api/v1/pipelines/4.1?format=json" }
        }
      },
      "links": { "self": "https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_LSU.fasta.gz?format=json" }
    },
    {
      "type": "analysis-job-downloads",
      "id": "ERR654519_MERGED_FASTQ_LSU_OTU.tsv",
      "attributes": {
        "alias": "ERR654519_MERGED_FASTQ_LSU_OTU.tsv",
        "file-format": { "name": "TSV", "extension": "tsv", "compression": false },
        "description": { "label": "Reads encoding LSU rRNA", "description": "All reads encoding LSU rRNA" },
        "group-type": "Taxonomic analysis LSU rRNA",
        "file-checksum": { "checksum": "", "checksum-algorithm": "" }
      },
      "relationships": {
        "pipeline": {
          "data": { "type": "pipelines", "id": "4.1" },
          "links": { "related": "https://www.ebi.ac.uk/metagenomics/api/v1/pipelines/4.1?format=json" }
        }
      },
      "links": { "self": "https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_LSU_OTU.tsv?format=json" }
    },
    {
      "type": "analysis-job-downloads",
      "id": "ERR654519_MERGED_FASTQ_LSU_OTU_TABLE_JSON.biom",
      "attributes": {
        "alias": "ERR654519_MERGED_FASTQ_LSU_OTU_TABLE_JSON.biom",
        "file-format": { "name": "JSON Biom", "extension": "biom", "compression": false },
        "description": { "label": "OTUs, counts and taxonomic assignments for LSU rRNA", "description": "OTUs and taxonomic assignments for LSU rRNA" },
        "group-type": "Taxonomic analysis LSU rRNA",
        "file-checksum": { "checksum": "", "checksum-algorithm": "" }
      },
      "relationships": {
        "pipeline": {
          "data": { "type": "pipelines", "id": "4.1" },
          "links": { "related": "https://www.ebi.ac.uk/metagenomics/api/v1/pipelines/4.1?format=json" }
        }
      },
      "links": { "self": "https://www.ebi.ac.uk/metagenomics/api/v1/analyses/MGYA00199483/file/ERR654519_MERGED_FASTQ_LSU_OTU_TABLE_JSON.biom?format=json" }
    }
  ],
  "meta": { "pagination": { "page": 1, "pages": 1, "count": 22 } }
}
```

pieterprovoost commented 2 years ago

GBIF provided example: