grimbough / biomaRt

R package providing query functionality to BioMart instances like Ensembl
https://bioconductor.org/packages/biomaRt/

Bacterial genomes mart #88

Closed by phisanti 1 month ago

phisanti commented 1 year ago

I was wondering whether it is possible to use biomaRt to get bacterial genome information. Currently, I get the following error:

# For animal genomes

standard_mart <- useMart(biomart = "ensembl", host = "https://www.ensembl.org")
standard_datasets <- listDatasets(standard_mart)
print(head(standard_datasets))

                       dataset                           description     version
1 abrachyrhynchus_gene_ensembl Pink-footed goose genes (ASM259213v1) ASM259213v1
2     acalliptera_gene_ensembl      Eastern happy genes (fAstCal1.2)  fAstCal1.2
3   acarolinensis_gene_ensembl       Green anole genes (AnoCar2.0v2) AnoCar2.0v2
4    acchrysaetos_gene_ensembl       Golden eagle genes (bAquChr1.2)  bAquChr1.2
5    acitrinellus_gene_ensembl        Midas cichlid genes (Midas_v5)    Midas_v5
6    amelanoleuca_gene_ensembl       Giant panda genes (ASM200744v2) ASM200744v2

# For bacterial genomes

bacteria_mart <- useMart(biomart = "bacteria_mart", host = "https://bacteria.ensembl.org")
bacteria_datasets <- listDatasets(bacteria_mart)
print(bacteria_datasets)

Error in bmRequest(request = request, httr_config = httr_config, verbose = verbose) : 
  Not Found (HTTP 404).

grimbough commented 1 month ago

Unfortunately, the BioMart service does not scale well to the volume of bacterial genomes in Ensembl, so no BioMart interface is provided for Ensembl Bacteria. There are a few more details available at https://support.bioconductor.org/p/82585/#82586
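
To check up front whether a given host exposes a BioMart service, rather than hitting the 404 inside a longer script, something like the sketch below may help. It is only an illustration: `has_biomart` and its `lister` argument are hypothetical names, not part of biomaRt, and it assumes that `listMarts()` raises an error for hosts without the service, as in the report above.

```r
# Hypothetical helper (not part of biomaRt): returns TRUE if `host` appears
# to expose a BioMart service, FALSE if the request fails (e.g. HTTP 404)
# or no marts are listed.
# `lister` defaults to biomaRt::listMarts; it is only evaluated when no
# replacement is supplied, which makes the helper testable offline.
has_biomart <- function(host, lister = biomaRt::listMarts) {
  tryCatch({
    marts <- lister(host = host)
    is.data.frame(marts) && nrow(marts) > 0
  }, error = function(e) FALSE)
}

# Example calls (need network access and the biomaRt package installed):
# has_biomart("https://www.ensembl.org")
# has_biomart("https://bacteria.ensembl.org")
```

Swallowing the error and returning FALSE keeps the check simple, but it also hides unrelated failures such as a timeout, so a FALSE here means "could not list marts", not necessarily "no BioMart exists".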