grimbough / biomaRt

R package providing query functionality to BioMart instances like Ensembl
https://bioconductor.org/packages/biomaRt/

getBM(attributes="minor_allele") returns only NA values #101

Open Nerissa-258 opened 3 months ago

I am new to the biomaRt package and have run into a problem.

I am trying to convert rsIDs to chr_pos_A1_A2 format using the biomaRt package in RStudio. Package version:

> packageVersion("biomaRt")
[1] ‘2.50.3’

And my script looks like this:

library(biomaRt)
library(data.table)
setwd("D:/R/Rproject/biomart/2023/")
listMarts() # available databases
#2024
# biomart                version
# 1 ENSEMBL_MART_ENSEMBL      Ensembl Genes 111
# 2   ENSEMBL_MART_MOUSE      Mouse strains 111
# 3     ENSEMBL_MART_SNP  Ensembl Variation 111
# 4 ENSEMBL_MART_FUNCGEN Ensembl Regulation 111

snp_mart = useMart("ENSEMBL_MART_SNP", dataset="hsapiens_snp")
listDatasets(mart = snp_mart)
# dataset
# 12            hsapiens_snp                    Human Short Variants (SNPs and indels excluding flagged variants) (GRCh38.p14)
# 13        hsapiens_snp_som            Human Somatic Short Variants (SNPs and indels excluding flagged variants) (GRCh38.p14)
# 14      hsapiens_structvar                                                            Human Structural Variants (GRCh38.p14)
# 15  hsapiens_structvar_som                                                    Human Somatic Structural Variants (GRCh38.p14)

raw1<-read.table("B_COV.txt",sep='\t',header=TRUE)
head(raw1)
# GeneID        rsID Lineage Condition        beta     p.value
# 1 ENSG00000166750  rs11080327       B       COV  1.45779163 1.35935e-74
# 2 ENSG00000010030   rs6905318       B       COV  0.90682341 2.20558e-47
# 3 ENSG00000010030   rs2234071       B       COV  0.91898108 5.40314e-48
# 4 ENSG00000196756   rs3752278       B       COV  0.81958042 1.30359e-44
# 5 ENSG00000111639  rs11610774       B       COV -0.09112787 1.66999e-07
# 6 ENSG00000183172 rs112903584       B       COV -0.38784721 2.75289e-46

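# Assumption: snp_id was not defined in the posted script; it is taken here to be the rsID column of raw1
snp_id <- raw1$rsID
aq <- getBM(
  attributes = c("refsnp_id", "chr_name", "chrom_start", "chrom_end",
                 "allele", "minor_allele", "minor_allele_freq"),
  filters = "snp_filter",
  values = snp_id,
  mart = snp_mart
)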

I expected to obtain the minor alleles for these rsIDs. However, every value returned in the 'minor_allele' and 'minor_allele_freq' columns is NA.

The output looks like this:

head(aq)
# refsnp_id chr_name chrom_start chrom_end allele minor_allele minor_allele_freq
# 1  rs225123        1     7987608   7987608  G/A/C           NA                NA
# 2  rs226249        1     7961718   7961718  T/C/G           NA                NA
# 3  rs237802        1   229326054 229326054    G/T           NA                NA
# 4  rs237819        1   229316640 229316640  C/G/T           NA                NA
# 5  rs584158        1   182633925 182633925  T/A/G           NA                NA
# 6  rs663045        1   108200437 108200437  G/A/C           NA                NA
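
To check whether the missing values are limited to the two minor-allele columns, the NA count per column can be tallied. The following is only a minimal sketch: `aq_head` is a hypothetical stand-in for the real `aq`, re-entered by hand from the six rows shown above, so it runs without a BioMart connection.

```r
# Hypothetical stand-in for the first six rows of `aq`, copied from the output above
aq_head <- data.frame(
  refsnp_id         = c("rs225123", "rs226249", "rs237802",
                        "rs237819", "rs584158", "rs663045"),
  chr_name          = c("1", "1", "1", "1", "1", "1"),
  chrom_start       = c(7987608, 7961718, 229326054,
                        229316640, 182633925, 108200437),
  chrom_end         = c(7987608, 7961718, 229326054,
                        229316640, 182633925, 108200437),
  allele            = c("G/A/C", "T/C/G", "G/T", "C/G/T", "T/A/G", "G/A/C"),
  minor_allele      = NA_character_,  # recycled to all six rows
  minor_allele_freq = NA_real_,       # recycled to all six rows
  stringsAsFactors  = FALSE
)

# Count missing values in each column
na_counts <- colSums(is.na(aq_head))
print(na_counts)
```

Running the same `colSums(is.na(aq))` on the full result would show whether any rsID at all has a minor allele reported, or whether the two columns are empty for every row.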

I am really confused by the result and don't know how to fix it.

Could anyone provide some suggestions about this issue? Thanks in advance!!