Closed mxw010 closed 3 years ago
Thanks for the report, @cherrywang1006.
Can you share your sessionInfo(), please?
Hi,
I replicated this issue at home on a Windows 10 machine:
ncbi_snp_query("rs1610720")
Query Chromosome Marker Class Gene Alleles Major Minor MAF BP AncestralAllele
1 rs1610720 6 rs1610720 snp HCG4 G/T C A 0.3848 29793285 <NA>
And this is my session info:
> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] rsnps_0.2.0
loaded via a namespace (and not attached):
[1] httr_1.3.1 compiler_3.5.0 magrittr_1.5 plyr_1.8.4 R6_2.2.2 tools_3.5.0 curl_3.2
[8] Rcpp_0.12.17 stringi_1.1.7 stringr_1.3.1 XML_3.98-1.11
And I found another SNP with the same problem on a different chr:
ncbi_snp_query("rs2233691")
Query Chromosome Marker Class Gene Alleles Major Minor MAF BP AncestralAllele
1 rs2233691 1 rs2233691 snp PLA2G2E C/T C T 0.3141 19923843 C,C,C,C,C,C
On dbSNP, C is listed as the minor allele, backed up by 1000 Genomes, but your package lists it as the major allele.
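To make the expectation concrete: the minor allele is simply the less frequent of the observed alleles, so the major/minor labels follow directly from the allele frequencies. Here is a minimal sketch in base R. The helper name `call_major_minor` is hypothetical (it is not part of rsnps), and the frequencies are illustrative only, assuming the reported MAF of 0.3141 belongs to C as dbSNP indicates.

```r
# Hypothetical helper (not part of rsnps): label alleles as major/minor
# from a named numeric vector of allele frequencies.
call_major_minor <- function(freqs) {
  stopifnot(is.numeric(freqs), !is.null(names(freqs)), length(freqs) >= 2)
  # Sort allele indices from most to least frequent.
  ord <- order(freqs, decreasing = TRUE)
  list(
    major = names(freqs)[ord[1]],   # most frequent allele
    minor = names(freqs)[ord[2]],   # second most frequent allele
    maf   = unname(freqs[ord[2]])   # minor allele frequency
  )
}

# Illustrative frequencies (assumption, see above).
freqs <- c(C = 0.3141, T = 0.6859)
res <- call_major_minor(freqs)
```

With these numbers, `res$minor` is C and `res$major` is T, which is the opposite of what the package output above reports.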
Thanks, @cherrywang1006.
Here's the raw data that comes from NCBI when we use ncbi_snp_query:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&mode=xml&id=rs1610720
And the same for ncbi_snp_query2:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&retmode=flt&rettype=flt&id=rs1610720
Genes/SNPs aren't really my area, so I'm not sure if I'm doing something wrong here. Any thoughts?
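For anyone who wants to inspect the raw responses for other SNPs, the two request URLs above differ only in their query parameters. A minimal sketch in base R that rebuilds them for a given rsID follows. The helper name `efetch_snp_url` is hypothetical and not part of rsnps; the parameters are copied from the two links above.

```r
# Hypothetical helper (not part of rsnps): build the efetch URL for one rsID,
# using the same query parameters as the two links quoted in this thread.
efetch_snp_url <- function(rsid, format = c("xml", "flt")) {
  format <- match.arg(format)
  base <- "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"
  params <- switch(format,
    xml = "db=snp&mode=xml",                 # XML response
    flt = "db=snp&retmode=flt&rettype=flt"   # flat-file response
  )
  paste0(base, "?", params, "&id=", rsid)
}

xml_url <- efetch_snp_url("rs1610720", "xml")
flt_url <- efetch_snp_url("rs1610720", "flt")
```

Opening both URLs side by side for the same rsID makes it easier to see which fields each response exposes.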
@cherrywang1006 any thoughts?
Hi @sckott
For https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&mode=xml&id=rs1610720, fetching data from the most recent build (Update build="151" date="2018-01-23 20:15", i.e. buildId="151") would be great. The alleles (
For the second link, https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=snp&retmode=flt&rettype=flt&id=rs1610720, fetching from a specific handle (for example, handle="TOPMED") and parsing chr_location_Allele1_Allele2 should be good.
Have you given https://api.ncbi.nlm.nih.gov/variation/v0/ a look? I think the JSON format is far more accessible and reliable than extracting from XML.
I haven't. I don't think I was aware of it. I will look at it.
Hi @mxw010, there is a new version of {rsnps} on CRAN that fixes this issue. I will now close this issue, but thank you for bringing it to our attention. Cheers, Julia
I was investigating a particular SNP on chr6, and this is what I get
What I got, on two Macs:
Note that the alleles from the two queries are different. If you look up the SNP on dbSNP, the alleles are A/G. So, where is C/A coming from? And that raises the question of whether other SNPs are affected in the same way.
Thanks,