Currently, the SNP-gene association method uses strict co-occurrence: a SNP is associated with any gene mentioned in the same document. This works fine on abstracts, but can lead to wrong associations on full texts.
For instance, the substitution "p.R450H" in PMID 22662265 is normalized to the following two dbSNP IDs:
rs202162624 (Gene BLK / Entrez 640) (wrong)
rs144700339 (Gene ARAP1 / Entrez 116985) (correct)
The gene BLK is mentioned several sections before the SNP, whereas the ARAP1 mention is in close proximity.
Therefore, it would be beneficial to take the distance between the SNP mention and the gene mentions into account for SNP-gene association.
Full text of the example: http://bc3.informatik.hu-berlin.de/view/fulltext/22662265
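A minimal sketch of what proximity-based ranking could look like, in Python. Everything here is an assumption for illustration: the `GeneMention` class, the `rank_candidates` function, and the use of character offsets as the distance measure are hypothetical and not taken from the existing code base. The offsets in the usage example are made up and do not correspond to the actual article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneMention:
    """Hypothetical gene mention: Entrez ID plus character offsets in the text."""
    entrez_id: int
    start: int
    end: int


def char_distance(snp_start, snp_end, gene):
    """Number of characters between a SNP span and a gene mention (0 if they overlap)."""
    if gene.end <= snp_start:
        return snp_start - gene.end
    if gene.start >= snp_end:
        return gene.start - snp_end
    return 0


def rank_candidates(snp_span, candidates, gene_mentions):
    """Order candidate dbSNP IDs by the distance of their gene's nearest mention.

    snp_span      -- (start, end) offsets of the SNP mention
    candidates    -- dict mapping dbSNP ID to the Entrez ID of its gene
    gene_mentions -- all gene mentions found in the document
    Candidates whose gene is never mentioned are dropped, as with co-occurrence.
    """
    snp_start, snp_end = snp_span
    scored = []
    for rsid, entrez_id in candidates.items():
        distances = [
            char_distance(snp_start, snp_end, g)
            for g in gene_mentions
            if g.entrez_id == entrez_id
        ]
        if distances:
            scored.append((min(distances), rsid))
    scored.sort()
    return [rsid for _, rsid in scored]


# Illustrative offsets only: BLK far away from the SNP, ARAP1 right next to it.
mentions = [GeneMention(640, 1200, 1203), GeneMention(116985, 45210, 45215)]
candidates = {"rs202162624": 640, "rs144700339": 116985}
ranking = rank_candidates((45230, 45237), candidates, mentions)
print(ranking)  # ['rs144700339', 'rs202162624']
```

With such a ranking, the method could either keep only the top candidate or discard candidates beyond a distance threshold. Sentence or section distance might be a better measure than raw character offsets; that choice is left open here.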