spenagon closed this issue 7 years ago
Thanks for reporting. I added some tests to the project, which can be found here: https://github.com/rockt/SETH/blob/master/src/test/java/de/hu/berlin/wbi/issues/Issue17Test.java
I also added a unit test with rsIds preceded by a colon.
We have two issues, one with incorrectly extracted rsIds and one with no extracted rsIds:
docId: 17678724 "Two polymorphisms in MHC2TA gene (rs4,774G/C and rs3,087,456A/G) were studied in two groups" extracted: rs4 and rs3
docId: 22419714 "Patients carrying the TCF7L2_rs7903146_T allele had an increased risk of CRC (P(trend) = 0.02), whereas patients harboring the IL13_rs20541_T allele had a reduced risk (P(trend) = 0.02)"
We also saw some cases of rsIds preceded by ":".
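
For illustration, here is a minimal sketch of a pattern that would cover the three cases reported here. It is not SETH's actual implementation: the class name `RsIdSketch`, the method `extract`, and the regular expression are all hypothetical. It assumes that a comma followed by exactly three digits is a thousands separator (so "rs4,774" means rs4774), and that "rs" may follow any character that is not a letter or digit, such as "_", ":" or "(".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not part of SETH.
class RsIdSketch {
    // "rs" must not directly follow a letter or digit, so "_rs123", ":rs123"
    // and "(rs123" match, while "xrs123" does not.
    // The number is either grouped with thousands separators (e.g. "3,087,456")
    // or a plain run of digits (e.g. "7903146").
    private static final Pattern RS_ID = Pattern.compile(
            "(?<![A-Za-z0-9])rs(\\d{1,3}(?:,\\d{3})+|\\d+)");

    // Returns all rsIds found in the text, normalized without commas.
    static List<String> extract(String text) {
        List<String> ids = new ArrayList<>();
        Matcher m = RS_ID.matcher(text);
        while (m.find()) {
            // Assumption: commas inside the number are thousands separators.
            ids.add("rs" + m.group(1).replace(",", ""));
        }
        return ids;
    }
}
```

Calling `RsIdSketch.extract(...)` on the sentence from docId 17678724 would yield rs4774 and rs3087456 under these assumptions, and the sentence from docId 22419714 would yield rs7903146 and rs20541. The thousands-separator assumption is a trade-off: a text that writes "rs12,345" to mean rs12 followed by an unrelated number would be merged incorrectly.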