knmnyn / ParsCit

An open-source CRF Reference String Parsing Package
http://wing.comp.nus.edu.sg/parsCit
GNU Lesser General Public License v3.0

ParsCit errors on citation markers like [3-5] #3

Closed: adibaba closed this issue 13 years ago

adibaba commented 13 years ago

Hello again,

I got two errors while using ParsCit:

/opt/parscitgit/bin/citeExtract.pl -m extract_all 2007_fulltext-4.raw-nopgbrk-ascii.txt
scalar(Fig. 2. Dimensions of error modes [3, 4, 5]) != scalar(2242 2243 2244 2245 2246 2247 2248 2248)
Fig. 2. Dimensions of error modes [3, 4, 5]
/opt/parscitgit/bin/citeExtract.pl -m extract_all prodeedings_12928.raw-nopgbrk-ascii.txt
scalar(be explored. The content management structure including pedagogical approaches, has been described in [8, 9]) != scalar(886 887 888 889 890 891 892 893 894 895 896 897 898 899)
be explored. The content management structure including pedagogical approaches, has been described in [8, 9]

The corresponding passages in the plain-text input are:

Fig. 2. Dimensions of error modes [3-5]
has been described in [8-9] and includes among other things

It seems that citation markers covering a range of references are rewritten from [x-z] to [x, y, z], and the errors occur after that replacement.

If you would like the full plain texts or the PDF files, send me a short message at info[REMOVE]@adrianwilke.de

Best regards, Adrian

knmnyn commented 13 years ago

Hi Adrian:

I don't think we have a fix for this problem yet. ParsCit doesn't currently interpolate the ranges into individual numbers to match against the identified reference strings. This isn't a development priority at the moment, so we're unlikely to fix it soon. We'll keep an eye on it and see whether we can schedule a fix in the future. Thanks!
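
For illustration, interpolating a range into individual numbers could look roughly like the sketch below. It is written in Python and is not ParsCit code (ParsCit itself is written in Perl). The names `expand_marker` and `expand_markers` and the regular expression are assumptions made for this example, and only simple numeric markers such as [3-5] or [8-9, 12] are handled.

```python
import re

# Hypothetical pattern: one item is a number or a "start-end" range,
# and a marker is a bracketed, comma-separated list of such items.
ITEM = r"\d+(?:\s*-\s*\d+)?"
MARKER = re.compile(rf"\[({ITEM}(?:\s*,\s*{ITEM})*)\]")


def expand_marker(body: str) -> list[int]:
    """Turn the inside of a marker, e.g. "3-5, 8", into [3, 4, 5, 8]."""
    numbers: list[int] = []
    for part in body.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
            # Inclusive range; a reversed range such as "5-3" yields nothing.
            numbers.extend(range(start, end + 1))
        else:
            numbers.append(int(part))
    return numbers


def expand_markers(text: str) -> str:
    """Rewrite every bracketed marker in text with its ranges expanded."""
    def replace(match: re.Match) -> str:
        expanded = expand_marker(match.group(1))
        return "[" + ", ".join(str(n) for n in expanded) + "]"

    return MARKER.sub(replace, text)


print(expand_markers("Fig. 2. Dimensions of error modes [3-5]"))
print(expand_markers("has been described in [8-9] and includes among other things"))
```

Note that expansion changes the number of whitespace-separated tokens in the sentence ([3-5] is one token, [3, 4, 5] is three), so any token positions computed on the original text would need to be remapped after the rewrite. Whether that is what triggers the scalar(...) != scalar(...) mismatch reported above is not confirmed in this thread.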