laironald closed this issue 11 years ago
So it's an awkward parse, but the error itself is because it's a duplicate entry?
I'll look into fixing the parse. Is the error an issue here?
This seems like a prevalent issue:
sqlite> select * from class where SubClass like '%MER%' limit 10;
6863030|0|200|2,MERHOWINDUSTRIES-ADD.
6863031|0|200|2,MERHOWINDUSTRIES-ADD.
6863032|0|200|2,MERHOWINDUSTRIES-ADD.
6863033|0|200|2,MERHOWINDUSTRIES-ADD.
6863034|0|200|2,MERHOWINDUSTRIES-ADD.
6863035|0|200|2,MERHOWINDUSTRIES-ADD.
6863036|0|200|2,MERHOWINDUSTRIES-ADD.
6863037|0|200|2,MERHOWINDUSTRIES-ADD.
6863038|0|200|2,MERHOWINDUSTRIES-ADD.
6863039|0|200|2,MERHOWINDUSTRIES-ADD.
sqlite> select count(*) from class where SubClass like '%MER%';
5213
Based on the strings, it looks like some of the <othercit> tag contents are being pulled into classes. Still looking into it.
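One way to gauge how widespread the leak is would be to group the suspicious rows by value rather than matching a single string. The sketch below is only illustrative: it uses an in-memory database with a few made-up rows, the column names other than SubClass are hypothetical (chosen to match the four fields in the output above), and it assumes that a comma inside SubClass signals leaked text.

```python
import sqlite3

# In-memory stand-in for the real database. Column names other than
# SubClass are hypothetical; the rows are illustrative, not real data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE class (Patent TEXT, Prim INTEGER, Class TEXT, SubClass TEXT)"
)
conn.executemany(
    "INSERT INTO class VALUES (?, ?, ?, ?)",
    [
        ("6863030", 0, "200", "2,MERHOWINDUSTRIES-ADD."),
        ("6863031", 0, "200", "2,MERHOWINDUSTRIES-ADD."),
        ("6863032", 1, "200", "2"),
    ],
)


def suspicious_subclasses(conn):
    """Return (SubClass, row count) pairs for values containing a comma.

    Assumption: a clean subclass never contains a comma, so any comma
    suggests text from another tag was appended during parsing.
    """
    return conn.execute(
        "SELECT SubClass, COUNT(*) FROM class "
        "WHERE SubClass LIKE '%,%' "
        "GROUP BY SubClass ORDER BY COUNT(*) DESC"
    ).fetchall()


print(suspicious_subclasses(conn))
```

Grouping by value shows whether the 5213 rows come from one bad string or from many different ones, which would hint at whether a single document or the parser as a whole is at fault.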
cool. yeah that other file is suspiciously wrong i would say. :D
On Thu, Jul 11, 2013 at 3:54 PM, Gabe Fierro wrote:
Based on the strings, it looks like some of the <othercit> tag contents are being pulled into classes. Still looking into it.
When the parser asks for the
The files above appear to have parsing issues. Here is an error that shows up:
This is an awkward parse for several reasons. For example, 200/2,MERHOWINDUSTRIES-ADD. doesn't look like a valid subclass_id key.
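A guard along these lines could flag such values before they are inserted. This is a minimal sketch, not the project's actual parser code: the function name is hypothetical, and the pattern is an assumed shape for a clean subclass (digits, an optional decimal part, an optional short letter suffix), not the official classification format.

```python
import re

# Assumed shape of a clean subclass value; this is a guess used for
# illustration, not the official classification format.
SUBCLASS_RE = re.compile(r"\d+(\.\d+)?[A-Z]{0,2}")


def looks_like_subclass(value):
    """Return True if value matches the assumed clean-subclass pattern."""
    return SUBCLASS_RE.fullmatch(value.strip()) is not None


for candidate in ["2", "2.5", "2,MERHOWINDUSTRIES-ADD."]:
    print(candidate, looks_like_subclass(candidate))
```

Values that fail the check could be logged with their source file instead of being written to the class table, which would make it easier to trace which tag the extra text comes from.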