USPTO / PatentPublicData

Utility tools to help download and parse patent data made available to the public

Fix extraction of continuation relations #101

Closed: bschelberg closed this 4 years ago

bschelberg commented 4 years ago

Continuation relations were being missed because of what looks like a copy-and-paste error.

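Also fixed DocumentId.compareTo() so that distinct DocumentIds with the same date can all be added to a TreeSet (a TreeSet treats elements whose compareTo() returns 0 as duplicates and keeps only one of them).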
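To illustrate the kind of compareTo() fix described above, here is a minimal sketch. DocIdSketch is a hypothetical stand-in, not the project's DocumentId: its fields (country, number, date), their types, and the tie-breaking order are assumptions made for the example.

```java
import java.time.LocalDate;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for DocumentId; the fields and their types are assumptions.
final class DocIdSketch implements Comparable<DocIdSketch> {
    final String country;
    final String number;
    final LocalDate date; // assumed non-null in this sketch

    DocIdSketch(String country, String number, LocalDate date) {
        this.country = Objects.requireNonNull(country);
        this.number = Objects.requireNonNull(number);
        this.date = Objects.requireNonNull(date);
    }

    @Override
    public int compareTo(DocIdSketch other) {
        // Order by date first, then break ties on country and number,
        // so that only identical IDs compare as 0.
        int byDate = date.compareTo(other.date);
        if (byDate != 0) return byDate;
        int byCountry = country.compareTo(other.country);
        if (byCountry != 0) return byCountry;
        return number.compareTo(other.number);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof DocIdSketch)) return false;
        DocIdSketch d = (DocIdSketch) o;
        return country.equals(d.country) && number.equals(d.number) && date.equals(d.date);
    }

    @Override
    public int hashCode() {
        return Objects.hash(country, number, date);
    }

    // Helper for the usage example: collects the given IDs into a TreeSet.
    static Set<DocIdSketch> sorted(DocIdSketch... ids) {
        Set<DocIdSketch> set = new TreeSet<>();
        for (DocIdSketch id : ids) {
            set.add(id);
        }
        return set;
    }
}
```

With this ordering, DocIdSketch.sorted(a, b) keeps both a and b when they share a date but differ in number. If compareTo() looked at the date alone, the second ID would be silently dropped.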

Just as an aside, there doesn't seem to be any way to distinguish between parent and child relations once they've been extracted into the List. I don't know how important that is to others. I've modified the code we're using to use a Relation class which has parent and child document ID fields, so we know which is which.
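The Relation class mentioned above is not shown in this thread, so the following is only a sketch of the idea under stated assumptions. RelationSketch, its Kind enum, and the use of plain String IDs are all hypothetical; the real code would presumably hold DocumentId objects.

```java
import java.util.Objects;

// Hypothetical sketch of a relation that records which side is the parent
// and which is the child. All names here are assumptions, not the author's code.
final class RelationSketch {
    // Illustrative relation kinds; the project's actual set may differ.
    enum Kind { CONTINUATION, CONTINUATION_IN_PART, CONTINUATION_REISSUE }

    final Kind kind;
    final String parentId; // the earlier document (String is a stand-in for DocumentId)
    final String childId;  // the later document derived from the parent

    RelationSketch(Kind kind, String parentId, String childId) {
        this.kind = Objects.requireNonNull(kind);
        this.parentId = Objects.requireNonNull(parentId);
        this.childId = Objects.requireNonNull(childId);
    }

    // True if the given ID is the parent side of this relation.
    boolean isParent(String id) {
        return parentId.equals(id);
    }

    @Override
    public String toString() {
        return kind + ": " + parentId + " -> " + childId;
    }
}
```

Keeping both IDs in one object means a caller holding a List of relations can ask each one which document is the parent, instead of inferring it from list position.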

I don't know whether you are targeting a particular Java version. I've used some Java 8 features, but only in the test.

bschelberg commented 4 years ago

Also, I don't think the CONTINUATION_IN_PART and CONTINUATION_REISSUE XPath expressions should include the /PARENT-US step.