USPTO / PatentPublicData

Utility tools to help download and parse patent data made available to the public

Patent applications name inconsistent #53

Open patricknee opened 7 years ago

patricknee commented 7 years ago

In recent application files (2011, for example) US patent applications are named with the format Country+Year+Number+Type (e.g., US20110002889A1).

In recent grant files (2010, for example), citations of US patent applications are (often? always?) named with the format Country+Year+/+Number+Type (e.g., US2011/0002889A1).

(I say "recent" only because I have not looked further back.)

At least within the US Patent Corpus (applications + citations), referential integrity should be maintained.
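One way to reconcile the two spellings is to normalize both to the slash-free form before comparing them. The following is a minimal sketch in Python, not code from PatentPublicData: the function name `normalize_application_id` and the regular expression are hypothetical, and the pattern assumes a two-letter country code, a four-digit year, a seven-digit serial number and a kind code such as A1, which is what the two examples above show.

```python
import re

# Assumed layout, inferred only from the two examples in this issue:
# country (2 letters) + year (4 digits) + optional "/" + serial (7 digits) + kind code.
_APP_ID = re.compile(r"([A-Z]{2})(\d{4})/?(\d{7})([A-Z]\d?)")


def normalize_application_id(raw: str) -> str:
    """Return the slash-free form of a publication id, e.g. US20110002889A1.

    Raises ValueError when the input does not fit the assumed layout,
    so unexpected formats are reported instead of silently altered.
    """
    compact = re.sub(r"\s+", "", raw).upper()
    match = _APP_ID.fullmatch(compact)
    if match is None:
        raise ValueError(f"unrecognized application id: {raw!r}")
    country, year, serial, kind = match.groups()
    return f"{country}{year}{serial}{kind}"


if __name__ == "__main__":
    # Both spellings from the issue resolve to the same key.
    print(normalize_application_id("US2011/0002889A1"))
    print(normalize_application_id("US20110002889A1"))
```

Keeping the original string alongside the normalized key would let the exact text as filed stay available while joins between applications and citations use the normalized value.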

Not sure whether this is a bug or an enhancement.

bgfeldm commented 5 years ago

Improvements have been made over time, and I agree with your comment about referential integrity. For patent law, though, data integrity is of high importance (lawyers want to see the exact original), so the public XML preserves the data exactly as filed by the applicant or law firm. I do my best to correct what I can.