USPTO / PatentPublicData

Utility tools to help download and parse patent data made available to the public

WIPO patent headers = "WOWO" #52

Open patricknee opened 7 years ago

patricknee commented 7 years ago

Question:

After processing by TransformerCli, in the citations section WIPO patents seem to come through with a "text"="WOWO xxxxxxx". According to USPTO documentation, the correct header appears to be "WO" (see https://www.uspto.gov/patents-application-process/applying-online/country-codes-wipo-st3-table).

Is this "WOWO" header correct, or an error? Also, given that most other country codes are adjacent to the numbers, is the space correct?

Example: US Patent: US9234358B2

"citations": [....
            {
              "num": "00050",
              "text": "WOWO 94/03095A1",
              "citedBy": "PATCIT",
              "examinerCited": false,
              "type": "PATENT"
            },
....
bgfeldm commented 7 years ago

So far my prototypes have not done much with foreign citations, but this does seem to occur quite often in the data. Right now, foreign patent numbers are kept as they appear in the data, with the country code prepended. The problem occurs when the patent number already includes the country code, so the code is repeated when it is added. Different countries use different numbering schemes, and I just haven't gotten around to researching the best way to represent them. In this case, Google Patents translates the number to "WO1994003095A1".
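For anyone post-processing the JSON in the meantime, the duplication described above can be avoided with a simple guard. This is only a minimal sketch in Python, not the project's code: `add_country_code` is a hypothetical helper, and it assumes the raw document number is available separately from the country code.

```python
def add_country_code(country: str, doc_number: str) -> str:
    """Prepend the country code unless the raw number already starts with it.

    Hypothetical helper for illustration; not part of PatentPublicData.
    """
    country = country.strip().upper()
    number = doc_number.strip()
    # "WO 94/03095A1" already carries "WO", so adding it again would give "WOWO ...".
    if number.upper().startswith(country):
        return number
    return country + number


print(add_country_code("WO", "WO 94/03095A1"))
print(add_country_code("EP", "1277865"))
```

The guard is deliberately naive: it only checks the leading characters, so it leaves the internal spacing and the two-digit year untouched.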

patricknee commented 7 years ago

Thanks, that's helpful. I'll decide how I want to deal with the issue for now.

bgfeldm commented 7 years ago

WIPO Standard ST.6 Publication Number Formatting; published December 2002 http://www.wipo.int/export/sites/www/standards/en/pdf/03-06-01.pdf

WIPO Standard ST.13, Numbering of applications for IPRs; published February 2008 http://www.wipo.int/export/sites/www/standards/en/pdf/03-13-01.pdf

http://www.wipo.int/export/sites/www/standards/en/pdf/03-14-01.pdf

bgfeldm commented 7 years ago

Patent Document Id Variations (after quick review, likely more)

8135591 => 8135591
D1406629 => D1406629
2001079061 => 2001079061

1 277 865 => 1277865 (EP1277865A1)
42 27 957 => 4227957

WO2011049527 => 2011049527
WO9619177 => 9619177
WO03065967 => 03065967
WO 03/092464 => 2003092464
WO 2006/017297 => 2006017297
WO-97-00959 => 199700959
WO 2007/149459 => 2007149459
DM/067366 => 067366

1/75580 => 200175580 (WO 1/75580)
03/092464 => 2003092464
2006/017297 => 2006017297
2006-239573 => 2006239573
2000-0063196 => 20000063196 (KR 2000-0063196)
201130040459.1 => 201130040459.1 (CN 201130040459.1)

WO 000033871.009
1791831-0001 (EM 1791831-0001)
1264303-001 (EC 1264303-001)
JP62-61442A
DES 3020280 (BAD DATA)
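The WO variations above could be collapsed into one form along these lines. This is a rough Python sketch under stated assumptions, not the project's implementation: `normalize_wo` is a hypothetical name, two-digit years of 78 and above are read as 19xx (assuming PCT publications begin in 1978) and the rest as 20xx, and the serial is zero-padded to six digits to match the Google Patents form "WO1994003095A1" mentioned earlier. The mappings listed above do not pad, so some outputs differ from them.

```python
import re
from typing import Optional

# Separated forms: "WO 94/03095A1", "WO 2006/017297", "WO-97-00959", "03/092464".
_SEPARATED = re.compile(
    r"^(?:WO)?[\s\-/]*(\d{1,2}|\d{4})[\s\-/]+(\d{1,6})\s*([A-Z]\d?)?$"
)
# Compact forms: "WO2011049527", "WO9619177", "WO03065967".
_COMPACT = re.compile(r"^(?:WO)?\s*(\d{7,10})\s*([A-Z]\d?)?$")


def _expand_year(year: str) -> int:
    """Expand a 1- or 2-digit year; the pivot at 78 is an assumption."""
    if len(year) == 4:
        return int(year)
    yy = int(year)
    return 1900 + yy if yy >= 78 else 2000 + yy


def normalize_wo(raw: str) -> Optional[str]:
    """Return 'WO' + 4-digit year + 6-digit serial + kind code, or None."""
    text = raw.strip().upper()
    match = _SEPARATED.match(text)
    if match:
        year, serial, kind = match.groups()
    else:
        match = _COMPACT.match(text)
        if not match:
            return None  # unrecognized, e.g. "WO 000033871.009"
        digits, kind = match.groups()
        # Assumption: 9 or 10 digits start with a 4-digit year, 7 or 8 with a 2-digit year.
        split = 4 if len(digits) >= 9 else 2
        year, serial = digits[:split], digits[split:]
    return f"WO{_expand_year(year)}{int(serial):06d}{kind or ''}"


for sample in ["WO 94/03095A1", "WO-97-00959", "WO9619177", "WO 000033871.009"]:
    print(sample, "=>", normalize_wo(sample))
```

Returning `None` for unrecognized input keeps bad data visible instead of guessing. Other offices (EP, KR, CN, JP) use different schemes and would need their own rules.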