Open yoheikikuta opened 6 years ago
Bulk Data Storage System: https://bulkdata.uspto.gov/#pats
From this system, we use the following datasets for our analysis.
The patent application dataset consists of zip-compressed XML files, one for each week's publications. Each XML file begins as follows:
```
$ head -n 20 ipa170105.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE us-patent-application SYSTEM "us-patent-application-v44-2014-04-03.dtd" [ ]>
<us-patent-application lang="EN" dtd-version="v4.4 2014-04-03" file="US20170000001A1-20170105.XML" status="PRODUCTION" id="us-patent-application" country="US" date-produced="20161220" date-publ="20170105">
<us-bibliographic-data-application lang="EN" country="US">
<publication-reference>
<document-id>
<country>US</country>
<doc-number>20170000001</doc-number>
<kind>A1</kind>
<date>20170105</date>
</document-id>
</publication-reference>
<application-reference appl-type="utility">
<document-id>
<country>US</country>
<doc-number>14789882</doc-number>
<date>20150701</date>
</document-id>
</application-reference>
<us-application-series-code>14</us-application-series-code>
```
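The bibliographic fields visible in the sample above can be read with Python's standard `xml.etree.ElementTree`. This is only a minimal sketch: the sample is truncated by `head`, so the closing tags at the end of `SAMPLE` were added by hand to make it well-formed, and the function name `read_biblio` and the dictionary keys are illustrative choices, not part of the dataset.

```python
import xml.etree.ElementTree as ET

# The 20 sample lines, plus two hand-added closing tags (an assumption made
# only so that this snippet parses; a real document contains much more).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE us-patent-application SYSTEM "us-patent-application-v44-2014-04-03.dtd" [ ]>
<us-patent-application lang="EN" dtd-version="v4.4 2014-04-03" file="US20170000001A1-20170105.XML" status="PRODUCTION" id="us-patent-application" country="US" date-produced="20161220" date-publ="20170105">
<us-bibliographic-data-application lang="EN" country="US">
<publication-reference>
<document-id>
<country>US</country>
<doc-number>20170000001</doc-number>
<kind>A1</kind>
<date>20170105</date>
</document-id>
</publication-reference>
<application-reference appl-type="utility">
<document-id>
<country>US</country>
<doc-number>14789882</doc-number>
<date>20150701</date>
</document-id>
</application-reference>
<us-application-series-code>14</us-application-series-code>
</us-bibliographic-data-application>
</us-patent-application>
"""


def read_biblio(xml_bytes):
    """Return the bibliographic fields that appear in the sample header."""
    root = ET.fromstring(xml_bytes)
    pub = root.find(".//publication-reference/document-id")
    app_ref = root.find(".//application-reference")
    app = app_ref.find("document-id")
    return {
        "publication_number": pub.findtext("doc-number"),
        "kind": pub.findtext("kind"),
        "publication_date": pub.findtext("date"),
        "application_number": app.findtext("doc-number"),
        "application_date": app.findtext("date"),
        "application_type": app_ref.get("appl-type"),
        "series_code": root.findtext(".//us-application-series-code"),
    }


biblio = read_biblio(SAMPLE)
print(biblio)
```

The parser does not fetch or validate against the DTD named in the `DOCTYPE` line, so the snippet runs without the `.dtd` file being present.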
Important tags:
The patent grant dataset consists of zip-compressed XML files, one for each week's granted patents. Each XML file begins as follows:
```
$ head -n 20 ipg120103.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE us-patent-grant SYSTEM "us-patent-grant-v42-2006-08-23.dtd" [ ]>
<us-patent-grant lang="EN" dtd-version="v4.2 2006-08-23" file="USD0651376-20120103.XML" status="PRODUCTION" id="us-patent-grant" country="US" date-produced="20111219" date-publ="20120103">
<us-bibliographic-data-grant>
<publication-reference>
<document-id>
<country>US</country>
<doc-number>D0651376</doc-number>
<kind>S1</kind>
<date>20120103</date>
</document-id>
</publication-reference>
<application-reference appl-type="design">
<document-id>
<country>US</country>
<doc-number>29390372</doc-number>
<date>20110423</date>
</document-id>
</application-reference>
<us-application-series-code>29</us-application-series-code>
```
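A weekly file may not be parseable as a single XML document. The sketch below assumes (this is not stated above) that each weekly file concatenates many documents, each starting with its own `<?xml` declaration, and splits the stream on that line before parsing. The helper name `iter_documents` and the second document number `D0651377` are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_documents(lines):
    """Yield one XML document at a time from an iterable of text lines.

    Assumption: every document starts with a line beginning with '<?xml'.
    """
    buffer = []
    for line in lines:
        # A new XML declaration marks the start of the next document.
        if line.startswith("<?xml") and buffer:
            yield "".join(buffer)
            buffer = []
        buffer.append(line)
    if buffer:
        yield "".join(buffer)


# Two heavily shortened grant documents standing in for a weekly file.
# The second doc-number is hypothetical.
WEEKLY = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<us-patent-grant><us-bibliographic-data-grant><publication-reference>\n"
    "<document-id><doc-number>D0651376</doc-number></document-id>\n"
    "</publication-reference></us-bibliographic-data-grant></us-patent-grant>\n"
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<us-patent-grant><us-bibliographic-data-grant><publication-reference>\n"
    "<document-id><doc-number>D0651377</doc-number></document-id>\n"
    "</publication-reference></us-bibliographic-data-grant></us-patent-grant>\n"
)

doc_numbers = []
for doc in iter_documents(io.StringIO(WEEKLY)):
    # Encode to bytes so the parser honours the declared encoding.
    root = ET.fromstring(doc.encode("utf-8"))
    doc_numbers.append(
        root.findtext(".//publication-reference/document-id/doc-number")
    )

print(doc_numbers)
```

For a real file, the same generator could be fed an open text file handle instead of `io.StringIO`, so that only one document is held in memory at a time.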
URL of the office actions dataset: https://bulkdata.uspto.gov/data/patent/office/actions/bigdata/2017/
See this pdf for details.
How to use these datasets:
TARGET DATA
Patent Application Full Text Data (No Images) (MAR 15, 2001 - PRESENT)
Patent Grant Full Text Data (No Images) (JAN 1976 - PRESENT)
Patent Application Office Actions Research Dataset (Stata (.dta) and MS Excel (.csv)) (2008 - JUN 2017)