max-planck-innovation-competition / go-uspto

Retrieve the latest patents as xml from the USPTO
Apache License 2.0

GO USPTO


Go API client for the United States Patent and Trademark Office (USPTO) API.

Status

Alpha Version

⚠️ Experimental - Not ready for production.

Standards

Grants

The parsers are currently tested against examples of the following versions of the USPTO XML format:

See DTD standards

Applications

See DTD standards

Installation

Add the package to your project via the following command:

go get github.com/max-planck-innovation-competition/go-uspto

Usage

Get the download links for all bulk XML patent grant archives that the USPTO published between two dates:

import (
	"log"
	"time"

	"github.com/max-planck-innovation-competition/go-uspto/pkg/uspto"
)

...

// get all download links for bulk XML patent archives between the following dates
loc, err := time.LoadLocation("Europe/Berlin")
if err != nil {
	log.Fatal(err) // the time zone database may be missing on the host
}
start := time.Date(2021, 9, 1, 0, 0, 0, 0, loc)
end := time.Date(2021, 10, 1, 0, 0, 0, 0, loc)
res, err := uspto.GetPatentXmlBulkFileList(start, end) // check err before using res

Download a bulk zip file from the USPTO:

exportFilePath, err := uspto.DownloadBulkFile("https://bulkdata.uspto.gov/data/patent/grant/redbook/fulltext/2021/ipg210907.zip", "./test-data")

Process the zip file:

err := uspto.ProcessBulkFile("./test-data/pg020101.zip", "./test-data/pg020101/xml")

Process the XML file:

patDoc, err := uspto.ProcessXMLFileSimple("./test-data/2-5-b1-patent.xml")

PatentID

USPTO IDs are generated in the following way:

Patents

Applications

A publication number consists of a four-digit year, followed by a seven-digit sequence code, followed by a two-character Kind Code that is assigned by the USPTO. The system displays the publication number with or without the "US" prefix and the Kind Code suffix (e.g., US YYYY-9999999 A9 or 9999-9999999). Source: https://www.uspto.gov/patents/apply/applying-online/publication-number
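The format described above can be checked with a small parser. This is a sketch under assumptions, not the way go-uspto builds its IDs: the `PublicationNumber` type, the `parsePublicationNumber` function, and the regular expression are all hypothetical, and the Kind Code is assumed to be one uppercase letter followed by one digit (as in A1 or A9).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// PublicationNumber is a hypothetical type for illustration only.
type PublicationNumber struct {
	Year     string // four-digit year
	Sequence string // seven-digit sequence code
	KindCode string // two-character Kind Code, empty if not given
}

// Optional "US" prefix, four-digit year, optional hyphen, seven-digit
// sequence code, optional Kind Code (assumed: one letter plus one digit).
var pubNumRe = regexp.MustCompile(`^(?:US\s*)?(\d{4})-?(\d{7})(?:\s*([A-Z]\d))?$`)

// parsePublicationNumber splits a publication number into its parts.
func parsePublicationNumber(s string) (PublicationNumber, error) {
	m := pubNumRe.FindStringSubmatch(strings.TrimSpace(s))
	if m == nil {
		return PublicationNumber{}, fmt.Errorf("not a publication number: %q", s)
	}
	return PublicationNumber{Year: m[1], Sequence: m[2], KindCode: m[3]}, nil
}

func main() {
	for _, in := range []string{"US 2021-0123456 A1", "2021-0123456"} {
		pn, err := parsePublicationNumber(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("year=%s sequence=%s kind=%q\n", pn.Year, pn.Sequence, pn.KindCode)
	}
}
```

Keeping the year and sequence code as strings preserves leading zeros in the sequence code, which would be lost if they were parsed as integers.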

Trademarks