ip-tools / uspto-opendata-python

A client library for accessing the USPTO Open Data APIs, written in Python.
https://docs.ip-tools.org/uspto-opendata-python/
MIT License

Unable to access the USPTO PBD system #5

Closed mohamedasni closed 5 years ago

mohamedasni commented 5 years ago

[Screenshot from 2018-10-02 17-07-51]

amotl commented 5 years ago

Dear Mohamed,

thanks for writing in and reporting this problem. We just released uspto-opendata-python 0.8.0, which improves the error handling at the place you referenced. The software will now raise a more appropriate exception when it runs into this error condition.

Unfortunately, when running e.g.

$ uspto-pbd get "2017/0293197" --type=publication --format=xml

the USPTO API currently responds with

503 Service Unavailable: Back-end server is at capacity

Bummer.

We hope this upstream issue will get fixed, and we will be happy to hear about it. In the meantime, you might want to report the issue to the USPTO to let them know that people are actually using the service.

With kind regards, Andreas.


You can also observe the behavior by just requesting the base URL:

$ http https://pairbulkdata.uspto.gov/

HTTP/1.1 503 Service Unavailable: Back-end server is at capacity
Connection: keep-alive
Content-Length: 0
Date: Wed, 03 Oct 2018 20:16:52 GMT
Via: 1.1 9f37c8b999ae2d6018396fda48773445.cloudfront.net (CloudFront)
X-Amz-Cf-Id: jo9nFTewqkX-fc5X8cPnXJKsyy6DdrFfumGapHBmYGIdIdc65eYrJg==
X-Cache: Error from cloudfront

amotl commented 5 years ago

As the API still responds with the same HTTP status, I assume something is broken there. Can someone open a ticket at the USPTO?
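For anyone who wants to repeat this check from Python, here is a minimal sketch using only the standard library. It is not part of uspto-opendata-python; the function name `probe` and its `opener` parameter are made up for illustration. It distinguishes a server that answers with an error status (such as the 503 above) from one that cannot be reached at all.

```python
import urllib.error
import urllib.request


def probe(url, opener=urllib.request.urlopen, timeout=10):
    """Return a short description of how `url` responds.

    `opener` is a hypothetical hook so the function can be tried
    without network access; by default the real urlopen is used.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return "ok: HTTP %d" % response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 503.
        # HTTPError must be caught first because it subclasses URLError.
        return "http error: %d %s" % (exc.code, exc.reason)
    except urllib.error.URLError as exc:
        # No HTTP response at all (connection refused, DNS failure, ...).
        return "unreachable: %s" % exc.reason
```

Calling `probe("https://pairbulkdata.uspto.gov/")` should then report either the HTTP error status or that the host is unreachable, depending on how the upstream service behaves at that moment.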

amotl commented 5 years ago

Can someone open a ticket at the USPTO?

I just submitted an issue using their feedback form on https://developer.uspto.gov/. Let's see whether there will be any answer.

andyhegedus commented 5 years ago

I am having similar issues. I have version 0.8.0 installed and am trying the command-line operation. Any tips on how to get this working?

uspto-pbd save "7654321" --type=patent --format=xml

2018-10-23 16:26:06,143 [uspto.util.client ] INFO : Querying for expression=patentNumber:(7654321), filter=[], sort=applId asc

Traceback (most recent call last):
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 141, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/urllib3/util/connection.py", line 83, in create_connection
    raise err
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 346, in _make_request
    self._validate_conn(conn)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 850, in _validate_conn
    conn.connect()
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 284, in connect
    conn = self._new_conn()
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/urllib3/connection.py", line 150, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x10e4bfd68>: Failed to establish a new connection: [Errno 61] Connection refused

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/urllib3/util/retry.py", line 388, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='pairbulkdata.uspto.gov', port=443): Max retries exceeded with url: /api/queries (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x10e4bfd68>: Failed to establish a new connection: [Errno 61] Connection refused',))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/andreashegedus/anaconda3/bin/uspto-pbd", line 11, in <module>
    sys.exit(run())
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/uspto/pbd/command.py", line 78, in run
    run_command(client, options)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/uspto/util/command.py", line 127, in run_command
    result = acquire_single_document(client, options)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/uspto/util/command.py", line 156, in acquire_single_document
    result = client.download_document(query)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/uspto/util/client.py", line 250, in download_document
    response = self.query_patent(query['number'])
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/uspto/util/client.py", line 277, in query_patent
    return self.query(expression)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/uspto/util/client.py", line 59, in query
    response = self.session.post(self.QUERY_URL, json=solr_query)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/requests/sessions.py", line 555, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/Users/andreashegedus/anaconda3/lib/python3.6/site-packages/requests/adapters.py", line 508, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='pairbulkdata.uspto.gov', port=443): Max retries exceeded with url: /api/queries (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x10e4bfd68>: Failed to establish a new connection: [Errno 61] Connection refused',))

amotl commented 5 years ago

Dear Andy,

thanks for writing in. The exception you are getting makes sense as https://pairbulkdata.uspto.gov currently does not respond to any requests. According to Russ Allen of http://pairbulk.historicip.com/, the service has been decommissioned:

Note that the uspto shutdown https://pairbulkdata.uspto.gov/ use https://ped.uspto.gov/peds

Bummer! On top of that, I didn't receive any answer after reporting the issue to the USPTO. Maybe someone coming here knows something about the background of this and would be so kind as to share any insights?

With kind regards, Andreas.

amotl commented 5 years ago

Dear Mohamed (@mohamedasni) and Andy (@andyhegedus),

based on the recent observations, I just disabled the PBD subsystem and released version 0.8.2 of this package, which yields an appropriate error message when trying to use it and also reflects the decommissioning of the USPTO PBD system in its documentation.

Following the advice of Russ Allen, I checked whether the PEDS system still works. It does, so you might want to use the uspto-peds utility instead to acquire the respective data from the USPTO, e.g.

uspto-peds get "7654321" --type=patent --format=xml

Russ is also investigating the API; you might enjoy reading http://peds.historicip.com/. Please let me know if you need further assistance. I will be happy to hear back from you.
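If you would rather call the utility from a Python script, a minimal sketch could wrap the command line shown above with `subprocess`. The helper names `build_peds_command` and `fetch_from_peds` are hypothetical and not part of this package, and the sketch assumes that `uspto-peds get` writes the document to standard output.

```python
import subprocess


def build_peds_command(number, doctype="patent", fmt="xml"):
    """Assemble the argument list for the `uspto-peds get` call shown above."""
    return ["uspto-peds", "get", number, "--type=%s" % doctype, "--format=%s" % fmt]


def fetch_from_peds(number, doctype="patent", fmt="xml"):
    """Run `uspto-peds` and return whatever it prints to standard output.

    Assumption: the document is written to stdout. Raises
    subprocess.CalledProcessError if the command exits with a non-zero status.
    """
    command = build_peds_command(number, doctype, fmt)
    completed = subprocess.run(
        command,
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,  # decode stdout as text (works on Python 3.6)
    )
    return completed.stdout
```

Passing the arguments as a list avoids shell quoting problems with document numbers such as "2017/0293197".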

With kind regards, Andreas.

andyhegedus commented 5 years ago

Hi Andreas,

That is a real bummer, especially since the PEDS database does not have the patent text I am interested in.

Curious: There is a Java-based GitHub project, https://github.com/USPTO/PatentPublicData/tree/master/BulkDownloader, that accesses a different bulk data site, https://bulkdata.uspto.gov/.

Could your Python-based tool be modified to work with this site? I don't know Java and am hardly even proficient in Python, so I am looking for a tool that allows me to get text from specific patent IDs.

Regards,

Andy Hegedus Founder AGH Analytics, LLC

amotl commented 5 years ago

Dear Andy,

I believe the BDSS API is suitable for downloading gazettes only and not for acquiring information about individual patent applications or issued patents.

However, I am definitely looking forward to adding further capabilities to the uspto-opendata-python package, for example to unlock access to these services:


[...] the PEDS database does not have the patent text I am interested in

Unfortunately no. May I point you to the Open Patent Services (OPS) offered by the European Patent Office? It offers an extensive API for accessing patent information from many offices. However, I don't know whether it also contains full-text information from documents issued by the USPTO.

Cheers, Andreas.

amotl commented 5 years ago

The USPTO PBD system has been decommissioned; we are tracking support for further data sources at #8.

Thanks again for your comments and your interest, @mohamedasni and @andyhegedus. I'm closing this now.