EventRegistry / event-registry-python

Python package for API access to news articles and events in the Event Registry
http://eventregistry.org/
MIT License
232 stars 54 forks

Timeout Error for some concepts #49

Closed farshidbalan closed 4 years ago

farshidbalan commented 4 years ago

I am querying for S&P 500 companies. My query is:

    start_date = datetime.date(2020, 2, 10)
    end_date = datetime.date(2020, 2, 26)
    query = QueryArticlesIter(
        conceptUri='https://en.wikipedia.org/wiki/AbbVie_Inc.',
        categoryUri='dmoz/Business',
        lang=['eng'],
        dateStart=start_date,
        dateEnd=end_date,
        dataType=['news', 'pr'])
    for q in query.execQuery(er, sortBy='date', sortByAsc=True, maxItems=10):
        print(q)

Sometimes this query runs forever, and other times I get the following error (in German):

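    Event Registry exception while executing the request: ('Connection aborted.', TimeoutError(10060, 'Ein Verbindungsversuch ist fehlgeschlagen, da die Gegenstelle nach einer bestimmten Zeitspanne nicht richtig reagiert hat, oder die hergestellte Verbindung war fehlerhaft, da der verbundene Host nicht reagiert hat', None, 10060, None))
    The request will be automatically repeated in 3 seconds...

(English translation of the message: "A connection attempt failed because the connected party did not properly respond after a period of time, or the established connection failed because the connected host did not respond.")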

Is there any way to handle this error, or to optimize my query and speed it up?

gregorleban commented 4 years ago

Hi, most of the processing time for this query is spent on the category dmoz/Business. One thing you can try is to replace it with "news/Business". It's a very similar category, except that it doesn't have thousands of subcategories to include in the query, so it will be computed much more efficiently.

Best, Gregor
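For reference, the suggested change could look roughly like the sketch below. It assumes the `eventregistry` package is installed; `build_query_params` and `run_query` are hypothetical helper names used only for illustration, and `YOUR_API_KEY` is a placeholder. The only difference from the original query is the `categoryUri` value.

```python
import datetime


def build_query_params(concept_uri, start_date, end_date,
                       category_uri="news/Business"):
    """Collect the keyword arguments for QueryArticlesIter.

    The default category is "news/Business" rather than "dmoz/Business",
    as suggested above, so the query does not have to expand thousands
    of subcategories.
    """
    return {
        "conceptUri": concept_uri,
        "categoryUri": category_uri,
        "lang": ["eng"],
        "dateStart": start_date,
        "dateEnd": end_date,
        "dataType": ["news", "pr"],
    }


def run_query(api_key, params, max_items=10):
    """Run the query and print each article (needs network access and a key)."""
    # Imported here so the helper above can be used without the package.
    from eventregistry import EventRegistry, QueryArticlesIter

    er = EventRegistry(apiKey=api_key)
    query = QueryArticlesIter(**params)
    for article in query.execQuery(er, sortBy="date", sortByAsc=True,
                                   maxItems=max_items):
        print(article)


params = build_query_params(
    "https://en.wikipedia.org/wiki/AbbVie_Inc.",
    datetime.date(2020, 2, 10),
    datetime.date(2020, 2, 26),
)
# run_query("YOUR_API_KEY", params)  # uncomment with a real API key
```

Keeping the parameters in one place makes it easy to loop over many company concept URIs while reusing the same category and date range.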

farshidbalan commented 4 years ago

It's much faster as you mentioned, thanks.