hyriver / pygeohydro

A part of HyRiver software stack for accessing hydrology data through web services
https://docs.hyriver.io

Rate Limiter on NLCD 2021 #125

Open colintle opened 3 weeks ago

colintle commented 3 weeks ago

What happened?

I am calling nlcd_bycoords() with a list of coordinates, and I keep getting the connection error below:

Traceback (most recent call last):
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\aiohttp\connector.py", line 1025, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\asyncio\base_events.py", line 1121, in create_connection
    raise exceptions[0]
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\asyncio\base_events.py", line 1103, in create_connection
    sock = await self._connect_sock(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\asyncio\base_events.py", line 1006, in _connect_sock
    await self.sock_connect(sock, address)
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\asyncio\selector_events.py", line 651, in sock_connect
    return await fut
           ^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\asyncio\selector_events.py", line 691, in _sock_connect_cb
    raise OSError(err, f'Connect call failed {address}')
TimeoutError: [Errno 10060] Connect call failed ('152.61.141.32', 443)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "\\?\C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Scripts\outage-map-script.py", line 33, in <module>
    sys.exit(load_entry_point('Outage-Map-API', 'console_scripts', 'outage-map')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\click\core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\users\co986387\documents\outageprobabilityapi\outage_map\import_dss_cli.py", line 87, in import_dss
    covers = getLandCover(coords)
             ^^^^^^^^^^^^^^^^^^^^
  File "c:\users\co986387\documents\outageprobabilityapi\outage_map\util\NetworkFunctions.py", line 134, in getLandCover
    land_usage_land_cover = gh.nlcd_bycoords(coords, years={"cover": [year]})
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\pygeohydro\nlcd.py", line 266, in nlcd_bycoords
    ds_list = [nlcd_wms.get_map(g, 30) for g in geoms]
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\pygeohydro\nlcd.py", line 147, in get_map
    r_dict = self.wms.getmap_bybox(bbox, resolution, self.crs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\pygeoogc\pygeoogc.py", line 471, in getmap_bybox
    rbinary = ar.retrieve_binary(
              ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\async_retriever\async_retriever.py", line 638, in retrieve_binary
    return retrieve(
           ^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\async_retriever\async_retriever.py", line 433, in retrieve
    resp = [r for _, r in sorted(tlz.concat(results))]
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\async_retriever\async_retriever.py", line 431, in <genexpr>
    results = (loop.run_until_complete(session(url_kwds=c)) for c in chunked_reqs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\asyncio\base_events.py", line 687, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\async_retriever\async_retriever.py", line 236, in async_session_with_cache
    return await asyncio.gather(*tasks)  # pyright: ignore[reportReturnType]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\async_retriever\_utils.py", line 81, in retriever
    async with session(url, **s_kwds) as response:
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\aiohttp\client.py", line 1197, in __aenter__
    self._resp = await self._coro
                 ^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\aiohttp_client_cache\session.py", line 80, in _request
    new_response = await super()._request(method, str_or_url, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\aiohttp\client.py", line 581, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\aiohttp\connector.py", line 544, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\aiohttp\connector.py", line 944, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\aiohttp\connector.py", line 1257, in _create_direct_connection
    raise last_exc
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\aiohttp\connector.py", line 1226, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\co986387\AppData\Local\anaconda3\envs\outage-map-api\Lib\site-packages\aiohttp\connector.py", line 1033, in _wrap_create_connection
    raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host www.mrlc.gov:443 ssl:default [Connect call failed ('152.61.141.32', 443)]
PS C:\Users\co986387\Documents\outageProbabilityAPI>

Is there a limit to the number of requests I can make?

What did you expect to happen?

No response

Minimal Complete Verifiable Example

No response

MVCE confirmation

Relevant log output

No response

Anything else we need to know?

No response

Environment

pygeohydro==0.16.5

cheginit commented 2 weeks ago

This is a common issue when sending many requests to a web service in a short period of time, i.e., hammering the service. HyRiver doesn't impose any rate limit, but the server may have a rate-limiting mechanism in place. For large requests, I recommend batching them (say, 100 points per batch), storing the results of each batch on disk, and then concatenating the results if needed. You can also sleep between batches to avoid hammering the service.
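
The batching approach described above could be sketched roughly as follows. This is only an illustration, not part of HyRiver: `fetch_in_batches`, its parameters, the pickle cache layout, and the 5-second default pause are all assumptions made for the example. The helper takes any `fetch` callable, so it can be tried without touching the web service. It also assumes that `coords` and `batch_size` stay the same between runs, since cached batches are matched by index only.

```python
import pickle
import time
from pathlib import Path


def fetch_in_batches(coords, fetch, cache_dir, batch_size=100, pause=5.0):
    """Call fetch(batch) on successive slices of coords, caching each result.

    Hypothetical helper: each batch result is pickled to cache_dir, so an
    interrupted run can resume without re-requesting finished batches.
    Returns the per-batch results as a list, in order.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for n, start in enumerate(range(0, len(coords), batch_size)):
        path = cache_dir / f"batch_{n:05d}.pkl"
        if path.exists():
            # Reuse a batch stored by an earlier run; no request, no pause.
            results.append(pickle.loads(path.read_bytes()))
            continue
        result = fetch(coords[start:start + batch_size])
        path.write_bytes(pickle.dumps(result))
        results.append(result)
        if start + batch_size < len(coords):
            # Sleep between batches to avoid hammering the service.
            time.sleep(pause)
    return results
```

With pygeohydro, `fetch` might be something like `lambda batch: gh.nlcd_bycoords(batch, years={"cover": [2021]})`, mirroring the call in the traceback, and the returned pieces could then be combined with `pandas.concat`. A suitable batch size and pause depend on the server and would need some experimentation.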