pyinat / pyinaturalist

Python client for iNaturalist
https://pyinaturalist.readthedocs.io
MIT License

TimeoutError: The write operation timed out #535

Closed by arky 6 months ago

arky commented 8 months ago

The problem

Sending create_observation fails intermittently. This might be due to a slow response from the API endpoint.

Expected behavior

Print an error suggesting a higher timeout or adding a sleep between requests.

Steps to reproduce the behavior


    from pyinaturalist import create_observation, enable_logging

    token = "My_Secret_token"
    enable_logging()

    create_observation(
        taxon_id=40268,
        observed_on_string="2023-12-17 18:27:29",
        time_zone="Bangkok",
        latitude="10.6122382",
        longitude="104.1774368",
        description="some description",
        access_token=token,
    )
---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/urllib3/connectionpool.py:790, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
    789 # Make the request on the HTTPConnection object
--> 790 response = self._make_request(
    791     conn,
    792     method,
    793     url,
    794     timeout=timeout_obj,
    795     body=body,
    796     headers=headers,
    797     chunked=chunked,
    798     retries=retries,
    799     response_conn=response_conn,
    800     preload_content=preload_content,
    801     decode_content=decode_content,
    802     **response_kw,
    803 )
    805 # Everything went great!

File ~/.local/lib/python3.10/site-packages/urllib3/connectionpool.py:496, in HTTPConnectionPool._make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
    495 try:
--> 496     conn.request(
    497         method,
    498         url,
    499         body=body,
    500         headers=headers,
    501         chunked=chunked,
    502         preload_content=preload_content,
    503         decode_content=decode_content,
    504         enforce_content_length=enforce_content_length,
    505     )
    507 # We are swallowing BrokenPipeError (errno.EPIPE) since the server is
    508 # legitimately able to close the connection after sending a valid response.
    509 # With this behaviour, the received response is still readable.

File ~/.local/lib/python3.10/site-packages/urllib3/connection.py:409, in HTTPConnection.request(self, method, url, body, headers, chunked, preload_content, decode_content, enforce_content_length)
    408         else:
--> 409             self.send(chunk)
    411 # Regardless of whether we have a body or not, if we're in
    412 # chunked mode we want to send an explicit empty chunk.

File /usr/lib/python3.10/http/client.py:999, in HTTPConnection.send(self, data)
    998 try:
--> 999     self.sock.sendall(data)
   1000 except TypeError:

File /usr/lib/python3.10/ssl.py:1266, in SSLSocket.sendall(self, data, flags)
   1265 while count < amount:
-> 1266     v = self.send(byte_view[count:])
   1267     count += v

File /usr/lib/python3.10/ssl.py:1235, in SSLSocket.send(self, data, flags)
   1232         raise ValueError(
   1233             "non-zero flags not allowed in calls to send() on %s" %
   1234             self.__class__)
-> 1235     return self._sslobj.write(data)
   1236 else:

TimeoutError: The write operation timed out

During handling of the above exception, another exception occurred:

ProtocolError                             Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/requests/adapters.py:486, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    485 try:
--> 486     resp = conn.urlopen(
    487         method=request.method,
    488         url=url,
    489         body=request.body,
    490         headers=request.headers,
    491         redirect=False,
    492         assert_same_host=False,
    493         preload_content=False,
    494         decode_content=False,
    495         retries=self.max_retries,
    496         timeout=timeout,
    497         chunked=chunked,
    498     )
    500 except (ProtocolError, OSError) as err:

File ~/.local/lib/python3.10/site-packages/urllib3/connectionpool.py:844, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
    842     new_e = ProtocolError("Connection aborted.", new_e)
--> 844 retries = retries.increment(
    845     method, url, error=new_e, _pool=self, _stacktrace=sys.exc_info()[2]
    846 )
    847 retries.sleep()

File ~/.local/lib/python3.10/site-packages/urllib3/util/retry.py:470, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
    469 if read is False or method is None or not self._is_method_retryable(method):
--> 470     raise reraise(type(error), error, _stacktrace)
    471 elif read is not None:

File ~/.local/lib/python3.10/site-packages/urllib3/util/util.py:38, in reraise(tp, value, tb)
     37 if value.__traceback__ is not tb:
---> 38     raise value.with_traceback(tb)
     39 raise value

File ~/.local/lib/python3.10/site-packages/urllib3/connectionpool.py:790, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
    789 # Make the request on the HTTPConnection object
--> 790 response = self._make_request(
    791     conn,
    792     method,
    793     url,
    794     timeout=timeout_obj,
    795     body=body,
    796     headers=headers,
    797     chunked=chunked,
    798     retries=retries,
    799     response_conn=response_conn,
    800     preload_content=preload_content,
    801     decode_content=decode_content,
    802     **response_kw,
    803 )
    805 # Everything went great!

File ~/.local/lib/python3.10/site-packages/urllib3/connectionpool.py:496, in HTTPConnectionPool._make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
    495 try:
--> 496     conn.request(
    497         method,
    498         url,
    499         body=body,
    500         headers=headers,
    501         chunked=chunked,
    502         preload_content=preload_content,
    503         decode_content=decode_content,
    504         enforce_content_length=enforce_content_length,
    505     )
    507 # We are swallowing BrokenPipeError (errno.EPIPE) since the server is
    508 # legitimately able to close the connection after sending a valid response.
    509 # With this behaviour, the received response is still readable.

File ~/.local/lib/python3.10/site-packages/urllib3/connection.py:409, in HTTPConnection.request(self, method, url, body, headers, chunked, preload_content, decode_content, enforce_content_length)
    408         else:
--> 409             self.send(chunk)
    411 # Regardless of whether we have a body or not, if we're in
    412 # chunked mode we want to send an explicit empty chunk.

File /usr/lib/python3.10/http/client.py:999, in HTTPConnection.send(self, data)
    998 try:
--> 999     self.sock.sendall(data)
   1000 except TypeError:

File /usr/lib/python3.10/ssl.py:1266, in SSLSocket.sendall(self, data, flags)
   1265 while count < amount:
-> 1266     v = self.send(byte_view[count:])
   1267     count += v

File /usr/lib/python3.10/ssl.py:1235, in SSLSocket.send(self, data, flags)
   1232         raise ValueError(
   1233             "non-zero flags not allowed in calls to send() on %s" %
   1234             self.__class__)
-> 1235     return self._sslobj.write(data)
   1236 else:

ProtocolError: ('Connection aborted.', TimeoutError('The write operation timed out'))

During handling of the above exception, another exception occurred:

ConnectionError                           Traceback (most recent call last)
Cell In[225], line 17
     15 specimage=plot_guano(file)
     16 enable_logging()
---> 17 create_observation(
     18     taxon_id=40268,  # Bats (Order Chiroptera
     19     species_guess="Bats",
     20     observed_on_string=time,
     21     time_zone="Bangkok",
     22     latitude=lat,
     23     longitude=long,
     24     description=description,
     25     sounds=file,
     26     photos=specimage,
     27     access_token=token,
     28     #dry_run=True
     29     )
     30 time.sleep(300)

File /usr/local/lib/python3.10/dist-packages/forge/_revision.py:328, in Revision.__call__.<locals>.inner(*args, **kwargs)
    324 @functools.wraps(callable)  # type: ignore
    325 def inner(*args, **kwargs):
    326     # pylint: disable=E1102, not-callable
    327     mapped = inner.__mapper__(*args, **kwargs)
--> 328     return callable(*mapped.args, **mapped.kwargs)

File /usr/local/lib/python3.10/dist-packages/pyinaturalist/v1/observations.py:420, in create_observation(**params)
    417 response_json = response.json()
    418 observation_id = response_json['id']
--> 420 upload(observation_id, photos=photos, sounds=sounds, photo_ids=photo_ids, **kwargs)
    421 return response_json

File /usr/local/lib/python3.10/dist-packages/pyinaturalist/v1/observations.py:538, in upload(observation_id, photos, sounds, photo_ids, **params)
    536 sound_params['observation_sound[observation_id]'] = observation_id
    537 for sound in sounds:
--> 538     response = post(
    539         f'{API_V1}/observation_sounds', files=sound, session=session, **sound_params
    540     )
    541     responses.append(response)
    543 # Attach previously uploaded photos by ID

File /usr/local/lib/python3.10/dist-packages/pyinaturalist/session.py:376, in post(url, session, **kwargs)
    374 """Wrapper around :py:func:`requests.post` with additional options specific to iNat API requests"""
    375 session = session or get_local_session()
--> 376 return session.request('POST', url, **kwargs)

File /usr/local/lib/python3.10/dist-packages/pyinaturalist/session.py:256, in ClientSession.request(self, method, url, headers, json, access_token, allow_redirects, allow_str_ids, dry_run, expire_after, files, ids, only_if_cached, raise_for_status, refresh, stream, timeout, verify, **params)
    220 """Wrapper around :py:func:`requests.request` with additional options specific to iNat API requests
    221 
    222 Args:
   (...)
    242     API response
    243 """
    244 request = self.prepare_inat_request(
    245     method=method,
    246     url=url,
   (...)
    253     params=params,
    254 )
--> 256 response = self.send(
    257     request,
    258     dry_run=dry_run,
    259     expire_after=expire_after,
    260     only_if_cached=only_if_cached,
    261     refresh=refresh,
    262     timeout=timeout,
    263     allow_redirects=allow_redirects,
    264     stream=stream,
    265     verify=verify,
    266 )
    268 # Raise an exception if the request failed (after retries are exceeded)
    269 if raise_for_status:

File /usr/local/lib/python3.10/dist-packages/pyinaturalist/session.py:305, in ClientSession.send(self, request, dry_run, expire_after, refresh, retries, timeout, **kwargs)
    303 # Otherwise, send the request
    304 read_timeout = timeout or self.timeout
--> 305 response = super().send(
    306     request,
    307     expire_after=expire_after,
    308     refresh=refresh,
    309     timeout=(CONNECT_TIMEOUT, read_timeout),
    310     **kwargs,
    311 )
    312 response = self._validate_json(
    313     request,
    314     response,
   (...)
    318     **kwargs,
    319 )
    321 logger.debug(format_response(response))

File /usr/local/lib/python3.10/dist-packages/requests_cache/session.py:205, in CacheMixin.send(self, request, expire_after, only_if_cached, refresh, force_refresh, **kwargs)
    203     response = self._resend(request, actions, cached_response, **kwargs)  # type: ignore
    204 elif actions.send_request:
--> 205     response = self._send_and_cache(request, actions, cached_response, **kwargs)
    206 else:
    207     response = cached_response  # type: ignore  # Guaranteed to be non-None by this point

File /usr/local/lib/python3.10/dist-packages/requests_cache/session.py:229, in CacheMixin._send_and_cache(self, request, actions, cached_response, **kwargs)
    225 """Send a request and cache the response, unless disabled by settings or headers.
    226 If applicable, also handle conditional requests.
    227 """
    228 request = actions.update_request(request)
--> 229 response = super().send(request, **kwargs)
    230 actions.update_from_response(response)
    232 if not actions.skip_write:

File /usr/local/lib/python3.10/dist-packages/requests_ratelimiter/requests_ratelimiter.py:87, in LimiterMixin.send(self, request, **kwargs)
     77 """Send a request with rate-limiting.
     78 
     79 Raises:
     80     :py:exc:`.BucketFullException` if this request would result in a delay longer than ``max_delay``
     81 """
     82 with self.limiter.ratelimit(
     83     self._bucket_name(request),
     84     delay=True,
     85     max_delay=self.max_delay,
     86 ):
---> 87     response = super().send(request, **kwargs)
     88     if response.status_code in self.limit_statuses:
     89         self._fill_bucket(request)

File ~/.local/lib/python3.10/site-packages/requests/sessions.py:703, in Session.send(self, request, **kwargs)
    700 start = preferred_clock()
    702 # Send the request
--> 703 r = adapter.send(request, **kwargs)
    705 # Total elapsed time of the request (approximately)
    706 elapsed = preferred_clock() - start

File ~/.local/lib/python3.10/site-packages/requests/adapters.py:501, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
    486     resp = conn.urlopen(
    487         method=request.method,
    488         url=url,
   (...)
    497         chunked=chunked,
    498     )
    500 except (ProtocolError, OSError) as err:
--> 501     raise ConnectionError(err, request=request)
    503 except MaxRetryError as e:
    504     if isinstance(e.reason, ConnectTimeoutError):
    505         # TODO: Remove this in 3.0.0: see #2811

ConnectionError: ('Connection aborted.', TimeoutError('The write operation timed out'))

Workarounds

Is there an existing workaround for this issue?


JWCook commented 8 months ago

Thanks for the heads up. That error message could definitely be friendlier.

This must be an intermittent issue on the iNat server side. Have you noticed this only with create_observation(), or also with read-only endpoints (like get_observations())? By default, pyinaturalist sends requests with a timeout of 10 seconds, and will retry failed requests up to 5 times with delays and exponential backoff. I could bump up the default timeout for create_observation(), maybe to something like 30 seconds?

If you want, you can modify these settings yourself with a ClientSession object. Example:


from pyinaturalist import create_observation, ClientSession

session = ClientSession(timeout=30, max_retries=10)
create_observation(
    ...,
    session=session,
)
arky commented 8 months ago

@JWCook I'm in Southeast Asia with a bit of latency, so sending a POST payload with JSON, an image, and an audio file causes the above error every other request. Thanks for the tip on using ClientSession with custom timeout and retries options; this should be covered in the docs, as it would have saved me valuable time.

I haven't noticed any timeouts on get_observations yet; perhaps I need to spend some time gathering metrics and report back here.

JWCook commented 8 months ago

Good point about slower internet connections. I'll bump up the default timeout, and add some more info about that to the docs.

a post payload with json, image and audio file

It's also worth noting that each media file is uploaded in a separate request. So if, for example, you need to create an observation with 10 images and 10 audio files, you don't need to increase your timeout by 10x.

arky commented 7 months ago

@JWCook What finally seems to work well is sending one observation every 10 seconds and handling any timeout errors with Python exceptions.

If an exception occurs, wait an additional 10-20 seconds before sending any new create_observation calls.

This backoff strategy seems to work for now. My guess is that the iNat servers have some write latency and respond badly if you send too many requests and file payloads too frequently.
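The strategy above can be sketched as a small loop. This is illustrative only; upload_with_backoff and its send callback are hypothetical helpers, not part of pyinaturalist:

```python
import time

def upload_with_backoff(observations, send, delay=10, backoff=20):
    """Send each observation via the `send` callable, pausing `delay`
    seconds between requests, and backing off an extra `backoff` seconds
    after a timeout before retrying the same observation."""
    results = []
    for obs in observations:
        while True:
            try:
                results.append(send(obs))
                break
            except (ConnectionError, TimeoutError):
                # Server is struggling; wait longer before retrying
                time.sleep(backoff)
        time.sleep(delay)
    return results
```

In practice, `send` would be a small wrapper around `create_observation(**obs)`.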

arky commented 7 months ago

@JWCook After further testing, I strongly suspect the problem is with the {API_V1}/observation_sounds endpoint, which seems to cause the most write errors and timeouts.

JWCook commented 7 months ago

Hmm, in iNat's backend I don't think there's much difference in how it handles image uploads vs audio uploads. Could it be correlated with file size? Roughly how large are the files you're uploading? And are you still seeing the same type of error (TimeoutError('The write operation timed out'))?

Also, there is already rate limiting built in (per_second, per_minute, and per_day parameters for ClientSession), but it currently doesn't go any slower than 1 request per second. That's something I could fix, though.

arky commented 7 months ago

@JWCook I'm perplexed by this; there seems to be no correlation with the size of the files. The server just gives up or stops responding, triggering a timeout error.

I'll also play around with combinations of the per_* ClientSession options to see what's happening here.

JWCook commented 7 months ago

Small update: if you pip install -U requests-ratelimiter, you can now use floats as rate limit values (for <1 request per second). For example:

    session = ClientSession(per_second=0.1)
arky commented 6 months ago

I'm using both the per_second setting of requests-ratelimiter and tenacity's retry method to manage uploads. I still get timeout errors regularly, but I'm slowly managing to upload the backlog of observations.


    from tenacity import retry, wait_exponential

    @retry(wait=wait_exponential(multiplier=10, min=10, max=360))
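For reference, a clamped exponential backoff schedule along these lines can be computed as below. This is a rough approximation of the kind of schedule wait_exponential produces; tenacity's exact formula may differ, and backoff_wait is a hypothetical helper:

```python
def backoff_wait(attempt, multiplier=10, min_wait=10, max_wait=360):
    """Wait time in seconds before the given retry attempt (0-based):
    exponential growth starting at `multiplier`, clamped to
    [min_wait, max_wait]."""
    return max(min_wait, min(max_wait, multiplier * 2 ** attempt))

# Schedule for the first few retries: 10, 20, 40, 80, 160, 320, 360, ...
```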
arky commented 6 months ago

@JWCook Shall we close this bug, as it seems to be caused mainly by the iNat servers' capacity to handle audio files? What do you reckon?

JWCook commented 6 months ago

Yeah, I don't think there's much more we can do about this on the client side. Sorry, I wish I could be of more help!

On the iNat server side, something that would help is support for uploading files in small chunks, and resuming failed partial uploads. As far as I know, that's not currently possible.

After thinking about it a bit more, it would be possible to set up an intermediate service that emulates this using a cloud storage provider.

Let me know if that's something you'd be interested in looking into. I may not have time to work on something like that right now, but I could at least give some pointers and examples.

arky commented 6 months ago

Thank you @JWCook. Perhaps I'll reach out to the iNat developers through NatGeo in the future about improving their upload process as you suggested.