comet-ml / opik

Open-source end-to-end LLM Development Platform
Apache License 2.0

[OPIK-354] Improve SDK robustness to connection issues #721

Closed alexkuzmik closed 2 days ago

alexkuzmik commented 3 days ago

Details

  1. Added retries to the Fern-generated clients. Retries are configured with the tenacity library, which was added to the dependencies.
  2. Updated the debug message format to include the timestamp, process, thread, logger, and line number.
  3. Updated the cookbooks that were missing the pandas dependency.
  4. Restricted the tokenizers dependency for Python 3.8, since it is broken on Python 3.8.
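
For illustration, a debug format carrying the fields listed in item 2 could look roughly like the sketch below. The format string, the `build_debug_handler` helper, and the `example_sdk` logger name are assumptions made for this example, not the exact values used in the PR.

```python
import logging

# Hypothetical format string: timestamp, process id, thread name,
# logger name, line number, level, and message. The PR's actual
# format may differ.
DEBUG_FORMAT = (
    "%(asctime)s [%(process)d:%(threadName)s] "
    "%(name)s:%(lineno)d %(levelname)s - %(message)s"
)


def build_debug_handler() -> logging.Handler:
    """Return a stream handler that emits DEBUG records in DEBUG_FORMAT."""
    handler = logging.StreamHandler()
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(logging.Formatter(DEBUG_FORMAT))
    return handler


# "example_sdk" is a placeholder logger name, not the SDK's real logger.
logger = logging.getLogger("example_sdk")
logger.setLevel(logging.DEBUG)
logger.addHandler(build_debug_handler())
logger.debug("sending request")
```

Including the process and thread identifiers makes it easier to tell apart interleaved messages when requests are sent from background workers.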

Issues

Resolves the issue where, if the connection expires while a request is being processed, the request fails and no further attempts are made to send it. The SDK is now more robust to connectivity issues.
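
The retry behavior described here can be sketched with the standard library alone. This is not the PR's implementation, which configures retries through tenacity on the generated clients; `call_with_retries`, its parameters, and the choice of `ConnectionError` as the retryable error are all assumptions made for illustration.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    max_attempts: int = 3,
    base_delay: float = 0.5,
) -> T:
    """Call func, retrying on the given exceptions with exponential backoff.

    Hypothetical helper: the last failure is re-raised once max_attempts
    calls have been made.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("max_attempts must be at least 1")
```

A request that fails because the connection dropped is sent again instead of being abandoned after the first error; errors outside `retry_on` propagate immediately.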

Testing

Testing was performed with the cookbooks GitHub Actions workflow, which was modified to install opik locally instead of downloading it from PyPI. Connection and protocol errors became significantly rarer, but such failures can still occur occasionally.