Open janbuchar opened 7 hours ago
@Mantisus
This is handled at the client level, not in the SDK. The client makes the requests and keeps statistics on rate limits (yes, it's about the 429 status codes). Crawlee then reads these statistics from the client.
https://github.com/apify/apify-client-js/blob/master/src/statistics.ts#L18 https://github.com/apify/crawlee/blob/master/packages/core/src/autoscaling/snapshotter.ts#L383
The only thing the SDK does here is switch the storage client to the Apify client on the platform, which I know you handle a bit differently in the Python versions.
In the Python SDK, we wrap the client instead of using it directly as a StorageManager. Also, the Python API client does not seem to collect those statistics (please correct me if I'm wrong).
Maybe, but that doesn't mean it shouldn't be implemented there, right? The SDK is not making requests, the client is.
This is necessary for apify/crawlee-python#60 to bring any benefit when running Crawlee on Apify.
Rate limit errors should have status code 429 (please recheck)
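For illustration, here is a minimal sketch of what the mechanism described above could look like on the Python side. It is not the actual API of either library: `Statistics`, `record_response`, `client_overloaded`, and the `retry_index` and `max_new_errors` values are all hypothetical names and numbers chosen for this example. The client counts 429 responses per retry attempt, and the consumer (Crawlee) compares the current count with the one from its previous snapshot.

```python
from dataclasses import dataclass, field


@dataclass
class Statistics:
    """Hypothetical per-client counters (names are assumptions, not a real API)."""

    requests: int = 0  # HTTP requests sent, including retries
    # rate_limit_errors[i] is the number of 429 responses seen on attempt i + 1
    rate_limit_errors: list[int] = field(default_factory=list)

    def add_rate_limit_error(self, attempt: int) -> None:
        """Record one 429 response received on the given 1-based attempt."""
        if attempt < 1:
            raise ValueError("attempt must be 1 or greater")
        # Grow the list with zeros so that index attempt - 1 exists.
        missing = attempt - len(self.rate_limit_errors)
        if missing > 0:
            self.rate_limit_errors.extend([0] * missing)
        self.rate_limit_errors[attempt - 1] += 1


def record_response(stats: Statistics, status_code: int, attempt: int) -> None:
    """Hypothetical hook the client's HTTP layer would call after each response."""
    stats.requests += 1
    if status_code == 429:
        stats.add_rate_limit_error(attempt)


def client_overloaded(
    previous_count: int,
    stats: Statistics,
    retry_index: int = 2,      # assumed: only look at errors on the 3rd attempt
    max_new_errors: int = 3,   # assumed threshold between two snapshots
) -> bool:
    """Return True if too many new rate limit errors appeared since the last snapshot."""
    errors = stats.rate_limit_errors
    current = errors[retry_index] if len(errors) > retry_index else 0
    return current - previous_count > max_new_errors
```

The point of the split is the same as in the comment above: the counting lives in the client, because that is where the requests and retries happen, and the SDK or Crawlee only reads the result.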