Open Pijukatel opened 5 days ago
Before we look deeper into this, did you cross-check this against the JS version? Does it do the same thing there?
In JS the concurrency setting works; here it doesn't.
Yes, in JS it is different. I read it incorrectly earlier. So it seems like there was a mistake when porting the code to Python.
JS: const minCurrentConcurrency = Math.floor(this._desiredConcurrency * this.desiredConcurrencyRatio);
Python: min_current_concurrency = math.floor(self._desired_concurrency_ratio * self.current_concurrency)
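To make the difference between the two lines concrete, here is a minimal sketch. The function names and the sample numbers are hypothetical and chosen only for illustration; this is not the actual crawlee code, just the two formulas quoted above pulled out into standalone functions.

```python
import math


def threshold_js_style(desired_concurrency: int, ratio: float) -> int:
    """Threshold as in the JS line: based on the *desired* concurrency."""
    return math.floor(desired_concurrency * ratio)


def threshold_python_port(current_concurrency: int, ratio: float) -> int:
    """Threshold as in the Python line: based on the *current* concurrency."""
    return math.floor(ratio * current_concurrency)


# Hypothetical sample state: the pool wants 10 tasks but only 3 are running.
desired, current, ratio = 10, 3, 0.5

js_threshold = threshold_js_style(desired, ratio)      # floor(10 * 0.5)
py_threshold = threshold_python_port(current, ratio)   # floor(0.5 * 3)

# The same comparison gives different answers depending on the formula used.
js_check = current >= js_threshold
py_check = current >= py_threshold
print(js_threshold, py_threshold, js_check, py_check)
```

With these sample numbers the JS-style threshold depends on the desired concurrency, so the comparison can fail, while the ported formula compares the current concurrency against a fraction of itself.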
Hm, then I might have made a mistake :slightly_smiling_face: Seems like an easy fix, right?
Yes, I will align the Python version with the JS version.
It seems to me that autoscaled_pool.desired_concurrency_ratio is currently completely useless.
It is set in __init__ here: https://github.com/apify/crawlee-python/blob/master/src/crawlee/_autoscaling/autoscaled_pool.py#L57
It is checked against the bounds 0 < desired_concurrency_ratio <= 1 here: https://github.com/apify/crawlee-python/blob/master/src/crawlee/_autoscaling/autoscaled_pool.py#L97
It is used in only one place: https://github.com/apify/crawlee-python/blob/master/src/crawlee/_autoscaling/autoscaled_pool.py#L198
And from there in the condition here: https://github.com/apify/crawlee-python/blob/master/src/crawlee/_autoscaling/autoscaled_pool.py#L202
That condition will always be true for the values currently enforced at runtime, 0 < desired_concurrency_ratio <= 1, because a non-negative number is never smaller than the floor of a fraction of itself:
self.current_concurrency >= math.floor(self._desired_concurrency_ratio * self.current_concurrency)
Did I miss something?
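A quick way to sanity-check the claim is a brute-force sweep. The sketch below uses a hypothetical helper that mirrors the quoted condition, and assumes the ratio lies in the range 0 < ratio <= 1 and that the current concurrency is a non-negative integer; the sweep ranges themselves are arbitrary.

```python
import math


def condition_as_ported(current_concurrency: int, ratio: float) -> bool:
    """Hypothetical helper mirroring the quoted condition."""
    return current_concurrency >= math.floor(ratio * current_concurrency)


# Ratios 0.01, 0.02, ..., 1.00 (assumed allowed range: 0 < ratio <= 1).
ratios = [i / 100 for i in range(1, 101)]

# Collect every (concurrency, ratio) pair for which the condition is False.
counterexamples = [
    (c, r)
    for c in range(0, 201)
    for r in ratios
    if not condition_as_ported(c, r)
]
print(len(counterexamples))
```

If the list of counterexamples stays empty, the ratio has no effect on the outcome of the check in this range, which matches the reasoning above.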