There was a limitation where SQLite ran into an error when a statement was built with a large number of placeholder values:
Error:
```
Traceback (most recent call last):
  File "C:/Repos/Inaturalist_research_scraper/iNatScraper_v2.0_dirty.py", line 90, in <module>
    for x in get_paged_identifications(unique_common_name[common_name], 5):
  File "C:/Repos/Inaturalist_research_scraper/iNatScraper_v2.0_dirty.py", line 55, in get_paged_identifications
    response = get_identifications(taxon_id=[taxon_id], per_page=items_per_page, page=1)
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\forge\_revision.py", line 328, in inner
    return callable(*mapped.args, **mapped.kwargs)
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\pyinaturalist\v1\identifications.py", line 67, in get_identifications
    identifications = get(f'{API_V1}/identifications', **params).json()
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\pyinaturalist\session.py", line 344, in get
    return session.request('GET', url, **kwargs)
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\pyinaturalist\session.py", line 255, in request
    verify=verify,
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\pyinaturalist\session.py", line 290, in send
    **kwargs,
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\requests_cache\session.py", line 206, in send
    response = self._send_and_cache(request, actions, cached_response, **kwargs)
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\requests_cache\session.py", line 230, in _send_and_cache
    response = super().send(request, **kwargs)
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\requests_ratelimiter\requests_ratelimiter.py", line 107, in send
    max_delay=self.max_delay,
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\pyrate_limiter\limit_context_decorator.py", line 66, in __enter__
    self.delayed_acquire()
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\pyrate_limiter\limit_context_decorator.py", line 82, in delayed_acquire
    self.try_acquire()
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\pyrate_limiter\limiter.py", line 105, in try_acquire
    bucket.get(volume - item_count)
  File "C:\Repos\Inaturalist_research_scraper\venv\lib\site-packages\pyrate_limiter\sqlite_bucket.py", line 120, in get
    self.connection.execute(f"DELETE FROM {self.table} WHERE idx IN ({placeholders})", keys)
sqlite3.OperationalError: too many SQL variables

Process finished with exit code 1
```
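
The underlying cause is that SQLite caps the number of host parameters allowed in a single statement (SQLITE_MAX_VARIABLE_NUMBER, 999 by default in older builds and 32766 in newer ones), so a query built with one `?` placeholder per key eventually exceeds the limit. A minimal standalone repro of the same failure, independent of pyrate_limiter (the table name and key count here are just made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bucket (idx INTEGER)")

# Build an IN (...) clause with one placeholder per key -- with enough keys,
# this exceeds SQLite's per-statement host-parameter limit.
keys = list(range(100_000))
placeholders = ", ".join(["?"] * len(keys))
conn.execute(f"DELETE FROM bucket WHERE idx IN ({placeholders})", keys)
# sqlite3.OperationalError: too many SQL variables
```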
I fixed this with a small code change that introduces chunking of the SQLite statement execution.
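The idea is simply to split the keys into fixed-size chunks and issue one DELETE per chunk, so no single statement exceeds the placeholder limit. A rough sketch of the approach (the function and the chunk size of 999 are my own illustration, mirroring the sqlite_bucket.py line in the traceback; the actual change is in the PR below):

```python
CHUNK_SIZE = 999  # assumed chunk size; keeps each statement under SQLite's default parameter cap

def delete_keys_chunked(connection, table, keys):
    """Delete rows by key in chunks so no single DELETE has too many placeholders."""
    for start in range(0, len(keys), CHUNK_SIZE):
        chunk = keys[start:start + CHUNK_SIZE]
        placeholders = ", ".join(["?"] * len(chunk))
        connection.execute(f"DELETE FROM {table} WHERE idx IN ({placeholders})", chunk)
```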
PR: https://github.com/vutran1710/PyrateLimiter/pull/84