mendableai / firecrawl

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
https://firecrawl.dev
GNU Affero General Public License v3.0
19.13k stars, 1.48k forks

[SDK] Added next handler for python sdk (js is ok) #880

Closed: rafaelsideguide closed this 2 weeks ago

nickscamara commented 2 weeks ago

We already have a next handler in the monitor job status itself. No?

while 'next' in status_data:
    status_response = self._get_request(status_data['next'], headers)
    status_data = status_response.json()
    data.extend(status_data['data'])

nickscamara commented 2 weeks ago

Is check_crawl_status not used anymore?

rafaelsideguide commented 2 weeks ago

@nickscamara A user on Discord tried to run async_crawl_url on a big page and retrieve the data with check_crawl_status (as we show in the docs: https://docs.firecrawl.dev/sdks/python), and the SDK can't retrieve all the documents. He had to write a function himself for that.
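
For illustration, a helper of the kind described above might look like the following minimal sketch. It assumes that each status response is a dict with a 'data' list and an optional 'next' URL, as in the loop quoted earlier in this thread. The names collect_all_documents and fetch_page are hypothetical and not part of the SDK; fetch_page stands in for whatever authenticated GET request the caller uses.

```python
from typing import Any, Callable, Dict, List

Page = Dict[str, Any]


def collect_all_documents(first_page: Page, fetch_page: Callable[[str], Page]) -> List[Any]:
    """Follow 'next' links from a status response and gather every document.

    `fetch_page` is a hypothetical callable (not part of the SDK) that takes a
    URL and returns the decoded JSON body of that page as a dict.
    """
    documents = list(first_page.get("data") or [])
    page = first_page
    seen_urls = set()  # guard against a 'next' link that loops back on itself
    while page.get("next"):
        next_url = page["next"]
        if next_url in seen_urls:
            break
        seen_urls.add(next_url)
        page = fetch_page(next_url)
        documents.extend(page.get("data") or [])
    return documents


# Usage with canned pages standing in for real HTTP responses.
fake_pages = {
    "page-2": {"data": ["doc3", "doc4"], "next": "page-3"},
    "page-3": {"data": ["doc5"]},
}
first = {"status": "completed", "data": ["doc1", "doc2"], "next": "page-2"}
all_docs = collect_all_documents(first, fake_pages.__getitem__)
print(all_docs)
```

Passing the fetch function in as a parameter keeps the pagination logic independent of any particular HTTP client, so it can be exercised with canned pages as shown.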

nickscamara commented 2 weeks ago

Oh I see. Makes sense. Thanks @rafaelsideguide

nickscamara commented 2 weeks ago