mendableai / firecrawl

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
https://firecrawl.dev
GNU Affero General Public License v3.0

[Improv] JS SDK checkCrawlStatus returning incomplete response #387

Open jhoseph88 opened 2 months ago

jhoseph88 commented 2 months ago

Describe the Bug

In the API docs, the /crawl/status/{jobId} endpoint advertises several response fields (current, total, current_step, and current_url) that are not present in the response returned by the Node SDK. I am seeing only data, partial_data, status, and success in the result of checkCrawlStatus. Would exposing these fields via JobStatusResponse be easy enough?

To Reproduce

Steps to reproduce the issue:

  1. Initialize an app object with new FirecrawlApp
  2. Run app.crawlUrl(...
  3. Run await app.checkCrawlStatus(jobId), where jobId is the ID of the job returned by the crawl request above
  4. Observe that the fields current, total, current_step, and current_url are missing from the response

Expected Behavior

The fields current, total, current_step, and current_url are present in the response.
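To make the gap between the observed and expected responses concrete, here is a minimal sketch that lists which progress fields are absent from a status object. The helper `missingProgressFields` and the placeholder values in `observed` are hypothetical and are not part of the SDK; only the field names come from this issue.

```typescript
// Field names taken from this issue; the helper itself is hypothetical.
const EXPECTED_PROGRESS_FIELDS = ["current", "total", "current_step", "current_url"] as const;

// Returns the progress fields that are absent (or undefined) on a status response.
function missingProgressFields(response: Record<string, unknown>): string[] {
  return EXPECTED_PROGRESS_FIELDS.filter((field) => response[field] === undefined);
}

// Shape reported above: only these four keys come back from checkCrawlStatus.
// The values are placeholders chosen for illustration.
const observed: Record<string, unknown> = {
  success: true,
  status: "active",
  data: null,
  partial_data: [],
};

// Logs the names of all four progress fields, since none are present.
console.log(missingProgressFields(observed));
```

In practice, the object returned by `await app.checkCrawlStatus(jobId)` could be passed to the same helper in place of `observed`.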

rafaelsideguide commented 2 months ago

Hey @jhoseph88 !

Thanks for pointing out the mismatch in the API docs and the node SDK responses. It looks like we need to update the JobStatusResponse to include the missing fields.

Would you be up for helping us fix this? Your assistance would greatly benefit everyone using the SDK. If you could also update the SDK's end-to-end tests to align with these changes, that would be perfect.

Let me know if you’re interested, or if there’s any other way I can assist.

jhoseph88 commented 2 months ago

@rafaelsideguide Pardon my delay! Absolutely. I've submitted https://github.com/mendableai/firecrawl/pull/391. I had to mark the missing fields as optional because they periodically end up as undefined in my local environment. I'll also point out that the following tests are failing for me locally:

This is likely due to my configuration.
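For readers following along, marking the fields as optional could look roughly like the sketch below. This is not the code from the pull request: the interface name `JobStatusResponseSketch`, the field types, and the helper `describeProgress` are assumptions made for illustration. Only the field names come from this thread.

```typescript
// Hypothetical sketch of a status type; the field types are assumed.
interface JobStatusResponseSketch {
  success: boolean;
  status: string;
  current?: number; // optional: may be undefined in some responses
  total?: number;
  current_step?: string;
  current_url?: string;
  data?: unknown;
  partial_data?: unknown[];
}

// Builds a progress label, falling back when the optional counters are undefined.
function describeProgress(res: JobStatusResponseSketch): string {
  if (res.current === undefined || res.total === undefined) {
    return `${res.status} (progress unavailable)`;
  }
  return `${res.status}: ${res.current}/${res.total}`;
}

console.log(describeProgress({ success: true, status: "active" }));
console.log(describeProgress({ success: true, status: "active", current: 3, total: 10 }));
```

Because the fields are optional, callers have to handle the undefined case explicitly, as `describeProgress` does, instead of assuming the counters are always populated.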