mendableai / firecrawl

🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
https://firecrawl.dev
GNU Affero General Public License v3.0

[BUG] Crawl on single page works on playground but fails via curl #231

Open calebpeffer opened 3 weeks ago

calebpeffer commented 3 weeks ago

Describe the Bug

For some odd reason, using the crawl endpoint on this link, https://www.tripadvisor.com/Restaurant_Review-g60763-d4418144-Reviews-Reichenbach_Hall-New_York_City_New_York.html, returns a single page on the playground but returns nothing via a curl request.

To Reproduce

Steps to reproduce the issue:

  1. Run URL in Playground
  2. Run /crawl via curl then hit the /getCrawl endpoint with the returned ID
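
For anyone who wants to script step 2 rather than type the curl calls by hand, here is a minimal Python sketch. It is not taken from the issue: the base URL, the `/crawl/status/<id>` path, the bearer-token header, and the `{"url": ...}` request body are all assumptions about the hosted API and may need adjusting to match the endpoints you actually call (the issue itself only names `/crawl` and `/getCrawl`). The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed base URL; not stated in the issue.
BASE_URL = "https://api.firecrawl.dev/v0"


def build_crawl_request(api_key, url, base_url=BASE_URL):
    """Build (but do not send) the POST request that starts a crawl."""
    body = json.dumps({"url": url}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        f"{base_url}/crawl",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_status_request(api_key, job_id, base_url=BASE_URL):
    """Build (but do not send) the GET request that fetches a crawl by ID."""
    return urllib.request.Request(
        f"{base_url}/crawl/status/{job_id}",  # assumed status path
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

To reproduce, call `send(build_crawl_request(key, page_url))`, read the returned ID from the response (the field name is not given in the issue), then pass it to `send(build_status_request(key, job_id))` and compare the result with what the playground shows.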

Expected Behavior

The crawl should return the content for a single page, matching the playground result.

Screenshots

The page working via playground:

[Image: Screenshot 2024-06-03 at 6 05 21 PM]

The page failing via curl:

[Image: 336270082-0287cb4b-38a7-4555-be6b-b31ea8037f2e]

Additional Context

Brought up by MN via email.

rafaelsideguide commented 3 days ago

@calebpeffer are there any updates on this one?