[Feature Request] Allow 3rd party crawling

hoarder-app / hoarder

A self-hostable bookmark-everything app (links, notes and images) with AI-based automatic tagging and full text search

https://hoarder.app

GNU Affero General Public License v3.0

3.36k stars 120 forks

[Feature Request] Allow 3rd party crawling #339

Open aaroneden opened 1 month ago

aaroneden commented 1 month ago

Allow connections to Zapier or to systems like FireCrawl for more robust crawling.

kamtschatka commented 1 month ago

From what I can see, those services only provide markdown, whereas link scraping in hoarder uses HTML, so the result would be more like a text bookmark than a link bookmark.

Judging from previous responses to issues like this, the intention is to keep hoarder clean rather than add all kinds of integrations for all kinds of (paid) services. Can't you use the CLI to push the markdown you scraped with those services to hoarder?
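
To make that suggestion a bit more concrete, here is a rough sketch of pushing already-scraped markdown into hoarder as a text bookmark over HTTP rather than through the CLI. Everything specific in it is an assumption on my part and should be checked against the hoarder API docs for your version: the `/api/v1/bookmarks` path, the `type`/`text` field names, and Bearer-token auth. The helper names `build_text_bookmark` and `build_request` are made up for illustration. The markdown itself is expected to come from whatever third-party service you use.

```python
import json
import urllib.request


def build_text_bookmark(markdown: str, source_url: str) -> dict:
    """Wrap scraped markdown as a text-bookmark payload.

    The "type" and "text" field names are assumptions, not confirmed API.
    The source URL is kept in the body so the origin of the note is not lost.
    """
    body = f"Source: {source_url}\n\n{markdown.strip()}"
    return {"type": "text", "text": body}


def build_request(server_addr: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the payload.

    The endpoint path and the Bearer auth scheme are assumptions.
    """
    url = server_addr.rstrip("/") + "/api/v1/bookmarks"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace with your own server address and API key.
    scraped = "# Example page\n\nMarkdown returned by the external scraper."
    payload = build_text_bookmark(scraped, "https://example.com/article")
    request = build_request("http://localhost:3000", "YOUR_API_KEY", payload)
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

This only builds the request, so it can be inspected before anything is sent. The result is a text bookmark, not a link bookmark, which matches the limitation described above.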

MohamedBassem commented 2 weeks ago

Hoarder currently supports browserless (via BROWSER_WEBSOCKET_URL), because that's the container used on Unraid for Chrome and because it still keeps hoarder agnostic to third-party providers.

We don't currently plan to support more 3rd party crawling unless there's strong demand from the community.