[request] Add support for authenticating to Keycloak

joostdecock commented 1 year ago

Description

Sites that are behind a Keycloak setup can't be scraped because the scraper currently can't authenticate to it.

Try scraping a site that's behind a Keycloak setup.

A competing scraper supports this and even includes instructions in the README on how to configure it.

It would be nice if we could have similar functionality.

The scraper gets redirected and will index the Keycloak site (if you configure it to), but it doesn't know how to authenticate.
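For illustration, here is a minimal sketch of what "authenticating" could look like from the scraper's side. It assumes the Keycloak client has the OAuth2 client-credentials grant (service accounts) enabled and that the protected site accepts a bearer token, neither of which is confirmed for any particular setup. The helper names `build_token_request` and `bearer_header` and the example host are hypothetical, and this is not necessarily how the other scraper implements it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(base_url, realm, client_id, client_secret):
    """Build (but do not send) a client-credentials token request.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Keycloak exposes an OpenID Connect token endpoint per realm.
    # Older Keycloak releases serve it under an extra "/auth" prefix;
    # include that prefix in base_url if your server needs it.
    token_url = (
        f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"
    )
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return Request(
        token_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def bearer_header(access_token):
    """Header the scraper would attach to each request (assumed scheme)."""
    return {"Authorization": f"Bearer {access_token}"}
```

Sending the request with `urllib.request.urlopen` should return a JSON body containing an `access_token` field, which would then be passed to `bearer_header`. Whether the site behind Keycloak honours that header depends on how the proxy or adapter in front of it is configured.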

Typesense Version: 0.24.1

OS: Debian Linux

jasonbosco commented 1 year ago

@joostdecock This seems to be the PR that adds Keycloak support in meilisearch docs-scraper: https://github.com/meilisearch/docs-scraper/pull/73.

Both the Typesense and Meilisearch docs scrapers are based on Algolia's docsearch-scraper, so the same set of changes should work when made in this repo.

Since I don't use Keycloak, it would be great if you could open a PR that adds this support based on the above PR and test it with your setup.

joostdecock commented 1 year ago

@jasonbosco Sounds good. I'll give it a stab.