typesense / typesense-docsearch-scraper

A fork of Algolia's awesome DocSearch Scraper, customized to index data in Typesense (an open source alternative to Algolia)
https://typesense.org/docs/guide/docsearch.html

[request] Add support for authenticating to Keycloak #39

Closed (joostdecock closed this 1 year ago)

joostdecock commented 1 year ago

Description

Sites that are behind a Keycloak setup can't be scraped because the scraper currently can't authenticate to it.

Steps to reproduce

Try scraping a site that's behind a Keycloak setup.

Expected Behavior

A competing scraper supports this and even includes instructions in the README on how to configure it.

It would be nice to have similar functionality here.

Actual Behavior

The scraper is redirected to the Keycloak login page and will index that page (if you configure it to), but it doesn't know how to authenticate.
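To illustrate the redirect described above, here is a minimal sketch of how one might detect that a request ended up on a Keycloak login page instead of the documentation site. It assumes Keycloak's standard OpenID Connect authorization path (`/realms/<realm>/protocol/openid-connect/auth`, optionally prefixed with `/auth` on older installs). The function name and the example hosts are hypothetical and are not part of the scraper.

```python
from urllib.parse import urlparse


def looks_like_keycloak_login(final_url: str) -> bool:
    """Return True if final_url points at a Keycloak OIDC authorization endpoint.

    Hypothetical helper: it only inspects the URL path and makes no request.
    """
    path = urlparse(final_url).path.rstrip("/")
    # Assumption: the standard Keycloak path layout is in use.
    return "/realms/" in path and path.endswith("/protocol/openid-connect/auth")


# Example URLs are made up for illustration.
redirected = "https://sso.example.com/realms/docs/protocol/openid-connect/auth?client_id=site"
normal = "https://docs.example.com/guide/"
print(looks_like_keycloak_login(redirected))
print(looks_like_keycloak_login(normal))
```

A check like this could be applied to the final URL of a response after redirects, to skip or flag pages that were never actually reached.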

Metadata

Typesense Version: 0.24.1

OS: Debian Linux

jasonbosco commented 1 year ago

@joostdecock This seems to be the PR that adds Keycloak support in meilisearch docs-scraper: https://github.com/meilisearch/docs-scraper/pull/73.

Both the Typesense and Meilisearch doc-scrapers are based on Algolia's docsearch-scraper, so the same set of changes should work when made in this repo.

Since I don't use Keycloak, it would be great if you could open a PR that adds this support based on the above PR and test it with your setup.
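For anyone picking this up, here is a rough sketch of what requesting a token from Keycloak can look like. It is not taken from the linked PR and may differ from what that PR does. It assumes the standard OpenID Connect token endpoint (`/realms/<realm>/protocol/openid-connect/token`) and the password grant, which only works if direct access grants are enabled for the client. The function name, host, realm, and credentials are all hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(base_url: str, realm: str, client_id: str,
                        username: str, password: str) -> Request:
    """Build (but do not send) a POST request for a Keycloak access token.

    Hypothetical helper; assumes the standard Keycloak token endpoint layout.
    """
    token_url = f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"
    form = urlencode({
        "grant_type": "password",  # assumption: direct access grants are enabled
        "client_id": client_id,
        "username": username,
        "password": password,
    }).encode("utf-8")
    return Request(
        token_url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values for illustration only.
req = build_token_request("https://sso.example.com", "docs", "scraper", "user", "secret")
print(req.get_method(), req.full_url)
```

Sending the request with `urllib.request.urlopen(req)` should return JSON containing an `access_token` field. Whether the protected site accepts that token as an `Authorization: Bearer` header or needs a session cookie instead depends on how the site is set up.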

joostdecock commented 1 year ago

@jasonbosco Sounds good. I'll give it a stab.