jina-ai / reader

Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
https://jina.ai/reader
Apache License 2.0
6.24k stars, 486 forks

Reader only returning the cookie banner #56

Open NicolasAmadori opened 3 months ago

NicolasAmadori commented 3 months ago

Whenever I try to access a page on the unibo.it domain that shows a cookie banner, Reader returns only the content of the banner itself.

Here's an example: https://r.jina.ai/https://www.unibo.it/it/ateneo/organizzazione-e-sedi/servizi-di-ateneo/servizi-online/servizi-online-per-studenti/guida-servizi-online-studenti/liste-di-distribuzione-docenti-studenti

However, when I set the x-respond-with header (to any type), the full page content is returned correctly.

nomagick commented 3 months ago

The default return timing didn't work for this page. You can manually specify the part of the page you are interested in.

Try with our new x-target-selector header:

curl 'https://r.jina.ai/https://www.unibo.it/it/ateneo/organizzazione-e-sedi/servizi-di-ateneo/servizi-online/servizi-online-per-studenti/guida-servizi-online-studenti/liste-di-distribuzione-docenti-studenti' -H 'x-target-selector: #content'
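
For anyone calling Reader from a script rather than curl, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names `build_reader_request` and `fetch` are made up for illustration, and the only things taken from this thread are the `https://r.jina.ai/` prefix and the `x-target-selector` and `x-respond-with` header names.

```python
import urllib.request

# Prefix from the project description: prepend it to the page URL.
READER_PREFIX = "https://r.jina.ai/"


def build_reader_request(page_url, target_selector=None, respond_with=None):
    """Build (but do not send) a Reader request for page_url.

    Hypothetical helper for illustration. The header names come from
    this thread; which values the service accepts is not checked here.
    """
    headers = {}
    if target_selector is not None:
        # CSS selector of the element you care about, e.g. "#content".
        headers["x-target-selector"] = target_selector
    if respond_with is not None:
        # The workaround mentioned in the original report.
        headers["x-respond-with"] = respond_with
    return urllib.request.Request(READER_PREFIX + page_url, headers=headers)


def fetch(request, timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    page = (
        "https://www.unibo.it/it/ateneo/organizzazione-e-sedi/"
        "servizi-di-ateneo/servizi-online/servizi-online-per-studenti/"
        "guida-servizi-online-studenti/liste-di-distribuzione-docenti-studenti"
    )
    request = build_reader_request(page, target_selector="#content")
    print(fetch(request))
```

Building the request separately from sending it makes it easy to inspect the final URL and headers before any network call is made.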