Open mb21 opened 1 week ago
We have introduced a script injection mechanism to our API. Inside the page, we also provide these utility functions and this event:
- `waitForSelector(selector: string): Promise<HTMLElement>`
  waits for the selector to appear in the DOM
- `simulateScroll(): void`
  simulates scrolling to the bottom of the page to trigger lazy-loaded elements
- `"mutationIdle"` event on `document`
  fires once the DOM has had no mutations for 200 ms
See https://github.com/jina-ai/reader/issues/150 for an example.
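As a rough illustration of how the helpers listed above could be combined in an injected script, here is a minimal sketch. It assumes both helpers are reachable as properties of `window` (this thread only shows that for `simulateScroll`); the function name `scrollUntilPresent` and the `#comments` selector are hypothetical.

```javascript
// Sketch only: keep scrolling whenever the DOM settles, until an element
// matching `selector` shows up. `page` and `doc` are passed in so the
// function does not depend on browser globals directly.
async function scrollUntilPresent(selector, page, doc) {
  // Each time the DOM has gone idle, scroll to the bottom again to
  // trigger further lazy loading.
  const onIdle = () => page.simulateScroll();
  doc.addEventListener("mutationIdle", onIdle);
  page.simulateScroll(); // kick off the first round
  try {
    // Resolves once an element matching the selector is in the DOM.
    return await page.waitForSelector(selector);
  } finally {
    // Stop scrolling once the wait has finished (or failed).
    doc.removeEventListener("mutationIdle", onIdle);
  }
}

// Only run inside a browser page, where the helpers are injected.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  scrollUntilPresent("#comments", window, document).catch(console.error);
}
```

Removing the listener in `finally` keeps the page from scrolling indefinitely after the target element has appeared.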
Thanks! It seems that `curl ... --data-urlencode 'injectPageScript=document.addEventListener("mutationIdle", window.simulateScroll);'` should indeed work for this; I'll give it a try. Feel free to close this issue then.
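For anyone sending the request from code rather than curl, the script has to be URL-encoded as a form field in the same way. A minimal sketch of building that body (the variable names are illustrative, and the endpoint is left out, as in the command above):

```javascript
// The script to inject, exactly as in the curl command.
const script =
  'document.addEventListener("mutationIdle", window.simulateScroll);';

// URLSearchParams produces an application/x-www-form-urlencoded string,
// so quotes, parentheses and spaces in the script are escaped.
const body = new URLSearchParams({ injectPageScript: script }).toString();

console.log(body);
```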
We already have the `x-timeout` header, which works for a lot of JavaScript-heavy websites. But some websites lazy-load certain things only when you scroll down a bit.

Therefore, I propose an `x-scroll` header, which would basically execute the following JS after the page has finished loading:

(I'm pretty sure 'smooth' scrolling triggers any IntersectionObservers between the top and the bottom of the page.)

As soon as that's done and the event loop is empty, execute it again. Repeat until either scrolling down no longer increases the page's height, or `x-timeout` is reached.
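The repeat-until-stable loop described above could look roughly like the sketch below. This is not Reader's implementation: every name here (`scrollUntilStable`, `env`, `browserEnv`) is hypothetical, and the fixed 500 ms delay is only an arbitrary stand-in for "the event loop is empty".

```javascript
// Scroll to the bottom repeatedly until the page stops growing or the
// timeout is reached. `env` abstracts the page so the loop can also be
// exercised outside a browser. Returns the number of scroll rounds.
async function scrollUntilStable(env, timeoutMs) {
  const deadline = env.now() + timeoutMs;
  let lastHeight = env.getHeight();
  let rounds = 0;
  while (env.now() < deadline) {
    env.scrollToBottom();
    await env.settle(); // give lazy-loaded content time to arrive
    rounds += 1;
    const height = env.getHeight();
    if (height <= lastHeight) break; // scrolling no longer expands the page
    lastHeight = height;
  }
  return rounds;
}

// Browser bindings for the loop above (only meaningful inside a page).
const browserEnv = () => ({
  now: () => Date.now(),
  getHeight: () => document.documentElement.scrollHeight,
  scrollToBottom: () =>
    window.scrollTo({
      top: document.documentElement.scrollHeight,
      behavior: "smooth",
    }),
  settle: () => new Promise((resolve) => setTimeout(resolve, 500)),
});
```

In a page this would be called as `scrollUntilStable(browserEnv(), timeoutMs)`, with `timeoutMs` taken from whatever `x-timeout` is set to.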