Y2Z / monolith

⬛️ CLI tool for saving complete web pages as a single HTML file
https://crates.io/crates/monolith
Creative Commons Zero v1.0 Universal

Site doesn't work, redirected towards ct.captcha-delivery.com #387

Open djjldoooo opened 3 weeks ago

djjldoooo commented 3 weeks ago

Hello, I'm trying to save this page (or any page from this site) with the following command:

chromium --headless --incognito --virtual-time-budget=10000 --dump-dom https://www.leboncoin.fr/ad/caravaning/2751571570 | monolith - -I -b https://www.leboncoin.fr/ad/caravaning/2751571570 -o 2751571570.html

[0623/101112.161528:WARNING:bluez_dbus_manager.cc(248)] Floss manager not present, cannot set Floss enable/disable.
libva error: /usr/lib/dri/i965_drv_video.so init failed
[0623/101112.315945:WARNING:sandbox_linux.cc(430)] InitializeSandbox() called with multiple threads in process gpu-process.
https://ct.captcha-delivery.com/c.js
https://geo.captcha-delivery.com/captcha/?initialCid=AHrlqAAAAAMAphgrd3tXKAQAsIJwQA%3D%3D&hash=05B30BD9055986BD2EE8F5A199D973&cid=8qX6CmJt0yEKBQIbInWwsv8EXUe~D4KlOEqg95h4WdNcfHtOWb590vh1Ua~cGs9wo_FQpxhUBKnJargVVm1lLSh3xw6FQrZC~Brn~ftxDAEyTiImmyK8fz57ex9ObzMn&t=bv&referer=https%3A%2F%2Fwww.leboncoin.fr%2Fad%2Fcaravaning%2F2751571570&s=2089&e=8677dc0e8f5150942b4957920ff46913a0423b473247a88fb3d04e3e377fda86&dm=cd
https://static.captcha-delivery.com/captcha/assets/tpl/6dc485c0c428c35b53577b146dc6f9179f55ef9ad41b327a2a179998839364bf/index.css
https://static.captcha-delivery.com/common/fonts/roboto/font-face.css
https://static.captcha-delivery.com/common/fonts/roboto/roboto.woff2
https://static.captcha-delivery.com/common/fonts/roboto/roboto.woff
https://static.captcha-delivery.com/captcha/assets/set/26e53713afb2da82d031f045066fa4b2a47b0733/logo.png?update_cache=520684889894581620
https://static.captcha-delivery.com/captcha/assets/tpl/6dc485c0c428c35b53577b146dc6f9179f55ef9ad41b327a2a179998839364bf/loading_spinner.gif

However, when I open the page manually in Chromium, it loads without any CAPTCHA.

snshn commented 3 weeks ago

Maybe you could try it without --incognito? That way the existing CAPTCHA session should be inherited, and the site should think you're the real user. If all else fails, you could always use "Save page as" in the browser and then run monolith on the resulting file:/// URL to convert it into one monolithic HTML file.
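
The "Save page as" fallback could look roughly like the sketch below. It assumes the page was saved from the browser as a complete web page; the SAVED_PAGE location and the to_file_url helper are hypothetical names chosen for illustration, not part of monolith. No -b base URL is passed here, on the assumption that the saved copy references its assets through relative local paths.

```shell
#!/bin/sh
# Hypothetical location of the page saved via the browser's "Save page as";
# adjust it to wherever the browser actually wrote the file.
SAVED_PAGE="${SAVED_PAGE:-$HOME/Downloads/2751571570.html}"

# Illustrative helper: turn an absolute path into a file:// URL.
# It does no percent-encoding, so it assumes a path without spaces
# or other special characters.
to_file_url() {
    printf 'file://%s\n' "$1"
}

# Only run monolith when both the tool and the saved page are present.
if command -v monolith >/dev/null 2>&1 && [ -f "$SAVED_PAGE" ]; then
    # -I and -o are the same options used in the original command above.
    monolith "$(to_file_url "$SAVED_PAGE")" -I -o 2751571570.html
else
    echo "monolith or $SAVED_PAGE not found; nothing was bundled" >&2
fi
```

Because the browser session that already passed the CAPTCHA produced the saved copy, monolith only has to bundle local files and never contacts the protected site itself.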