internetarchive / brozzler

brozzler - distributed browser-based web crawler
Apache License 2.0

run yt-dlp after brozzling a page (if at all) #276

Closed (galgeek closed 4 months ago)

galgeek commented 4 months ago

brozzler should capture the page itself before capturing video with yt-dlp.

vbanos commented 4 months ago

Improvement suggestion: browse_page also gets the HTTP status of the response. You can access it via browser.websock_thread.page_status. If it's not 200, you shouldn't run yt-dlp.
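
A rough sketch of that check, to make the suggestion concrete. Only browser.websock_thread.page_status comes from the comment above. The helper names (should_run_ytdlp, brozzle_then_maybe_ytdlp) and the browse_page / run_ytdlp callables are hypothetical stand-ins, not brozzler's actual function signatures, and treating a missing status as "don't run" is an assumption.

```python
from types import SimpleNamespace


def should_run_ytdlp(browser):
    """Return True only when the brozzled page came back with HTTP 200."""
    # page_status is the attribute named in the review comment; getattr
    # covers the case where it was never set (assumed to mean "skip").
    status = getattr(browser.websock_thread, "page_status", None)
    return status == 200


def brozzle_then_maybe_ytdlp(browser, browse_page, run_ytdlp):
    """Capture the page first, then run yt-dlp only if the page loaded OK.

    browse_page and run_ytdlp are hypothetical callables standing in for
    the real steps.
    """
    outlinks = browse_page(browser)  # the page is always captured first
    video = run_ytdlp(browser) if should_run_ytdlp(browser) else None
    return outlinks, video


# Stand-in browser object for illustration: a 404 page skips yt-dlp.
fake_browser = SimpleNamespace(websock_thread=SimpleNamespace(page_status=404))
print(should_run_ytdlp(fake_browser))
```

Keeping the status check in its own small function means the ordering (page first, video second) and the skip rule can each be tested without a real browser.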

vbanos commented 4 months ago

That's all from me. I think this is a good improvement.