internetarchive / brozzler

brozzler - distributed browser-based web crawler
Apache License 2.0

skip yt-dlp for PDFs #268

Closed by galgeek 8 months ago

galgeek commented 8 months ago

yt-dlp is re-capturing PDFs, unhelpfully.

... plus a couple of overdue requirements updates.
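
As a rough illustration of the idea in this PR (not the actual patch), the check could look something like the sketch below. The function name `should_run_ytdlp` and its parameters are hypothetical and are not taken from brozzler's code. The sketch assumes the crawler already knows the response's `Content-Type` header and falls back to the URL path only when that header is missing.

```python
from typing import Optional
from urllib.parse import urlsplit


def should_run_ytdlp(content_type: Optional[str], url: str) -> bool:
    """Hypothetical helper: decide whether to hand a page to yt-dlp.

    Returns False for PDFs so they are not captured a second time.
    """
    # Drop parameters such as "; charset=..." and normalize case.
    mime = (content_type or "").split(";", 1)[0].strip().lower()
    if mime == "application/pdf":
        return False
    # Assumption: with no Content-Type available, a ".pdf" path is
    # treated as a PDF. Query strings are ignored by using the path only.
    if not mime and urlsplit(url).path.lower().endswith(".pdf"):
        return False
    return True
```

For example, `should_run_ytdlp("application/pdf", "https://example.org/doc")` returns `False`, while an HTML page is still passed along to yt-dlp. Preferring the header over the file extension avoids skipping HTML pages whose URLs happen to end in `.pdf`.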