harvard-lil / scoop

🍨 High-fidelity, browser-based, single-page web archiving library and CLI for witnessing the web.
MIT License
117 stars 8 forks

Skip video summary production if yt-dlp fails silently #377

rebeccacremona closed 1 week ago

rebeccacremona commented 1 week ago

Background

We recently started capturing with --capture-video-as-attachment true in Perma.cc. We found that, quite frequently, yt-dlp fails to extract a video or retrieve any metadata about it, but still exits with code 0 as though it had succeeded.

Examples:

(Use the RWP side menu to view Pages, and navigate to "Extracted video data")

[Screenshots: 2024-11-04 at 4:04:41 PM and 2024-11-04 at 4:04:50 PM]

This PR

This PR makes the tiniest of tweaks:

I may not have added these tweaks idiomatically; happy to adjust in any way.
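
For illustration only, here is a minimal sketch of the kind of check the title describes. It is not the code in this PR and not Scoop's actual API: `parseYtDlpOutput` and `shouldProduceVideoSummary` are hypothetical names, and the sketch assumes yt-dlp was run with `--dump-json`, which prints one JSON object per video on stdout.

```javascript
// Hypothetical helper (not part of Scoop): interprets the outcome of a
// `yt-dlp --dump-json <url>` run. Assumes `status` is the process exit code
// and `stdout` is the captured standard output as a string.
function parseYtDlpOutput ({ status, stdout }) {
  // A non-zero exit code is an explicit failure: nothing to summarize.
  if (status !== 0) return []

  const videos = []
  for (const line of String(stdout ?? '').split('\n')) {
    if (line.trim() === '') continue
    try {
      const entry = JSON.parse(line)
      // Keep only non-empty JSON objects; anything else is not video metadata.
      const isObject = entry !== null && typeof entry === 'object' && !Array.isArray(entry)
      if (isObject && Object.keys(entry).length > 0) videos.push(entry)
    } catch {
      // Ignore lines that are not JSON (for example, warnings).
    }
  }
  return videos
}

// Hypothetical guard: exit code 0 alone is not trusted as proof of success.
// The summary is produced only if at least one metadata entry was parsed.
function shouldProduceVideoSummary (result) {
  return parseYtDlpOutput(result).length > 0
}
```

The object returned by Node's `spawnSync('yt-dlp', ['--dump-json', url], { encoding: 'utf8' })` has the `status` and `stdout` fields this sketch expects, so it could be passed in directly. A silent failure (exit code 0 with no usable output) then results in the summary being skipped.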

matteocargnelutti commented 1 week ago

I made a few suggestions that you should feel free to ignore. This is great, thanks @rebeccacremona

rebeccacremona commented 1 week ago

Thank you @matteocargnelutti! This is exactly what I meant by "I may not have added these tweaks idiomatically."