meedan / check

Development environment for Meedan Check, a collaborative media annotation platform
https://meedan.com/check
MIT License
125 stars, 53 forks

Serverless function for video archiving #12

Closed (infojunkie closed 1 year ago)

infojunkie commented 3 years ago

Tell us about your request

Create a serverless function that uses youtube-dl to perform video archiving. The function should emulate Pender's current video archiver, which stores the output of youtube-dl in an S3 bucket. Eventually (in another issue) this function will be integrated into Check: specifically, Pender will call it as a new archiving provider that replaces the current MediaVideoArchiver.
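For illustration only, the core of such a function might look like the sketch below. It assumes Python, a `youtube-dl` executable on the PATH, and `boto3` for S3 access. The function names, the `video` prefix, and the key layout are hypothetical and are not taken from Pender's MediaVideoArchiver.

```python
import os
import subprocess
import tempfile


def build_command(video_url, out_dir):
    """Return the youtube-dl invocation that saves one video into out_dir."""
    return [
        "youtube-dl",
        "--no-playlist",          # archive only the requested video
        "--restrict-filenames",   # keep file names ASCII, without spaces
        "--output", os.path.join(out_dir, "%(id)s.%(ext)s"),
        video_url,
    ]


def s3_key(prefix, filename):
    """Hypothetical key layout: <prefix>/<filename>."""
    return f"{prefix.strip('/')}/{filename}"


def archive_video(video_url, bucket, prefix="video"):
    """Download video_url with youtube-dl and upload every output file to S3."""
    import boto3  # imported lazily so the helpers above work without AWS libraries

    s3 = boto3.client("s3")
    uploaded = []
    with tempfile.TemporaryDirectory() as out_dir:
        # check=True raises CalledProcessError if youtube-dl fails
        subprocess.run(build_command(video_url, out_dir), check=True)
        for filename in sorted(os.listdir(out_dir)):
            key = s3_key(prefix, filename)
            s3.upload_file(os.path.join(out_dir, filename), bucket, key)
            uploaded.append(key)
    return uploaded
```

Downloading into a temporary directory keeps the function stateless between invocations, which suits a serverless runtime with only scratch storage.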

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

Archiving video media is an essential part of fact-checking, investigative, and human-rights workflows. It is hard because video hosting platforms routinely take down videos, especially those related to sensitive or controversial issues. In some cases, preserving this media is essential for building investigation reports or court cases.

Are you currently working around the issue?

Using youtube-dl manually and uploading the downloaded video to Check.

Implementation hints

parshnt commented 3 years ago

Hello @infojunkie, I would love to work on this issue.

I've built a similar function with GitHub Actions to archive videos from a YouTube playlist using youtube-dl.

infojunkie commented 3 years ago

@parshnt Sounds great, feel free to submit a PR against the repo https://github.com/meedan/varcissus

parshnt commented 3 years ago

Hello, @infojunkie

Due to some personal commitments and lack of time, I won't be able to work on this issue. Apologies. You can go ahead and assign it to someone else who's up for working on it.

Pradyumn commented 3 years ago

Hey, I would like to work on this. Just wanted to know how are we supposed to trigger the function and tell which video to download?

Pradyumn commented 3 years ago

Can I work on this?

danielafeitosa commented 3 years ago

Hi @Pradyumn! Are you still interested in working on this issue?

Just wanted to know how are we supposed to trigger the function and tell which video to download?

The serverless function should expose an API endpoint to receive requests. Calling that endpoint triggers the function, and each request must include the URL of the video to download.
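As a rough sketch of that contract, the handler below reads a `url` field from a JSON request body, rejects anything that is not an HTTP(S) URL, and otherwise hands the URL to an archiving step. The event shape (a dict with a string `body`, as in AWS API Gateway proxy events), the `url` field name, the status codes, and the `archive` hook are all assumptions made for illustration, not the Varcissus interface.

```python
import json
from urllib.parse import urlparse


def parse_video_url(event):
    """Extract and validate the "url" field from a JSON request body."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return None
    url = body.get("url") if isinstance(body, dict) else None
    if not isinstance(url, str):
        return None
    parts = urlparse(url)
    if parts.scheme in ("http", "https") and parts.netloc:
        return url
    return None


def handler(event, context=None, archive=None):
    """Entry point: respond 400 on a bad request, otherwise start archiving."""
    url = parse_video_url(event)
    if url is None:
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "missing or invalid url"}),
        }
    if archive is not None:
        archive(url)  # hypothetical hook for the actual archiving step
    return {"statusCode": 202, "body": json.dumps({"url": url})}
```

A caller such as Pender would then POST a body like `{"url": "https://example.com/watch?v=1"}` to the endpoint.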

You should use the prototype code as an example. Starting from Varcissus, you need to: