WGBH-MLA / mlavalon

Apache License 2.0

Create transcoding process for NEH grant #130

Closed singlesoliloquy closed 3 years ago

singlesoliloquy commented 4 years ago

We need to develop a process to identify files that have arrived in the nehdigitization object store bucket, create proxies of these files using dockerized ffmpeg, and upload the proxy files back to the nehdigitization bucket so they can be accessed by the NEH Workflow Tracking database.

Done When

Files can be automatically transcoded when they arrive in mla-digitized object store bucket.

rfraimow commented 4 years ago

FFmpeg script:

for .mkv or .dv files: ffmpeg -i [input file] -vcodec libx264 -pix_fmt yuv420p -b:v 711k -s 480:360 -acodec aac -ac 2 -b:a 128k -metadata creation_time=now [output file]

for .wav files: ffmpeg -i [input file] -acodec aac [output file]

File names should be stripped of their original extension (.mkv, .dv, .wav) and both audio and video proxy files should have an .mp4 extension.
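For illustration, the two commands and the naming rule above could be wrapped in a small helper along these lines. This is only a sketch: the function names `proxy_name` and `ffmpeg_args` are hypothetical, and the choice to match extensions case-insensitively is an assumption not stated in this ticket. The ffmpeg flags are copied unchanged from the commands above.

```python
import posixpath
from typing import List

# Extensions named in this ticket; matching them case-insensitively is an assumption.
VIDEO_EXTS = {".mkv", ".dv"}
AUDIO_EXTS = {".wav"}


def proxy_name(input_name: str) -> str:
    """Strip the original extension and append .mp4 (hypothetical helper)."""
    root, _ext = posixpath.splitext(input_name)
    return root + ".mp4"


def ffmpeg_args(input_name: str) -> List[str]:
    """Build the ffmpeg argument list for one master file (hypothetical helper)."""
    ext = posixpath.splitext(input_name)[1].lower()
    output_name = proxy_name(input_name)
    if ext in VIDEO_EXTS:
        # Video proxy: same flags as the .mkv/.dv command above.
        return [
            "ffmpeg", "-i", input_name,
            "-vcodec", "libx264", "-pix_fmt", "yuv420p",
            "-b:v", "711k", "-s", "480:360",
            "-acodec", "aac", "-ac", "2", "-b:a", "128k",
            "-metadata", "creation_time=now",
            output_name,
        ]
    if ext in AUDIO_EXTS:
        # Audio proxy: same flags as the .wav command above.
        return ["ffmpeg", "-i", input_name, "-acodec", "aac", output_name]
    raise ValueError(f"unsupported master file extension: {ext!r}")
```

The returned list could be passed to `subprocess.run(args, check=True)` inside whatever container provides ffmpeg; how that container is invoked is outside the scope of this sketch.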

Proxy files should be placed back into the nehdigitization bucket with the same prefix as the master files. For example, if the master file key was "23540/barcode267586/PreservationMaster/barcode267586_01.wav", the proxy file key should be "23540/barcode267586/PreservationMaster/barcode267586_01.mp4".
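As a rough sketch of that key mapping, and of how a job might decide which masters still need a proxy, something like the following could work. The names `proxy_key` and `masters_needing_proxy` are hypothetical, and the sketch assumes the list of keys has already been fetched from the bucket by some other means.

```python
import posixpath
from typing import Iterable, List

# Master file extensions named in this ticket.
MASTER_EXTS = {".mkv", ".dv", ".wav"}


def proxy_key(master_key: str) -> str:
    """Map a master object key to its proxy key: same prefix, .mp4 extension."""
    root, ext = posixpath.splitext(master_key)
    if ext.lower() not in MASTER_EXTS:
        raise ValueError(f"not a master file key: {master_key!r}")
    return root + ".mp4"


def masters_needing_proxy(keys: Iterable[str]) -> List[str]:
    """Return master keys whose proxy key is not among the given keys."""
    existing = set(keys)
    return sorted(
        key
        for key in existing
        if posixpath.splitext(key)[1].lower() in MASTER_EXTS
        and proxy_key(key) not in existing
    )
```

One thing to check under this scheme: an audio master and a video master that share a prefix and base name would map to the same proxy key.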

Please let me know if I can provide any more information for this ticket that would be helpful!

foglabs-zen commented 4 years ago

Met with drew/jason re: implementation details, created tickets for subtasks.

Will follow up with rebecca about nehdigitization bucket workflows.