sciencehistory / scihist_digicoll

Science History Institute Digital Collections

Combined audio derivatives take a long time to fail when the originals are clearly bad. #2044

Closed jrochkind closed 1 year ago

jrochkind commented 1 year ago

This is not a bug per se: neither ingested audio file was playable, so the software was unable to create a combined audio derivative.

Backtrace

line 93 of [PROJECT_ROOT]/app/services/combined_audio_derivative_creator.rb: +
line 93 of [PROJECT_ROOT]/app/services/combined_audio_derivative_creator.rb: block in calculate_start_times
line 93 of [PROJECT_ROOT]/app/services/combined_audio_derivative_creator.rb: map

View full backtrace and more info at honeybadger.io
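
To illustrate how a `+` inside a `map` in a start-time calculation can blow up, here is a minimal sketch. It is not the project's actual `calculate_start_times`; the method body and the assumption that an unreadable file yields a `nil` duration are hypothetical, chosen only to match the shape of the backtrace above.

```ruby
# Hypothetical sketch, NOT the real combined_audio_derivative_creator.rb.
# Assumption: each segment has a duration in seconds, and an unreadable
# file reports its duration as nil.
def calculate_start_times(durations)
  elapsed = 0.0
  durations.map do |duration|
    start = elapsed
    # Float#+ raises TypeError when duration is nil
    elapsed = elapsed + duration
    start
  end
end

calculate_start_times([12.5, 30.0]) # each start time is the sum of earlier durations

begin
  calculate_start_times([12.5, nil])
rescue TypeError => e
  puts "failed while summing durations: #{e.message}"
end
```

In a sketch like this the error surfaces only once the durations are summed, with no hint of which file caused it.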

jrochkind commented 1 year ago

The Honeybadger error said {"job" => "#<CreateCombinedAudioDerivativesJob>"}, but frustratingly seemed to lack the actual job arguments that would tell us which oral history (OH) work this was.

(I am going to send an email to Honeybadger support asking about this.)

But Annabel figured it out from knowing what Rachel was working on: https://digital.sciencehistory.org/admin/works/mrwh24g#tab=nav-oral-histories

That work has two audio members, and one of them, malcom_s_1117_1_1.flac, was identified in our staff UI as having size 0 bytes and type application/octet-stream.

So I believe that file is corrupt (Is "corrupt" the right term when it's 0 bytes? "Missing", I guess!)

Of course, since it's not known by our system to be audio, it shouldn't have even been included in the CreateCombinedAudioDerivativesJob -- our app thinks there's only one segment. Maybe there's a bug in CreateCombinedAudioDerivativesJob if you try to create one with only one segment?

We can/should fix that if so (I think you can still create one with only one segment, e.g. to convert FLAC to the proper format for web play?).

But of course that wouldn't resolve the underlying issue, which is that one of those audio files is not present.

Still needs further debugging.

eddierubeiz commented 1 year ago

I'm going to start looking into this!

eddierubeiz commented 1 year ago

I have a pretty good handle on the problem and have created a draft PR that explains what's going on and fixes it (after a fashion).

For now, it's important to note that this was not a bug. The files were indeed corrupt, and the question is whether we can catch the problem in a more helpful and convenient way.
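
One way to catch this earlier might look like the sketch below: check every original before doing any audio work, and raise an error that names the unusable files. This is not the code in the draft PR; `AudioMember`, `UnusableAudioError`, and `check_audio_members!` are hypothetical names, and the assumption is that each member exposes a filename, byte size, and content type like those shown in the staff UI.

```ruby
# Hypothetical sketch: these names are illustrative, not the app's real API.
AudioMember = Struct.new(:filename, :content_type, :size, keyword_init: true)

class UnusableAudioError < StandardError; end

# Returns the members unchanged if all look like non-empty audio files;
# otherwise raises an error listing every file that does not.
def check_audio_members!(members)
  bad = members.reject do |m|
    m.size.to_i > 0 && m.content_type.to_s.start_with?("audio/")
  end
  return members if bad.empty?

  details = bad.map { |m| "#{m.filename} (#{m.size.to_i} bytes, #{m.content_type})" }
  raise UnusableAudioError,
        "Cannot create combined audio derivative; unusable originals: #{details.join(', ')}"
end

# Usage with made-up example files:
members = [
  AudioMember.new(filename: "good.flac", content_type: "audio/flac", size: 1_024),
  AudioMember.new(filename: "empty.flac", content_type: "application/octet-stream", size: 0)
]

begin
  check_audio_members!(members)
rescue UnusableAudioError => e
  puts e.message
end
```

Failing before any processing starts means the job stops quickly, and the error message itself says which file to re-ingest, so nobody has to work that out from who was working on what.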