Closed pkarman closed 9 years ago
Passing in the format is one way to address part of the problem.
The crux is that soxi, used often in audio_monster, won't return good info if a file has no extension (thanks soxi :unamused:).
Rather than passing it in, we can detect the file type using:
file --brief --mime-type some_local_file
or we can use the mimemagic gem, which does something similar (though I think file is probably better).
I use mimemagic in the speechmatics gem, and it has worked so far.
Given the type, we can set the extension of the created tempfile to use that, and everything will work.
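A minimal sketch of that idea in Ruby, shelling out to the file command quoted above. The helper names and the MIME_EXTENSIONS table are hypothetical, not part of audio_monster or fixer, and the exact MIME strings file reports can vary between versions, so treat the table as an assumption to adjust.

```ruby
require "open3"
require "tempfile"

# Hypothetical lookup table (an assumption, not from audio_monster);
# extend it for the formats you actually handle.
MIME_EXTENSIONS = {
  "audio/mpeg"  => ".mp3",
  "audio/x-wav" => ".wav",
  "audio/ogg"   => ".ogg",
  "audio/flac"  => ".flac",
}.freeze

# Ask file(1) for the MIME type of a local file.
def detect_mime_type(path)
  out, status = Open3.capture2("file", "--brief", "--mime-type", path)
  raise "file(1) failed for #{path}" unless status.success?
  out.strip
end

# Map a MIME type to an extension; returns nil when the type is unknown.
def extension_for_mime(mime)
  MIME_EXTENSIONS[mime]
end

# Copy the source into a tempfile whose name carries the detected
# extension, so extension-sensitive tools can work with it.
def tempfile_with_extension(source_path)
  ext = extension_for_mime(detect_mime_type(source_path)) || ""
  tmp = Tempfile.new(["audio", ext])
  tmp.binmode
  File.open(source_path, "rb") { |src| IO.copy_stream(src, tmp) }
  tmp.flush
  tmp
end
```

Calling tempfile_with_extension("some_local_file") would then hand back a tempfile whose path ends in the detected extension when the type is in the table, and a plain tempfile otherwise.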
Another option is to convert the soxi-dependent methods in audio_monster to use ffmpeg, which does a better job of detecting type w/o file extension.
Those all sound like good things to do, in addition to supporting explicitly passing in the format. +1
It seems like the best option may be ffprobe, which can spit out nice JSON; from each stream's codec_name and from format.format_name you can figure out pretty well what the file should be.
ffprobe -v quiet -print_format json -show_format -show_streams test_file_no_extension
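As a rough sketch, the JSON from that command could be consumed in Ruby along these lines. The method names probe and summarize_probe are hypothetical and not taken from audio_monster; the keys read here (streams, codec_type, codec_name, format.format_name) are fields ffprobe emits with these flags, and format_name may be a comma-separated list for some containers.

```ruby
require "json"
require "open3"

# Run ffprobe with the flags quoted above and return the parsed JSON.
def probe(path)
  out, _err, status = Open3.capture3(
    "ffprobe", "-v", "quiet", "-print_format", "json",
    "-show_format", "-show_streams", path
  )
  raise "ffprobe failed for #{path}" unless status.success?
  JSON.parse(out)
end

# Pull out the container format name(s) and the codec of the first
# audio stream. Missing keys yield an empty list / nil, not an error.
def summarize_probe(info)
  streams = info.fetch("streams", [])
  audio = streams.find { |s| s["codec_type"] == "audio" }
  {
    format_names: info.dig("format", "format_name").to_s.split(","),
    audio_codec: audio && audio["codec_name"],
  }
end
```

Keeping the parsing separate from the ffprobe call means the format-guessing logic can be exercised with a plain hash, without needing ffprobe installed.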
I have audio monster updated so it doesn't care about file extension anymore, and it is now relying on ffprobe instead of soxi. I also added better format detection to audio monster, so at least for audio it can be smarter about detecting the format regardless of file name.
Next up is updating fixer to use the latest audio monster, and to handle the original format when it is passed in via API call, or to detect it better when it isn't.
This was fixed by #39
The BaseProcessor class supports an original_format attribute which seems to allow for explicitly passing in the original file content type, especially when it cannot be determined from the URL. However, there does not seem to be a way to set that attribute when POSTing a new job. I would expect the Job API to accept an 'original_format' attribute to mirror the 'original' attribute.