freelawproject / doctor

A microservice for document conversion at scale
https://free.law/projects/doctor
BSD 2-Clause "Simplified" License

`fetch_audio_duration` is not working properly #192

Open grossir opened 3 months ago

grossir commented 3 months ago

I am copying this graph from freelawproject/courtlistener#440. It shows that files with the same reported duration have different file sizes when we query the actual bucket and check the length of the downloaded bytes.

[image]

Another, more colorful graph that takes the year from `date_created` shows that the problem runs from late 2019 to the present.

[image]

Examples of wrong and correct durations:

Code that needs correcting: https://github.com/freelawproject/doctor/blob/4009f00f08c5c98dcc797da69320e4dc635b3372/doctor/views.py#L354
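The size-versus-duration comparison described above can be turned into a rough sanity check. This is only a sketch, not code from doctor: the function names, the expected bitrate, and the tolerance are all hypothetical, and it assumes roughly constant-bitrate audio while ignoring tag and header overhead.

```python
def implied_bitrate_kbps(size_bytes: int, duration_secs: float) -> float:
    """Average bitrate (kbps) implied by a file's size and reported duration."""
    if duration_secs <= 0:
        raise ValueError("duration must be positive")
    return size_bytes * 8 / duration_secs / 1000


def is_duration_suspicious(
    size_bytes: int,
    duration_secs: float,
    expected_kbps: float,
    tolerance: float = 0.25,
) -> bool:
    """Flag a file whose implied bitrate is far from the expected one.

    expected_kbps and tolerance are assumptions chosen by the caller;
    nothing in the issue says which bitrate the files are encoded at.
    """
    actual = implied_bitrate_kbps(size_bytes, duration_secs)
    return abs(actual - expected_kbps) / expected_kbps > tolerance


# A 960,000-byte file reported as 60 s implies 128 kbps.
print(implied_bitrate_kbps(960_000, 60))
# The same file reported as 30 s implies 256 kbps and gets flagged.
print(is_duration_suspicious(960_000, 30, expected_kbps=128))
```

Running a check like this over the stored size and duration of each file would separate records whose duration is plausible from those where it is not, without downloading or decoding the audio.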

mlissner commented 3 months ago

Delightful. I wonder what happened in late 2019? Maybe a new version of eyed3? Hm. I'm not sure I really care about this being wrong, because it doesn't really affect much, but it would be nice if it were more accurate.