andygrundman / Audio-Scan

Audio::Scan - Fast Perl XS metadata and tag reader for all common audio file formats
GNU General Public License v2.0

wrong duration for wav files > 16 bit #2

Closed marcoc1712 closed 7 years ago

marcoc1712 commented 7 years ago

Hi,

I've found that the Audio::Scan module in LMS 7.9.1 (Audio::Scan 0.9.5) reports a wrong duration for WAV files with a precision greater than 16 bits, whereas sox, for example, reports the correct value.

The problem is that tracks are stored with this wrong duration, so when they are played the progress bar is messed up and some controls, like 'jump to time', do not work properly in plugins.

This is an example with a 32-bit, 384000 Hz WAV file.

Look at song_length_ms vs. Duration.

AudioScan: {
  info => {
            audio_offset    => 80,
            audio_size      => 397516800,
            bitrate         => 24576000,
            bits_per_sample => 32,
            block_align     => 8,
            channels        => 2,
            file_size       => 397516880,
            format          => 65534,
            jenkins_hash    => 643161139,
            samplerate      => 384000,
            song_length_ms  => 6367,
          },
  tags => {},
} at F:\SVILUPPO\AudioScan\AudioScan.pl line 78.
"F:/Sviluppo/slimserver/Plugins/C3PO/Bin/MSWin32-x86-multi-thread/sox.exe --i \"F:\\SVILUPPO\\01 - SqueezeboxServer Plugins\\musica campione\\wav_32_384000.wav\""
0
(
  "\n",
  "Input File     : 'F:\\SVILUPPO\\01 - SqueezeboxServer Plugins\\musica campione\\wav_32_384000.wav'\n",
  "Channels       : 2\n",
  "Sample Rate    : 384000\n",
  "Precision      : 32-bit\n",
  "Duration       : 00:02:09.40 = 49689600 samples ~ 9705 CDDA sectors\n",
  "File Size      : 398M\n",
  "Bit Rate       : 24.6M\n",
  "Sample Encoding: 32-bit Signed Integer PCM\n",
  "\n",
)

Here is a table with results for different versions of the same file (always 129.4 seconds long).

File            Offset  Size (bytes)  Ch  Samplerate  Bits  Secs
------------------------------------------------------------------
wav 16 192000   44      99379200      2   192000      16    129.4
wav 16 44100    44      22826160      2   44100       16    129.4
wav 16 96000    44      49689600      2   96000       16    129.4
wav 24 192000   80      149068800     2   192000      24    17.551
wav 24 384000   80      298137600     2   384000      24    6.367
wav 32 384000   80      397516800     2   384000      32    6.367

andygrundman commented 7 years ago

Using Audacity, I took a 24-96 FLAC and resampled it to several different formats. All of them retained the exact same duration value. I also used ffprobe to double-check things. Can you send me some sample files? Maybe there's something else going on. I wonder if it's sox-related, because I didn't try with that. BTW I'm using version 0.96, although there shouldn't be any WAV-related changes in that vs 0.95.

File                            Bits  Samplerate    s_l_ms    ffprobe info
----------------------------------------------------------------------------------------------
24-96.flac (source)             24      96          224607    00:03:44.61, bitrate: 2614 kb/s
24-96.wav  (flac -d)            24      96          224607    00:03:44.61, bitrate: 4608 kb/s
32-96.wav  (Audacity export)    32      96          224607    00:03:44.61, bitrate: 6144 kb/s
32-192.wav (Audacity resample)  32      192         224607    00:03:44.61, bitrate: 12288 kb/s
32-384.wav (Audacity resample)  32      384         224607    00:03:44.61, bitrate: 24576 kb/s
marcoc1712 commented 7 years ago

Here are some shorter examples (20 seconds each): http://www.marcoc1712.it/downloads/20_Sec.rar

They were upsampled with sox from the 'original' wav_16_044100.wav, which was produced from a longer FLAC file using FLAC, not sox.

Note that only the 32_384000 file is wrong here (8.85 secs), and I've found that shorter files are always correct (I started by producing a 2-second file), so... size matters! (At least for Audio::Scan.)

One difference from your trials is that you started from a hi-res file.

Audio::Scan looks for a strange 'fact' in the header. I don't know what it is, but without that test it always reports the correct value.

andygrundman commented 7 years ago

Thanks, I can reproduce it now.

andygrundman commented 7 years ago

Hi Marco, the problem was due to the math being done on the number of samples stored in the "fact" chunk. If num_samples*1000 exceeded 32 bits, it would overflow and give you a shorter song_length_ms value. The fact chunk is only used in files where the format is set to non-PCM. The files I made in Audacity were format=PCM even though according to the spec, anything > 16-bit is supposed to use WAVE_FORMAT_EXTENSIBLE (format=0xFFFE) with subtype PCM.
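
The overflow condition above also accounts for the earlier observation that short files were always correct. The following is a minimal Python sketch, not code from Audio::Scan: it assumes the product num_samples*1000 was held in an unsigned 32-bit integer, and the helper name `max_safe_seconds` is hypothetical. It computes the longest duration for which that product still fits.

```python
UINT32_MAX = 2**32 - 1  # largest value an unsigned 32-bit integer can hold


def max_safe_seconds(samplerate: int) -> float:
    """Longest duration (seconds) for which num_samples * 1000 fits in 32 bits.

    Assumption: the intermediate product is stored as an unsigned 32-bit value.
    """
    max_samples = UINT32_MAX // 1000  # 4294967 samples, whatever the samplerate
    return max_samples / samplerate


if __name__ == "__main__":
    for rate in (44100, 96000, 192000, 384000):
        print(f"{rate:>6} Hz: overflow above {max_safe_seconds(rate):.2f} s")
```

At 384000 Hz the limit works out to roughly 11 seconds, which fits with a 2-second file being reported correctly and a 20-second file being reported wrongly.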

marcoc1712 commented 7 years ago

Hi Andy, thanks for your answer.

From what you say, I could not understand whether sox is wrong, producing WAV files with a wrong 'fact' chunk, whether you are going to change the math in Audio::Scan, or maybe both.

As far as I know, WAV is supposed to be PCM, and fact-ck is optional, used only when the "wave data" is not "wave.ck" but "wav-list", meaning you have different 'blocks' of data instead of just one big one, which is not the case here. Apart from this, fact-ck is also mandatory with 'compressed' PCM, but is WAV supposed to store compressed data too?

andygrundman commented 7 years ago

sox is working properly, and Audacity is too. I know I said non-PCM which is confusing, but what I really meant was that the header is just structured in a different way. You can have a "basic" header that includes the number of channels, samplerate, and bit depth, or an extended header that also includes the ordering of channels and a place to enter the exact number of samples. For files with that extra data (sox creates these), A::S uses it for the duration calculation. In the basic form (Audacity's), it's calculated from the size of the data divided by the bitrate.
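
The two duration calculations described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the Audio::Scan source: the function names are hypothetical, and the input values are taken from the dump and the sox output in the first comment of this thread.

```python
def duration_ms_from_data_size(audio_size: int, bitrate: int) -> int:
    """Basic header: duration from the data size (bytes) and the bitrate (bits/s)."""
    return audio_size * 8 * 1000 // bitrate


def duration_ms_from_num_samples(num_samples: int, samplerate: int) -> int:
    """Extended header: duration from the stored sample count (per channel)."""
    return num_samples * 1000 // samplerate


if __name__ == "__main__":
    # Values for the 32-bit / 384000 Hz file from the first comment.
    print(duration_ms_from_data_size(397516800, 24576000))
    print(duration_ms_from_num_samples(49689600, 384000))
```

Python integers have arbitrary precision, so both functions give 129400 ms here, matching the 129.4 seconds reported by sox. The wrong value only appears when the intermediate product is limited to 32 bits.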

The extra num_samples value for your test file was 7680000 (20 seconds at 384k), and the math for the duration is (7680000 * 1000) / 384000. The bug was that the top value here requires 33 bits to represent, but was being done in a 32-bit variable. The number gets truncated to a smaller number, leading to the wrong duration.
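
The truncation can be reproduced with a short Python sketch. It is an illustration only: Python integers do not overflow, so the 32-bit variable is simulated with a mask, on the assumption that the variable was unsigned. The function names are hypothetical, and widening the arithmetic is shown as one possible way to avoid the problem, not necessarily what the released fix does.

```python
MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits, like an unsigned 32-bit variable


def song_length_ms_32bit(num_samples: int, samplerate: int) -> int:
    """Duration in ms with the product truncated to 32 bits (the buggy behaviour)."""
    product = (num_samples * 1000) & MASK32
    return product // samplerate


def song_length_ms_wide(num_samples: int, samplerate: int) -> int:
    """Duration in ms with the product kept at full width."""
    return (num_samples * 1000) // samplerate


if __name__ == "__main__":
    # 20-second test file: 7680000 samples at 384000 Hz.
    print(song_length_ms_32bit(7680000, 384000), song_length_ms_wide(7680000, 384000))
    # Original report: 49689600 samples at 384000 Hz.
    print(song_length_ms_32bit(49689600, 384000), song_length_ms_wide(49689600, 384000))
```

For the 20-second file the truncated calculation gives 8815 ms instead of 20000 ms. For the file in the original report it gives 6367 ms instead of 129400 ms, which is exactly the song_length_ms value shown in the first comment.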

I just pushed out version 0.97 to CPAN, so let me know if this works for you.

marcoc1712 commented 7 years ago

Thank you, now I understand better. I'll give A::S a try in my installation, then I'll report back to you and ask Michael to upgrade LMS to this version.

Many thanks for your time.
