andygrundman / Audio-Scan

Audio::Scan - Fast Perl XS metadata and tag reader for all common audio file formats
GNU General Public License v2.0
4 stars 13 forks source link

always add total_samples found in streaminfo #10

Closed philippe44 closed 8 months ago

philippe44 commented 3 years ago

A flac header (which is optional, as it is a streamable format) contains a "total_sample" entry which can be used to calculate duration. When unknown (which is legitimate), that entry must be set to 0

Audio::Scan, when that entry is 0, seeks to the end of the file to read samples counts but the problem is that if only a part of the file has been given (e.g. just read the header of an online file in a tmp resource), then samples count estimation is wrong and the caller has no idea if the total_sample can be used or not.

The idea is to add a field that always contains the untouched total_samples value from the streaminfo header

DrZing-dev commented 3 years ago

Sorry for not responding to this.

I'm not sure I understand the use case or the patch... your change simply writes the same value twice:

  my_hv_store( flac->info, "total_samples", newSVnv(flac->total_samples) );
  my_hv_store( flac->info, "total_samples_streaminfo", newSVnv(flac->total_samples) );

BTW, streaminfo and total_samples are required to be present in all valid FLAC files so I think you can generally rely on it. https://xiph.org/flac/format.html#metadata_block_streaminfo

FLAC defines several types of metadata blocks (see the format page for the complete list). Metadata blocks can be any length and new ones can be defined. A decoder is allowed to skip any metadata types it does not understand. Only one is mandatory: the STREAMINFO block. This block has information like the sample rate, number of channels, etc., and data that can help the decoder manage its buffers, like the minimum and maximum data rate and minimum and maximum block size.

andygrundman commented 3 years ago

Oops, was logged in on my other account, that's me. :)

philippe44 commented 3 years ago

Oops, was logged in on my other account, that's me. :)

Re total_samples, the field is required but can be zero when unknown "Total samples in stream. 'Samples' means inter-channel sample, i.e. one second of 44.1Khz audio will have 44100 samples regardless of the number of channels. A value of zero here means the number of total samples is unknown."

The issue is that when total_samples is 0, then Audio::Scan scans to the end of the file, read the last sample index and fills the header with that. See that code in flac_parse.

if (song_length_ms > 0) {
    my_hv_store( info, "bitrate", newSVuv( _bitrate(flac->file_size - flac->audio_offset, song_length_ms) ) );
  }
  else {
    if (!seeking) {
      // Find the first/last frames and manually calculate duration and bitrate
      off_t frame_offset;
      uint64_t first_sample;
      uint64_t last_sample;
      uint64_t tmp;

      DEBUG_TRACE("Manually determining duration/bitrate\n");

      Newz(0, flac->scratch, sizeof(Buffer), Buffer);

      if ( _flac_first_last_sample(flac, flac->audio_offset, &frame_offset, &first_sample, &tmp, 0) ) {
        DEBUG_TRACE("  First sample: %llu (offset %llu)\n", first_sample, frame_offset);

        // XXX This last sample isn't really correct, seeking back max_framesize will most likely be several frames
        // from the end, resulting in a slightly shortened duration. Reading backwards through the file
        // would provide a more accurate result
        if ( _flac_first_last_sample(flac, flac->file_size - flac->max_framesize, &frame_offset, &tmp, &last_sample, 0) ) {
          if (flac->samplerate) {
            song_length_ms = (uint32_t)(( ((last_sample - first_sample) * 1.0) / flac->samplerate) * 1000);
            my_hv_store( info, "song_length_ms", newSVuv(song_length_ms) );
            my_hv_store( info, "bitrate", newSVuv( _bitrate(flac->file_size - flac->audio_offset, song_length_ms) ) );
            my_hv_store( info, "total_samples", newSVuv( last_sample - first_sample ) );
          }

          DEBUG_TRACE("  Last sample: %llu (offset %llu)\n", last_sample, frame_offset);
        }
      }

      buffer_free(flac->scratch);
      Safefree(flac->scratch);
    }
  }

In case of streaming, when files have the total_samples set to 0 and we only pass a small chunk of data for Audio::Scan to analyze streaminfo, it returns an incorrect total_samples (and duration) pder code above. That's the reason why I copied the original total_samples twice so that when that later code adjust it, we still have the original value that is likely 0. Then, the caller can make an informed decision.

I can make that PR smarter by just adding the total_samples_streaminfo only when the original was zero or add a flag, something like added in the code above

my_hv_store( info, "length_estimated", newSVuv(1) );