SKKbySSK / coast_audio

Real-Time audio processing library written in Dart.
MIT License

Issues with audio data processing and conversion #17

Closed ekuleshov closed 8 months ago

ekuleshov commented 12 months ago

First of all, thank you for this amazing package. It looks very promising for decoding, playing, and processing audio data.

I'm trying to use coast_audio to do pitch detection on an audio data stream decoded from a WAV file.

There are some general issues I mentioned in #13 (processing data without playing or encoding) and getting AudioTime when processing an AudioBuffer inside ProcessorNodeMixin.process() implementation.

To do pitch detection, I used the pitch_detector_dart package, which implements the AUBIO_YIN pitch tracking algorithm ported from TarsosDSP.

I used FftNode as an example and implemented the naive PitchNode class below. I'm struggling to get the data out of the audio buffer in a format that works with the PitchDetector. This code kind of works, but I get false positive detections with frequencies over 50 kHz, which should be impossible when the sampling rate is around 48000 Hz.
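A note on the 50 kHz readings, offered as a sketch rather than a diagnosis: a signal sampled at 48000 Hz cannot carry components above the Nyquist limit of 24000 Hz, so any higher estimate is an artifact. YIN-style detectors generally compute the pitch as the sample rate divided by the detected period in samples, which means the sample rate handed to the detector scales every result. Assuming pitch_detector_dart behaves this way (an assumption based on it being a TarsosDSP port, not something confirmed in this thread), the `* 2` in the PitchNode constructor would double each reported frequency. The helper names below are hypothetical and use plain Dart only.

```dart
/// Highest frequency (in Hz) that a signal sampled at [sampleRate] Hz
/// can represent.
double nyquistLimit(int sampleRate) => sampleRate / 2;

/// Returns true when [pitchHz] is a physically possible estimate for
/// audio sampled at [sampleRate] Hz.
bool isPlausiblePitch(double pitchHz, int sampleRate) =>
    pitchHz > 0 && pitchHz <= nyquistLimit(sampleRate);

/// Pitch implied by a detected period of [lagInSamples] samples.
/// The result scales linearly with the [sampleRate] that is passed in,
/// so passing twice the real rate doubles the reported pitch.
double pitchFromLag(double sampleRate, double lagInSamples) =>
    sampleRate / lagInSamples;
```

A check such as `isPlausiblePitch(result.pitch, format.sampleRate)` could be used to discard impossible estimates while the underlying cause is being tracked down.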

Also, there should be a better way to work with the data buffer, or to use something like FrameRingBuffer as FftNode does, but I could not figure out how to do that.

@SKKbySSK I really need your advice on this. Alternatively, maybe you could incorporate the YIN algorithm into the coast_audio or coast_audio_fft packages for general use.

Thank you in advance.

class PitchNode extends AutoFormatSingleInoutNode with ProcessorNodeMixin, SyncDisposableNodeMixin {
  PitchNode({
    required this.format,
    required this.bufferSize,
    required this.onPitchDetected,
    required this.position,
  }) : _pitchDetector = PitchDetector(format.sampleRate.toDouble() * 2, bufferSize);

  final AudioFormat format;
  final int bufferSize;
  final PitchDetector _pitchDetector;
  final PitchDetectedCallback onPitchDetected;
  final double Function() position;
  final List<double> _buffer = [];

  @override
  int process(AudioBuffer buffer) {
    if (buffer.sizeInFrames == 0) {
      return buffer.sizeInFrames;
    }

    double audioTime = position();

    List<double> list = buffer.asInt32ListView().map((v) => v.toDouble()).toList();
    _buffer.addAll(list);

    bool detected = false;
    while (_buffer.length > bufferSize) {
      PitchDetectorResult result = _pitchDetector.getPitch(_buffer);
      if (result.pitched) {
        int pitch = result.pitch.toInt();
        onPitchDetected(audioTime, pitch);
        detected = true;
        break;
      }
      _buffer.removeRange(0, bufferSize);
    }

    if (detected) {
      while (_buffer.length > bufferSize) {
        _buffer.removeRange(0, bufferSize);
      }
    }

    return buffer.sizeInFrames;
  }  
  ...

SKKbySSK commented 11 months ago

I don't know about PitchDetector algorithm. But I can provide some hints.

ekuleshov commented 11 months ago

> I don't know about PitchDetector algorithm. But I can provide some hints.
>
>   • Is your bufferSize enough for detecting pitch?

How can I control the size of the buffer that is passed to a ProcessorNodeMixin.process() implementation? Or is each processor supposed to maintain its own buffer? If so, could you give an example of how to feed such a buffer with the data passed to the process() method?
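One way a processor could maintain its own buffer is sketched below, under the assumption that the chunk size arriving in process() is not under the node's control. The `WindowAccumulator` class is hypothetical, is not part of coast_audio, and uses plain Dart lists instead of FrameRingBuffer. It collects incoming samples and hands back only complete windows of exactly `windowSize` samples.

```dart
/// Collects samples that arrive in chunks of arbitrary size and hands out
/// fixed-size analysis windows. Hypothetical helper, not a coast_audio API.
class WindowAccumulator {
  WindowAccumulator(this.windowSize) : assert(windowSize > 0);

  /// Number of samples in each window handed to the analyser.
  final int windowSize;

  final List<double> _pending = [];

  /// Appends [chunk] and returns every complete window that is now
  /// available, oldest first. Leftover samples stay queued for later calls.
  List<List<double>> add(Iterable<double> chunk) {
    _pending.addAll(chunk);
    final windows = <List<double>>[];
    while (_pending.length >= windowSize) {
      // Copy exactly one window, then drop it from the queue.
      windows.add(_pending.sublist(0, windowSize));
      _pending.removeRange(0, windowSize);
    }
    return windows;
  }

  /// Number of samples still waiting for the next window to fill.
  int get pendingLength => _pending.length;
}
```

Inside process(), each returned window could then be passed to the detector, so that it always sees exactly `bufferSize` samples. This differs from the PitchNode code above in two ways: that code passes the whole `_buffer` to getPitch regardless of its length, and its `>` comparison skips a buffer that holds exactly one window.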

>   • Does the PitchDetector support multi-channel audio data?

If I understand correctly, in multi-channel buffers (e.g. 2 channels) the samples are interleaved, so every second value belongs to the other channel. I tried feeding data from only one channel and still could not get reliable output.
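For reference, the interleaved layout described above can be handled as in the following sketch. It assumes the samples have already been read using a view that matches the buffer's actual sample format. Reading float or 16-bit data through an Int32 view, as asInt32ListView() would do if the format is not 32-bit integer, produces meaningless values. All function names here are hypothetical and use plain Dart only.

```dart
/// Picks a single channel out of interleaved samples
/// (layout for 2 channels: L0, R0, L1, R1, ...).
List<double> extractChannel(List<double> interleaved, int channels,
    [int channel = 0]) {
  final frames = interleaved.length ~/ channels;
  return List<double>.generate(
      frames, (i) => interleaved[i * channels + channel]);
}

/// Averages all channels of each frame into one mono sample.
List<double> downmixToMono(List<double> interleaved, int channels) {
  final frames = interleaved.length ~/ channels;
  return List<double>.generate(frames, (i) {
    var sum = 0.0;
    for (var c = 0; c < channels; c++) {
      sum += interleaved[i * channels + c];
    }
    return sum / channels;
  });
}

/// Converts signed 16-bit PCM values to doubles in the range [-1.0, 1.0).
List<double> int16ToDouble(List<int> samples) =>
    [for (final s in samples) s / 32768.0];
```

A mono signal produced this way contains one value per frame, so the detector should be given the real sample rate of the stream, not a rate multiplied by the channel count.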

github-actions[bot] commented 8 months ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 8 months ago

This issue was closed because it has been inactive for 7 days since being marked as stale.