eshaz / codec-parser

Browser and NodeJS library that parses audio data into frames containing frame data, header values, duration, and other information.
GNU Lesser General Public License v3.0

Codec Parser

codec-parser is a JavaScript library that parses raw data from audio codecs into frames containing data, header values, duration, and other information.

Supports:

Demo

The demo for icecast-metadata-js uses this library to allow for playback of streaming audio. codec-parser is used by mse-audio-wrapper to wrap streaming audio in ISOBMFF or WEBM so it can be played back using the MediaSource API.

View the live demo here!


Installing

Install via NPM

Usage

  1. Create a new instance of CodecParser by passing in the mimetype of your audio data along with the options object.

    Note: When reading directly from an HTTP response, use the mimetype contained in the Content-Type header.

    import CodecParser from "codec-parser";
    
    const mimeType = "audio/mpeg";
    const options = {
        onCodec: () => {},
        onCodecUpdate: () => {},
        enableLogging: true
    };
    
    const parser = new CodecParser(mimeType, options);
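The note above suggests taking the mimetype from the Content-Type header. A Content-Type value can carry parameters after a semicolon (for example `audio/ogg; codecs=opus`), so a minimal sketch like the following strips them first. `mimeTypeFromContentType` is a hypothetical helper, not part of codec-parser, and whether CodecParser itself tolerates such parameters is not stated here.

```javascript
// Hypothetical helper (not part of codec-parser): reduce a Content-Type
// header value to its bare "type/subtype" form.
function mimeTypeFromContentType(contentType) {
  if (typeof contentType !== "string") return "";
  // Drop any parameters such as "; codecs=opus" and normalize the case.
  return contentType.split(";")[0].trim().toLowerCase();
}

// Possible use with a fetch() Response (CodecParser and options as above):
// const mimeType = mimeTypeFromContentType(response.headers.get("content-type"));
// const parser = new CodecParser(mimeType, options);
```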

Parsing an entire file

  1. To parse an entire audio file, pass a Uint8Array containing the whole file to the instance's .parseAll() method. This method reads all of the data and returns an array of CodecFrames or OggPages.

    const frames = parser.parseAll(audioData);
    
    // Do something with the frames
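As one idea for "doing something with the frames", the sketch below totals the size and length of a parsed file. `summarizeFrames` is a hypothetical helper, and it assumes each element is a CodecFrame carrying the `data`, `samples`, and `duration` fields shown in the Data Types examples; it is not written for OggPage results.

```javascript
// Hypothetical helper: summarize an array of CodecFrame-like objects.
// It relies only on the `data`, `samples`, and `duration` fields.
function summarizeFrames(frames) {
  let totalBytes = 0;
  let totalSamples = 0;
  let totalDuration = 0;

  for (const frame of frames) {
    totalBytes += frame.data.length;
    totalSamples += frame.samples;
    totalDuration += frame.duration;
  }

  return { frameCount: frames.length, totalBytes, totalSamples, totalDuration };
}

// Possible use: const summary = summarizeFrames(parser.parseAll(audioData));
```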

Parsing chunks of audio

  1. To begin processing chunks of audio data, pass a Uint8Array of audio data to the instance's .parseChunk() method. This method returns an iterator that can be consumed using a for...of or for await...of loop.

    for (const frame of parser.parseChunk(audioData)) {
      // Do something with each frame
    }

    or

    const frames = [...parser.parseChunk(audioData)]

    CodecParser will read the passed-in data and attempt to parse audio frames according to the passed-in mimeType. Any partial data will be stored until enough data has been passed in to form a complete frame. Iterations will begin to return frames once at least two consecutive frames have been detected in the passed-in data.

    Note: Any data that does not conform to the instance's mimetype will be discarded.

    Example:

    • 1st .parseChunk() call
      • Input
        [MPEG frame 0 (partial)],
        [MPEG frame 1 (partial)], 
      • Output (no iterations)
        (none)
      • Frame 0 is dropped since it doesn't start with a valid header.
      • Frame 1 is parsed and stored internally until enough data is passed in to properly sync.
    • 2nd .parseChunk() call
      • Input
        [MPEG frame 1 (partial)], 
        [MPEG frame 2 (partial)]
      • Output (1 iteration)
        MPEG Frame 1 {
        data,
        header
        ...
        }
      • Frame 1 is joined with the partial data and returned since it was immediately followed by Frame 2.
      • Frame 2 is stored internally as partial data.
    • 3rd .parseChunk() call
      • Input
        [MPEG frame 2 (partial)],
        [MPEG frame 3 (full)], 
        [MPEG frame 4 (partial)]
      • Output (2 iterations)
        MPEG Frame 2 {
        data,
        header
        ...
        }
        MPEG Frame 3 {
        data,
        header
        ...
        }
      • Frame 2 is joined with the partial data and returned since it was immediately followed by Frame 3.
      • Frame 3 is returned since it was immediately followed by Frame 4.
      • Frame 4 is stored internally as partial data.
  2. When you have come to the end of the stream or file, you may call the instance's flush() method to return another iterator that yields any remaining buffered frames. Calling flush() resets the internal state of the CodecParser instance, so you may re-use the instance to parse additional streams.

    for (const frame of parser.flush()) {
      // Do something with the buffered frames
    }

    or

    const frames = [...parser.flush()]
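The two steps above (parse each chunk, then flush at the end) can be combined into a single generator. This is a minimal sketch: `parseChunks` is a hypothetical helper, and `parser` is assumed to be any object with the parseChunk() and flush() methods described above.

```javascript
// Hypothetical helper: feed an iterable of Uint8Array chunks through a
// parser-like object and yield every frame, including the frames still
// buffered when the input ends.
function* parseChunks(parser, chunks) {
  for (const chunk of chunks) {
    // Yield whatever complete frames this chunk produced (possibly none).
    yield* parser.parseChunk(chunk);
  }
  // End of input: drain whatever the parser is still holding.
  yield* parser.flush();
}

// Possible use:
// for (const frame of parseChunks(parser, listOfChunks)) { /* ... */ }
```

Because the helper is itself a generator, frames are produced lazily, so a caller can stop iterating early without parsing the remaining chunks.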

Instantiation

const parser = new CodecParser("audio/mpeg", options);

Methods

Properties

Data Types

Depending on the mimetype, each iteration of CodecParser.parseChunk() will return a single CodecFrame or a single OggPage.

OggPage

OggPage describes a single ogg page. An OggPage may contain zero to many CodecFrame objects. OggPage will be returned when the mimetype is audio/ogg or application/ogg.
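Code that handles both container and non-container input may want to treat every result as a list of CodecFrames. The sketch below is one way to do that. `toCodecFrames` is a hypothetical helper, and the property name `codecFrames` is an assumption: this document does not say how an OggPage exposes its frames.

```javascript
// Hypothetical helper: return the CodecFrames for one parser result.
// ASSUMPTION: an OggPage exposes its frames as an array named `codecFrames`.
// Anything without that array is treated as a single CodecFrame.
function toCodecFrames(item) {
  return Array.isArray(item.codecFrames) ? item.codecFrames : [item];
}

// Possible use:
// for (const item of parser.parseChunk(audioData)) {
//   for (const frame of toCodecFrames(item)) { /* ... */ }
// }
```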

CodecFrame

CodecFrame describes a single frame for an audio codec. CodecFrame will be returned when the mimetype describes audio that is not encapsulated within a container, e.g. audio/mpeg, audio/aac, or audio/flac.

Example

// First CodecFrame
MPEGFrame {
  data: Uint8Array(417),
  header: MPEGHeader {
    bitDepth: 16,
    channels: 2,
    sampleRate: 44100,
    bitrate: 128,
    channelMode: "joint stereo",
    emphasis: "none",
    framePadding: 1,
    isCopyrighted: false,
    isOriginal: true,
    isPrivate: false,
    layer: "Layer III",
    modeExtension: "Intensity stereo off, MS stereo on",
    mpegVersion: "MPEG Version 1 (ISO/IEC 11172-3)",
    protection: "none"
  },
  crc32: 275944052,
  samples: 1152,
  duration: 26.122448979591837,
  frameNumber: 0,
  totalBytesOut: 0,
  totalSamples: 0,
  totalDuration: 0
}

// Second CodecFrame
MPEGFrame {
  data: Uint8Array(416),
  header: MPEGHeader {
    bitDepth: 16,
    channels: 2,
    sampleRate: 44100,
    bitrate: 128,
    channelMode: "joint stereo",
    emphasis: "none",
    framePadding: 0,
    isCopyrighted: false,
    isOriginal: true,
    isPrivate: false,
    layer: "Layer III",
    modeExtension: "Intensity stereo off, MS stereo on",
    mpegVersion: "MPEG Version 1 (ISO/IEC 11172-3)",
    protection: "none"
  },
  crc32: 1336875295,
  samples: 1152,
  duration: 26.122448979591837,
  frameNumber: 1,
  totalBytesOut: 417,
  totalSamples: 1152,
  totalDuration: 26.122448979591837
}
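In the two example frames, the `duration` value is consistent with 1152 samples at 44100 Hz expressed in milliseconds, and the `total*` fields appear to count everything emitted before the current frame. The sketch below recomputes both under that reading. `frameDurationMs` and `runningTotals` are hypothetical helpers, not library functions.

```javascript
// Duration of one frame in milliseconds, derived from its sample count.
function frameDurationMs(samples, sampleRate) {
  return (samples / sampleRate) * 1000;
}

// Hypothetical helper: recompute the running totals for a list of frames.
// Each total covers the frames *before* the current one, so the first
// entry is all zeros.
function runningTotals(frames) {
  let totalBytesOut = 0;
  let totalSamples = 0;
  let totalDuration = 0;

  return frames.map((frame, frameNumber) => {
    const entry = { frameNumber, totalBytesOut, totalSamples, totalDuration };
    totalBytesOut += frame.data.length;
    totalSamples += frame.samples;
    totalDuration += frame.duration;
    return entry;
  });
}
```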

CodecHeader

Each codec has its own CodecHeader data type. See each class below for documentation on each codec-specific header.

MPEGHeader

Documentation

{
  bitDepth: 16,
  bitrate: 192,
  channels: 2,
  sampleRate: 44100,
  channelMode: "joint stereo",
  emphasis: "none",
  framePadding: 1,
  isCopyrighted: false,
  isOriginal: false,
  isPrivate: false,
  layer: "Layer III",
  modeExtension: "Intensity stereo off, MS stereo on",
  mpegVersion: "MPEG Version 1 (ISO/IEC 11172-3)",
  protection: "16bit CRC"
}

AACHeader

Documentation

{
  bitDepth: 16,
  bitrate: 312,
  channels: 2,
  sampleRate: 44100,
  copyrightId: false,
  copyrightIdStart: false,
  channelMode: "stereo (left, right)",
  bufferFullness: "VBR",
  isHome: false,
  isOriginal: false,
  isPrivate: false,
  layer: "valid",
  length: 7,
  mpegVersion: "MPEG-4",
  numberAACFrames: 0,
  profile: "AAC LC (Low Complexity)",
  protection: "none"
}

FLACHeader

Documentation

{
  bitDepth: 16,
  bitrate: 400,
  channels: 2,
  sampleRate: 44100,
  channelMode: "stereo (left, right)",
  blockingStrategy: "Fixed",
  blockSize: 4096,
  frameNumber: 15183508,
  crc16: 56624,
  streamInfo: Uint8Array
}

OpusHeader

Documentation

{
  bitDepth: 16,
  bitrate: 192,
  channels: 2,
  data: Uint8Array,
  sampleRate: 48000,
  bandwidth: "fullband",
  channelMappingFamily: 1,
  channelMappingTable: [0, 1],
  coupledStreamCount: 1,
  streamCount: 1,
  channelMode: "stereo (left, right)",
  frameCount: 1,
  frameSize: 20,
  inputSampleRate: 48000,
  mode: "CELT-only",
  outputGain: 0,
  preSkip: 312
}

VorbisHeader

Documentation

{
  bitDepth: 32,
  bitrate: 272,
  channels: 2,
  channelMode: "stereo (left, right)",
  sampleRate: 44100,
  bitrateMaximum: 0,
  bitrateMinimum: 0,
  bitrateNominal: 320000,
  blocksize0: 256,
  blocksize1: 2048,
  data: Uint8Array,
  vorbisComments: Uint8Array,
  vorbisSetup: Uint8Array
}
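All five header examples above include `bitDepth`, `bitrate`, `channels`, and `sampleRate`, so codec-independent code can rely on those four fields. The sketch below formats them into one line. `describeHeader` is a hypothetical helper, and reading `bitrate` as kbps is an assumption, since the examples do not state a unit.

```javascript
// Hypothetical helper: one-line summary built from the four fields that
// every header example in this document includes.
// ASSUMPTION: `bitrate` is in kbps; the examples do not state the unit.
function describeHeader(header) {
  const { sampleRate, channels, bitDepth, bitrate } = header;
  return `${sampleRate} Hz, ${channels} ch, ${bitDepth}-bit, ${bitrate} kbps`;
}
```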