Simple H.264 decoder
BSD 3-Clause "New" or "Revised" License
edge264

Minimalist software decoder for the H.264 video format.

Features

Compiling and testing

edge264 is built and tested with GNU GCC and LLVM Clang, supports 32- and 64-bit architectures, and requires 128-bit SIMD support. Processor support is currently limited to Intel x86 or x64 with at least SSSE3. GLFW3 development headers must be installed to compile edge264_play. gcc-9 is recommended since it gives the fastest code in practice. The build process outputs an object file (e.g. edge264.o), which you can then link into your own program.

$ make # automatically selects gcc-9 if available
$ ./edge264_test -d video.264 # replace -d with -b to benchmark instead of display
# optional, converts from MP4 format
$ ffmpeg -i video.mp4 -vcodec copy -bsf h264_mp4toannexb -an video.264

When debugging, the make flag TRACE=1 prints headers to stdout in HTML format, and TRACE=2 additionally dumps all other symbols to stderr (very large). The automated test program can browse the files in a given directory, decoding each <video>.264 file and comparing its output with the matching <video>.yuv file if found. On the set of AVCv1, FRExt and MVC conformance bitstreams, 109/224 files are decoded perfectly; the rest rely on features that are not yet supported.

$ ./edge264_test --help

Example code

Here is a complete example that opens an input file in Annex B format given on the command line, and dumps its decoded frames in planar YUV order to standard output. See edge264_test.c for a more complete example which displays frames.

#include <fcntl.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>

#include "edge264.h"

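int main(int argc, char *argv[]) {
    if (argc < 2)
        return 1; // expects the path of an Annex B file as first argument
    int f = open(argv[1], O_RDONLY);
    if (f < 0)
        return 1;
    struct stat st;
    fstat(f, &st);
    // map the whole file read-only so the decoder can parse it in place
    uint8_t *buf = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, f, 0);
    if (buf == MAP_FAILED)
        return 1;
    Edge264_stream *s = Edge264_alloc();
    s->CPB = buf + 3 + (buf[2] == 0); // skip the [0]001 delimiter
    s->end = buf + st.st_size;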
    int res;
    do {
        res = Edge264_decode_NAL(s);
        while (!Edge264_get_frame(s, res == -3)) { // drain remaining frames at end of buffer
            for (int y = 0; y < s->height_Y; y++)
                write(1, s->samples[0] + y * s->stride_Y, s->width_Y);
            for (int y = 0; y < s->height_C; y++)
                write(1, s->samples[1] + y * s->stride_C, s->width_C);
            for (int y = 0; y < s->height_C; y++)
                write(1, s->samples[2] + y * s->stride_C, s->width_C);
        }
    } while (res == 0 || res == -2);
    Edge264_free(&s);
    munmap(buf, st.st_size);
    close(f);
    return 0;
}
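
The example above ignores the return value of `write()`, which may write fewer bytes than requested (for instance when standard output is a pipe). A more defensive output path could look like the sketch below. `write_all` and `write_plane` are hypothetical helpers introduced here for illustration and are not part of edge264; the sketch also assumes 8-bit samples, as the example does.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper (not part of edge264): keep calling write() until
 * all `len` bytes are written. Returns 0 on success, -1 on error. */
int write_all(int fd, const uint8_t *data, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, data, len);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted before writing anything: retry */
            return -1;
        }
        data += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: write one plane row by row, skipping the padding
 * between `width` and `stride`. Assumes 8-bit samples (1 byte each). */
int write_plane(int fd, const uint8_t *plane, int width, int height, int stride)
{
    for (int y = 0; y < height; y++) {
        if (write_all(fd, plane + (ptrdiff_t)y * stride, (size_t)width) < 0)
            return -1;
    }
    return 0;
}
```

With these helpers, each of the three loops in the example would become a single call such as `write_plane(1, s->samples[0], s->width_Y, s->height_Y, s->stride_Y)`.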

API reference

**`Edge264_stream *Edge264_alloc()`** Allocate and return a decoding context that is used to pass and receive parameters. The private decoding context is actually hidden at negative offsets from the pointer returned.

typedef struct Edge264_stream {
    // These fields must be set prior to decoding.
    const uint8_t *CPB; // should always point to a NAL unit (after the 001 prefix)
    const uint8_t *end; // first byte past the end of the buffer

    // These fields will be set when returning a frame.
    const uint8_t *samples[3]; // Y/Cb/Cr planes
    const uint8_t *samples_mvc[3]; // second view
    int8_t pixel_depth_Y; // 0 for 8-bit, 1 for 16-bit
    int8_t pixel_depth_C;
    int16_t width_Y;
    int16_t width_C;
    int16_t height_Y;
    int16_t height_C;
    int16_t stride_Y;
    int16_t stride_C;
    int32_t TopFieldOrderCnt;
    int32_t BottomFieldOrderCnt;
    int16_t frame_crop_offsets[4]; // {top,right,bottom,left}, in luma samples, already included in samples[] and width/height_Y/C
} Edge264_stream;
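
As an illustration of how the output fields fit together, the sketch below copies one plane into a tightly packed buffer. It is written under assumptions that the text above does not confirm: widths and heights are taken to be in samples, strides in bytes, and a `pixel_depth` of 1 is taken to mean two bytes per sample. `PlaneView` is a hypothetical stand-in for the per-plane fields of `Edge264_stream`, so the block compiles without `edge264.h`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the fields describing one plane
 * (e.g. samples[0], pixel_depth_Y, width_Y, height_Y, stride_Y). */
typedef struct {
    const uint8_t *samples;
    int8_t pixel_depth; /* 0 for 8-bit, 1 for 16-bit */
    int16_t width;      /* assumed to be in samples */
    int16_t height;     /* in rows */
    int16_t stride;     /* assumed to be in bytes */
} PlaneView;

/* Number of bytes the plane occupies once row padding is removed. */
size_t plane_packed_size(const PlaneView *p)
{
    return ((size_t)p->width << p->pixel_depth) * (size_t)p->height;
}

/* Copy the plane into dst without row padding; dst must hold at least
 * plane_packed_size(p) bytes. Returns the number of bytes copied. */
size_t plane_pack(uint8_t *dst, const PlaneView *p)
{
    size_t row = (size_t)p->width << p->pixel_depth; /* bytes per row */
    for (int y = 0; y < p->height; y++)
        memcpy(dst + y * row, p->samples + (ptrdiff_t)y * p->stride, row);
    return row * (size_t)p->height;
}
```

Keeping the row size (`width << pixel_depth`) separate from the stride is what lets the same loop handle both 8-bit and 16-bit output.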

**`int Edge264_decode_NAL(Edge264_stream *s)`** Decode a single NAL unit, for which `s->CPB` should point to its first byte (containing `nal_unit_type`) and `s->end` should point to the first byte past the buffer. After decoding the NAL, `s->CPB` is automatically advanced past the next start code (for Annex B streams). Return codes are:

**`int Edge264_get_frame(Edge264_stream *s, int drain)`** Check the Decoded Picture Buffer for a pending displayable frame, and pass it in `s`. While reference frames may be decoded ahead of their actual display (e.g. B-Pyramid technique), all frames are buffered for reordering before being released for display:
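
To illustrate why this buffering is needed, here is a toy reorder buffer that releases frames in display order. It is only a sketch of the general idea and not edge264's Decoded Picture Buffer logic. `ReorderBuf`, `REORDER_CAP`, `reorder_push` and `reorder_pop_min` are hypothetical names, frames are reduced to a single picture order count, and the fixed capacity stands in for the stream-dependent reorder depth of a real decoder.

```c
#include <stdint.h>

#define REORDER_CAP 4 /* hypothetical reorder depth, chosen for the example */

typedef struct {
    int32_t poc[REORDER_CAP]; /* picture order counts of pending frames */
    int count;                /* number of pending frames */
} ReorderBuf;

/* Remove and return the smallest pending picture order count.
 * The caller must ensure that b->count > 0. */
int32_t reorder_pop_min(ReorderBuf *b)
{
    int best = 0;
    for (int i = 1; i < b->count; i++)
        if (b->poc[i] < b->poc[best])
            best = i;
    int32_t out = b->poc[best];
    b->poc[best] = b->poc[--b->count]; /* fill the hole with the last entry */
    return out;
}

/* Store a freshly decoded frame. If the buffer is already full, the
 * earliest pending frame is first released into *out and 1 is returned;
 * otherwise 0 is returned and *out is left untouched. */
int reorder_push(ReorderBuf *b, int32_t poc, int32_t *out)
{
    int released = 0;
    if (b->count == REORDER_CAP) {
        *out = reorder_pop_min(b);
        released = 1;
    }
    b->poc[b->count++] = poc;
    return released;
}
```

Pushing frames in a B-Pyramid decode order such as 0, 8, 4, 2, 6 and then repeatedly calling `reorder_pop_min` until the buffer is empty (the analogue of draining) releases them in increasing order.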

Return codes are:

**`void Edge264_free(Edge264_stream **s)`** Deallocate the entire decoding context, and unset the stream pointer.

**`const uint8_t *Edge264_find_start_code(int n, const uint8_t *CPB, const uint8_t *end)`** Scan memory for the next three-byte 00n pattern, returning a pointer to the first following byte (or `end` if no pattern was found).
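
The behaviour described above can be sketched with a plain byte-by-byte scan. `find_start_code_naive` is a hypothetical stand-in written for illustration, not the library's implementation, which may be organised quite differently (for example using SIMD).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Edge264_find_start_code: look for the three
 * bytes 00 00 n in [buf, end) and return a pointer to the byte right
 * after the pattern, or `end` when there is no match. */
const uint8_t *find_start_code_naive(int n, const uint8_t *buf,
                                     const uint8_t *end)
{
    for (const uint8_t *p = buf; end - p >= 3; p++) {
        if (p[0] == 0 && p[1] == 0 && p[2] == n)
            return p + 3;
    }
    return end;
}
```

Because a four-byte Annex B start code (00 00 00 01) ends with the three-byte one, scanning for the three-byte pattern with `n` equal to 1 handles both forms.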

Roadmap

Programming techniques

edge264 originated as an experiment on new programming techniques to improve performance and code simplicity over existing decoders. I presented a few of these techniques at FOSDEM'24 on 4 February 2024. Be sure to check the video!