xiph / opusfile

Stand-alone decoder library for .opus streams
BSD 3-Clause "New" or "Revised" License

op_open_file fails to open Opus files with the mapping family of 255 #23

Open polariton opened 3 years ago

polariton commented 3 years ago

Dear Sir/Madam,

The function op_open_file from opusfile version 0.12 fails to open Opus files whose mapping family is 255, which includes files with more than 8 channels (where mapping family 255 is the default). It returns OP_EIMPL ("feature is not implemented"). Files with up to 8 channels and mapping family 0 (mono/stereo) or 1 (surround), on the other hand, work fine.
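To check which mapping family a file declares before handing it to a decoder, the identification header can be read directly. The following is a minimal sketch, assuming the first Ogg packet has already been extracted into a buffer; the field layout follows the OpusHead description in RFC 7845, section 5.1. The names head_info, parse_opus_head and known_to_decode_in_0_12 are hypothetical and not part of opusfile, and the last helper only encodes the behaviour reported in this comment, not the library's actual logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fields of the OpusHead identification header (RFC 7845, section 5.1). */
typedef struct {
    int version;
    int channels;
    unsigned pre_skip;
    uint32_t input_rate;
    int mapping_family;
    int streams;
    int coupled;
} head_info;

/* Hypothetical helper: returns 0 on success, -1 if the packet is not a
   well-formed OpusHead. Multi-byte fields are little-endian. */
static int parse_opus_head(const unsigned char *p, size_t len, head_info *h)
{
    assert(p != NULL && h != NULL);
    if (len < 19 || memcmp(p, "OpusHead", 8) != 0) return -1;
    h->version = p[8];
    h->channels = p[9];
    h->pre_skip = (unsigned)p[10] | ((unsigned)p[11] << 8);
    h->input_rate = (uint32_t)p[12] | ((uint32_t)p[13] << 8) |
                    ((uint32_t)p[14] << 16) | ((uint32_t)p[15] << 24);
    /* Bytes 16..17 hold the output gain, which this sketch ignores. */
    h->mapping_family = p[18];
    if (h->channels == 0) return -1;
    if (h->mapping_family == 0) {
        /* Family 0 has no mapping table: one stream, coupled if stereo. */
        if (h->channels > 2) return -1;
        h->streams = 1;
        h->coupled = h->channels - 1;
        return 0;
    }
    /* Other families carry stream count, coupled count, and one mapping
       byte per output channel. */
    if (len < 21 + (size_t)h->channels) return -1;
    h->streams = p[19];
    h->coupled = p[20];
    return 0;
}

/* Assumption taken from the report above, not from the opusfile source:
   only families 0 and 1 with at most 8 channels are known to open. */
static int known_to_decode_in_0_12(const head_info *h)
{
    assert(h != NULL);
    return (h->mapping_family == 0 || h->mapping_family == 1) &&
           h->channels <= 8;
}
```

For a 16-channel file like the one attached, this sketch would report mapping family 255 with 16 uncoupled streams, which is the case the report says op_open_file rejects.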

This bug can be reproduced, for example, by trying to decode an Opus file with more than 8 channels with the opusdec command-line tool. It does not exist in older versions of Opus tools: opusdec from Ubuntu 20.04, for instance, works fine on files with mapping family 255 and more than 8 channels.

Please find attached an example of a problematic Opus file (compressed with ZIP due to GitHub restrictions): test16chFloat.zip. It can be encoded properly from a WAV file with opusenc, but opusdec cannot decode it.

Best, Stan

andykent commented 2 years ago

I think we are also seeing this.

We have 16-channel, 3rd-order ambisonic WAV files.

Encoding seems to work as expected...

$ opusenc 16ch.wav 16ch.opus
Skipping chunk of type "LIST", length 62
Encoding using libopus 1.3.1 (audio)
-----------------------------------------------------
   Input: 48 kHz, 16 channels
  Output: 16 channels (16 uncoupled)
          20ms packets, 1024 kbit/s VBR
 Preskip: 312

Encoding complete                                
-----------------------------------------------------
       Encoded: 2 minutes and 32.02 seconds
       Runtime: 12 seconds
                (12.67x realtime)
         Wrote: 20320427 bytes, 7601 packets, 335 pages
       Bitrate: 1064.46 kbit/s (without overhead)
 Instant rates: 714.8 to 1895.2 kbit/s
                (1787 to 4738 bytes per packet)
      Overhead: 0.457% (container+metadata)

opusinfo reports things correctly...

$ opusinfo 16ch.opus 
Processing file "16ch.opus"...

New logical stream (#1, serial: 22b0bc07): type opus
Encoded with libopus 1.3.1, libopusenc 0.2.1
User comments section follows...
    ENCODER=opusenc from opus-tools 0.2
Opus stream 1:
    Pre-skip: 312
    Playback gain: 0 dB
    Channels: 16
    Original sample rate: 48000 Hz
    Streams: 16, Coupled: 0
    Channel Mapping Family: 255 Map: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
    Packet duration:   20.0ms (max),   20.0ms (avg),   20.0ms (min)
    Page duration:    580.0ms (max),  456.5ms (avg),  340.0ms (min)
    Total data length: 20320427 bytes (overhead: 0.457%)
    Playback length: 2m:32.000s
    Average bitrate: 1069 kbit/s, w/o overhead: 1065 kbit/s
Logical stream 1 ended

opusdec chokes...

$ opusdec 16ch.opus 16ch-out.wav
Failed to open '16ch.opus'.

I also tried encoding with -mapping_family 2 (for ambisonics) using FFmpeg, but trying to decode the result returned the same error...

$ ffmpeg -i 16ch.wav -c:a libopus -mapping_family 2 16ch.ogg
ffmpeg version 4.4 Copyright (c) 2000-2021 the FFmpeg developers
  built with Apple clang version 12.0.5 (clang-1205.0.22.9)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/4.4_2 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-avresample --enable-videotoolbox
  libavutil      56. 70.100 / 56. 70.100
  libavcodec     58.134.100 / 58.134.100
  libavformat    58. 76.100 / 58. 76.100
  libavdevice    58. 13.100 / 58. 13.100
  libavfilter     7.110.100 /  7.110.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  9.100 /  5.  9.100
  libswresample   3.  9.100 /  3.  9.100
  libpostproc    55.  9.100 / 55.  9.100
Guessed Channel Layout for Input Stream #0.0 : hexadecagonal
Input #0, wav, from '16ch.wav':
  Metadata:
    date            : 2021-05-01
    encoder         : Lavf58.76.100
    encoded_by      : REAPER
  Duration: 00:02:32.00, bitrate: 12288 kb/s
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, hexadecagonal, s16, 12288 kb/s
File '16ch.ogg' already exists. Overwrite? [y/N] y
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> opus (libopus))
Press [q] to stop, [?] for help
[libopus @ 0x7fee7e01a400] Unknown channel mapping family 2. Output channel layout may be invalid.
[libopus @ 0x7fee7e01a400] No bit rate set. Defaulting to 1024000 bps.
Output #0, ogg, to '16ch.ogg':
  Metadata:
    date            : 2021-05-01
    encoded_by      : REAPER
    encoder         : Lavf58.76.100
  Stream #0:0: Audio: opus, 48000 Hz, hexadecagonal, s16, 1024 kb/s
    Metadata:
      encoder         : Lavc58.134.100 libopus
      date            : 2021-05-01
      encoded_by      : REAPER
size=   19849kB time=00:02:31.99 bitrate=1069.8kbits/s speed=13.4x    
video:0kB audio:19759kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.455402%

From code I am calling op_test_callbacks, and the error is OP_EIMPL (-130): "The stream used a feature that is not implemented, such as an unsupported channel family."

Looking at the code it looks like this might actually be expected behaviour?

Is there any reason that restriction can't be lifted to support decoding more than 8 channels?

synthercat commented 2 years ago

I remember once bumping into a document that mentioned -mapping_family 3 too (I think that's for higher-order ambisonics). You might want to try that.

polariton commented 2 years ago

In the case of a linear microphone array, I have to use mapping family 255 with a one-to-one mapping of Opus streams to PCM channels. I already implemented my own tool on top of libopus to avoid the stream coupling and permutation applied by opus-tools. The Ambisonics mapping rearranges the channels, just as the Dolby Surround mapping does. The problem is that the opusfile library used by the latest version of opusdec does not support mapping family 255.
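The difference between a coupled, permuted mapping and a one-to-one mapping can be sketched from the rules in RFC 7845, section 5.1.1, which say how each byte of the channel mapping table selects a decoded stream. The type channel_source and the function resolve_channel below are hypothetical names for illustration and do not come from libopus or opusfile.

```c
#include <assert.h>

/* Where one output channel comes from (RFC 7845, section 5.1.1). */
typedef struct {
    int stream; /* Opus stream index, -1 for silence, -2 for an invalid index */
    int side;   /* 0 = mono stream or left of a coupled pair, 1 = right */
} channel_source;

/* Hypothetical helper: resolves one mapping-table byte for a file with
   `streams` Opus streams, of which the first `coupled` are stereo. */
static channel_source resolve_channel(unsigned char index, int streams, int coupled)
{
    channel_source src = { -1, 0 };
    assert(streams > 0 && coupled >= 0 && coupled <= streams);
    if (index == 255) return src;            /* 255 marks a silent channel */
    if (index >= streams + coupled) {        /* beyond the decoded channels */
        src.stream = -2;
        return src;
    }
    if (index < 2 * coupled) {
        /* Coupled streams come first and contribute two channels each. */
        src.stream = index / 2;
        src.side = index % 2;
    } else {
        /* The remaining streams are mono. */
        src.stream = index - coupled;
    }
    return src;
}
```

With 16 streams, 0 coupled, and the identity table [0, 1, ..., 15], output channel i resolves to stream i as mono, which is the one-to-one layout described above. With 4 streams and 2 coupled (a 5.1-style layout), indices 0 to 3 resolve to the left and right sides of streams 0 and 1 instead.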

andykent commented 2 years ago

For anyone stumbling onto this issue, opusfile seems to somewhat intentionally not support mapping family 255 decoding.

We have a branch which does have basic support for one-to-one channel mappings and up to 16 channels.

With these 2 changes things are working great.

jakar commented 2 years ago

@andykent Thanks for your branch! I made a comment here because it looks like it requires exactly 16 channels.

I was going to suggest a PR, but I suspect that, to get this merged, the whole "everything can always be converted to stereo, so you can just use the stereo API for everything" aspect of the library will have to be reconsidered. People might not be cool with that. On the other hand, 255 is a valid channel mapping, and (in my opinion) this library should handle all reasonable Opus streams.

WHUfreeway commented 1 year ago

@andykent I successfully ran the example code on my Linux server, using FILE *myStream = fopen("output.wav","a"); and changing every "stdout" to "myStream" in my code. However, it seems to output an all-zero stereo audio file with a WAV header. I would like to know whether I am using the code wrongly, or whether opusfile in this branch can accept 16-channel input and also produce 2-channel output?

andykent commented 1 year ago

@WHUfreeway no, the branch doesn't do any processing like that. If you pass 16 channels in you will get 16 channels out. You would need a renderer of some sort to down mix from 16 channels to stereo.
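As a rough illustration of what such a renderer does at its simplest, the sketch below applies a gain matrix to interleaved float PCM. It is not part of opusfile or of the branch; the function name downmix and the gain layout are assumptions made for this example, and a real ambisonic or array renderer would need gains chosen for the actual channel layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mixes interleaved PCM from in_ch channels down to
   out_ch channels. gains is row-major: gains[o * in_ch + c] is the weight
   of input channel c in output channel o. */
static void downmix(const float *in, size_t frames, int in_ch,
                    float *out, int out_ch, const float *gains)
{
    assert(in != NULL && out != NULL && gains != NULL);
    assert(in_ch > 0 && out_ch > 0);
    for (size_t f = 0; f < frames; f++) {
        for (int o = 0; o < out_ch; o++) {
            float acc = 0.0f;
            for (int c = 0; c < in_ch; c++)
                acc += gains[(size_t)o * in_ch + c] * in[f * in_ch + c];
            out[f * out_ch + o] = acc;
        }
    }
}
```

For 16 channels to stereo, gains would hold 2 rows of 16 weights, and the function would be called on each block of decoded samples before the stereo result is written out.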

chris-hld commented 11 months ago

I would also vote for fixing this. An encoder that produces a valid stream which cannot then be decoded is inconsistent and surprising, to say the least. From what I understand, channel mapping family 255 is specifically for multi-channel audio that does not fit the usual categories but still benefits from simultaneous coding. So I am not sure why it needs to conform to multiple sets of stereo streams. That seems like a lot of lost potential and might even introduce unwanted artifacts, e.g. in cases where stereophony-based coding methods are not valid. Coding microphone array signals is a great example, imo. The issue of not providing a "general-purpose" output for mapping family 255 is also explicitly mentioned here: https://www.rfc-editor.org/rfc/rfc7845#section-5.1.1.3

@andykent I also tested your branch and it works great for up to 16 channels! The encoder can handle up to 255 channels, so couldn't the decoder allow this too, using your direct channel mapping with # define OP_NCHANNELS_MAX (255)? I would actually be very happy to see a pull request to the main repository!

polariton commented 11 months ago

Many thanks for fixing this issue in the standard opustools. I second @chris-hld's opinion about encoding and decoding up to 255 channels, because setups in large rooms with several mic arrays may require more than 16 channels.

chris-hld commented 11 months ago

I opened a PR for opus-tools to support independent channel mode encoding for low channel counts: https://github.com/xiph/opus-tools/pull/80

chris-hld commented 7 months ago

This PR, https://github.com/xiph/opusfile/pull/45, should now also fix the Ambisonics channel mapping families 2 and 3 in opusdec.