xiph / flac

Free Lossless Audio Codec
https://xiph.org/flac/
GNU Free Documentation License v1.3

Support for object-oriented audio? #531

Open sclsj opened 1 year ago

sclsj commented 1 year ago

Rationale:

Con:

What needs to be implemented:

ktmf01 commented 1 year ago

Do you have any idea whether this is possible without infringing on any patents?

Maybe patents would not apply if FLAC simply stored the ADM BWF data the way it stores any other WAV metadata, combined with support for more than 8 channels. Storing the metadata unaltered might be considered trivial enough handling not to be covered by patents.

sclsj commented 1 year ago

I'm not sure. I couldn't figure out the patent situation after some research. However, it seems like ADM just stores its metadata as XML in the 'AXML' chunk of a BWF file, so maybe FLAC could do the same and store the XML in a metadata field with a specially designated name.

On the other hand, if FLAC came up with its own way of storing object/scene metadata, maybe that wouldn't infringe on any patent. That would also make FLAC a viable alternative to the common object-oriented file formats, which are all patented (and need proprietary tools to encode/decode*).

*: Admittedly ffmpeg supports (e)ac3, but it does not support object encoding and decoding (correlation and decorrelation) which is an important part of eac3.

ktmf01 commented 1 year ago

Unless someone can point to clear prior art on the storage of object-based audio and the accompanying metadata (for example by finding an expired patent describing that), creating FLAC-specific object metadata seems way too risky for FLAC.

What does seem possible to me is expanding FLAC to store more than 8 channels (for example by defining how to multiplex FLAC streams in Matroska to achieve > 8 channels and implementing that) and then simply store the AXML chunk like FLAC already does for BWF.

That way, FLAC doesn't 'do' anything with the object audio metadata except storing it.
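
For illustration, "only storing" the chunk starts with locating it in the source file. Below is a minimal Python sketch, assuming a plain RIFF/WAVE container (not RF64/BW64, whose 64-bit sizes need extra handling) and assuming the chunk ID on disk is the lowercase `axml`. The helper names `iter_riff_chunks`, `find_axml` and `make_chunk` are hypothetical and not part of FLAC or any library.

```python
import io
import struct


def iter_riff_chunks(stream):
    """Yield (chunk_id, payload) for each top-level chunk of a RIFF/WAVE stream."""
    riff_id, _riff_size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff_id != b"RIFF" or form != b"WAVE":
        raise ValueError("not a plain RIFF/WAVE stream")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of file (or truncated trailing header)
        chunk_id, size = struct.unpack("<4sI", header)
        payload = stream.read(size)
        if size % 2:
            stream.read(1)  # RIFF chunks are padded to an even length
        yield chunk_id, payload


def find_axml(stream):
    """Return the raw bytes of the first 'axml' chunk, or None if absent.

    Assumption: the ADM XML lives in a chunk whose ID is b"axml".
    The payload is treated as opaque bytes and never parsed.
    """
    for chunk_id, payload in iter_riff_chunks(stream):
        if chunk_id == b"axml":
            return payload
    return None


def make_chunk(chunk_id, payload):
    """Build one RIFF chunk (hypothetical helper, used only to make test data)."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad
```

The payload comes back as uninterpreted bytes, which matches the idea above: nothing reads the XML, it is only carried along.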

sclsj commented 1 year ago

I don't really know what resources are available for development, or whether the FLAC developers are interested in getting into "Next Generation Audio", which I believe is just a fancy term for supporting objects and scene-based metadata. I just think it would be cool if there were an open and free NGA codec. But admittedly, nearly no one has the equipment to tell apart a 7.1 channel downmix and, say, the 128-object master, so there isn't a strong need for this.

> Unless someone can point to clear prior art on the storage of object-based audio and the accompanying metadata (for example by finding an expired patent describing that), creating FLAC-specific object metadata seems way too risky for FLAC.

There are two published standards that I believe are both patented. One is MPEG-H 3D Audio (ISO/IEC 23008-3), the other is (e)ac3. However, they won't be of much help because they are both lossy, which goes against the FLAC philosophy. The only lossless codec I'm aware of is TrueHD with Atmos, which does not have a published specification. TrueHD is partially supported by FFmpeg, but the object/Atmos part is not.

And to be honest, the most important part of eac3 (and the part not implemented by FFmpeg) is JOC, which itself is lossy, so flac probably won't need it. I think storing the AXML chunk and storing remaining objects as channels (across different flac streams, if I understand correctly) is about as much as flac can do in terms of staying lossless.

By the way, what's the reason for the 8 channel limit for flac?

ktmf01 commented 1 year ago

> I don't really know what resources are available for development

There has been 1 commit in the last 30 days, so I'd say not a lot.

> > Unless someone can point to clear prior art on the storage of object-based audio and the accompanying metadata (for example by finding an expired patent describing that), creating FLAC-specific object metadata seems way too risky for FLAC.

> There are two published standards that I believe are both patented.

The problem here is that it is not standards or implementations that get patented. Patents protect ideas. So if someone patented the idea to "store metadata describing how to render channels with a time-varying component", then there is no way that idea can be implemented in FLAC without infringing on that patent.

> By the way, what's the reason for the 8 channel limit for flac?

There's no more room in the streaminfo metadata block, see here. Also, there is not much room left in the frame headers either, see here.
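
To make the limit concrete: in STREAMINFO the channel count is stored as a 3-bit field holding (channels - 1), packed between the 20-bit sample rate and the 5-bit sample size, so 8 is the largest count it can express. Below is a minimal Python sketch of that packing. `build_streaminfo` and `parse_streaminfo` are hypothetical helper names, and the block size, frame size and MD5 fields are filled with placeholder values.

```python
import struct

CHANNEL_BITS = 3                  # width of the (channels - 1) field
MAX_CHANNELS = 1 << CHANNEL_BITS  # 8


def build_streaminfo(sample_rate, channels, bits_per_sample, total_samples):
    """Pack a 34-byte STREAMINFO body (placeholder block sizes and MD5)."""
    if not 1 <= channels <= MAX_CHANNELS:
        raise ValueError("channel count does not fit in 3 bits")
    # 20 bits sample rate | 3 bits channels-1 | 5 bits bps-1 | 36 bits total samples
    packed = ((sample_rate << 44) | ((channels - 1) << 41)
              | ((bits_per_sample - 1) << 36) | total_samples)
    return (struct.pack(">HH", 4096, 4096)   # min/max block size (placeholder)
            + bytes(6)                        # min/max frame size, 0 = unknown
            + packed.to_bytes(8, "big")
            + bytes(16))                      # MD5 of the audio, 0 = not set


def parse_streaminfo(block):
    """Unpack the fields of a 34-byte STREAMINFO body into a dict."""
    if len(block) != 34:
        raise ValueError("STREAMINFO body must be 34 bytes")
    packed = int.from_bytes(block[10:18], "big")
    return {
        "sample_rate": packed >> 44,
        "channels": ((packed >> 41) & 0x7) + 1,
        "bits_per_sample": ((packed >> 36) & 0x1F) + 1,
        "total_samples": packed & ((1 << 36) - 1),
    }
```

The neighbouring fields sit directly on either side of those 3 bits, so a ninth channel cannot be signalled without changing the layout of the block.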

ktmf01 commented 1 month ago

Do you perhaps know where I could find such files to test with?

H2Swine commented 3 weeks ago

I don't know what "object-based" features the files at https://www.bbc.co.uk/rd/publications/saqas employ, but you may try.

Also possibly relevant: #674