ittiam-systems / libmpegh

MPEG-H 3D Audio Low Complexity Profile Decoder. Encoder: https://github.com/ittiam-systems/libmpeghe Contact: mob-audio@ittiam.com
http://www.ittiam.com/
BSD 3-Clause Clear License

Questions. Where can I get a parameter file that can be applied to the decoder? #8

Closed bluewidy closed 2 years ago

bluewidy commented 2 years ago

I have a Sony 360 Reality Audio .mp4 file. I want to enjoy it with my stereo headphones and my 5.1-channel surround soundbar system at home.

However, a music player that can play MPEG-H 3D Audio files doesn't seem to exist, so I came to this GitHub repository. This software converts the MHA1 format to PCM, which makes the music playable. I looked at the CLI commands to learn how to use the program.

First, I ran the basic command:

ia_mpeghd_testbench.exe -ifile:input.mp4 -ofile:result.wav

The result was a bit odd. I looked at the waveforms of the 12 tracks in Audacity, and 1 track was blank. When I analyzed the mp4 file with MediaInfo, it reported 7.1.4 channels, but the output extracted by the decoder is effectively 7.1.3 channels, so something is wrong.

Anyway, my soundbar supports 5.1 channels, so I tried the following:

ia_mpeghd_testbench.exe -ifile:input.mp4 -ofile:result.wav -cicp:5.1

However, the decoder output had 5 channels, not 5.1 channels. So I modified the command as follows:

ia_mpeghd_testbench.exe -ifile:input.mp4 -ofile:result.wav -cicp:6

This worked, and the output had 5.1 channels. However, this result is also a bit odd: when I looked at the waveforms of the 6 tracks in Audacity, 1 track was blank. It looks like something is wrong with the decoder.

I also have another problem, which is about the parameter file. I don't know what the role of the parameter file is, but my guess is this:

Probably, in the case of 360 audio, the sound image has to be positioned differently depending on the playback device, and a parameter file is needed to determine where the objects that make up the audio are placed for each device. For example, the direction in which a sound image should be localized on stereo headphones differs from the direction on a 5.1-channel surround speaker system, so the 3D spatial information an object is mapped to depends on the playback device. The parameter file contains that information, right?

So I want to get the parameter file, but I don't know how. Can I extract the parameter file from my own MPEG-H 3D Audio mp4 container? Or do I have to find another way?

SakethSathuvalli commented 2 years ago

Hi,

Thanks for trying our decoder! From the description you provided (specifically “1 blank track”), it sounds to us like the 360 audio file that you are using is a purely object-based audio file rendered to a 7.1.4 setup by default. We can confirm this if it’s possible for you to share the stream with us.

Blank track: For purely object-based audio streams, the object audio renderer generates the final output audio data. The LFE channel is not populated by the object rendering path, which explains “the blank track”.
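As an illustration of the blank-track check described above (done by eye in Audacity in this thread), here is a minimal Python sketch that reports which channels of a decoded WAV file never rise above a peak threshold. The function name `silent_channels` and its parameters are hypothetical and are not part of libmpegh. The sketch assumes 16-bit PCM output; Python's `wave` module may also refuse some multichannel WAV headers on older Python versions.

```python
import sys
import wave
from array import array


def silent_channels(path, threshold=0, chunk_frames=4096):
    """Return zero-based indices of channels whose peak is <= threshold.

    Hypothetical helper for illustration only; assumes 16-bit PCM samples.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        n_channels = wav.getnchannels()
        peaks = [0] * n_channels
        while True:
            raw = wav.readframes(chunk_frames)
            if not raw:
                break
            samples = array("h")
            samples.frombytes(raw)
            if sys.byteorder == "big":
                # WAV data is little-endian; swap on big-endian hosts.
                samples.byteswap()
            for ch in range(n_channels):
                # Samples are interleaved, so channel ch is every n-th value.
                chunk_peak = max((abs(s) for s in samples[ch::n_channels]), default=0)
                peaks[ch] = max(peaks[ch], chunk_peak)
    return [ch for ch, peak in enumerate(peaks) if peak <= threshold]
```

Calling `silent_channels("result.wav")` on the decoder output would list the empty tracks. For a purely object-based stream, the explanation above suggests the LFE channel should be among them.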

CICP: Coming to the command line option -cicp:, the correct value for a 5.1 setup is 6. CICP is an integer value that represents certain standard playback/rendering setups. For your setup, CICP value 6 will work. We see that this information is missing from our documentation/help/readme message, and we will add it at the earliest.
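Since -cicp: takes an integer index rather than a layout name such as 5.1, a small wrapper can catch that mistake before the decoder runs. The following is only a sketch: `build_decode_command` is a hypothetical helper, and it uses only the executable name and the -ifile:, -ofile: and -cicp: switches that appear in this thread.

```python
def build_decode_command(input_path, output_path, cicp=None,
                         exe="ia_mpeghd_testbench.exe"):
    """Build the testbench argument list used in this thread.

    Hypothetical helper; cicp must be an integer layout index (e.g. 6 for 5.1).
    """
    args = [exe, f"-ifile:{input_path}", f"-ofile:{output_path}"]
    if cicp is not None:
        # Reject values like 5.1 or "5.1": CICP is an index, not a channel count.
        if isinstance(cicp, bool) or not isinstance(cicp, int):
            raise TypeError("cicp must be an integer layout index, e.g. 6 for 5.1")
        args.append(f"-cicp:{cicp}")
    return args


if __name__ == "__main__":
    print(" ".join(build_decode_command("input.mp4", "result.wav", cicp=6)))
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting problems with file names that contain spaces.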

Parameter Files: The parameter files that you are referring to are not created by the decoder; rather, they are inputs from the user to the decoder. The implementations are based on the MPEG-H codec specifications.

For your use case, if you want to change the playback setup to a known standard setup, the -cicp: command line option must be used. If you are looking for binaural rendered audio (headphones), then use -brir: and pass the bitstream with binaural filter coefficients data (the bitstream is expected to be compliant with the MPEG-H specification).

Thanks!

bluewidy commented 2 years ago

> From the description you provided (specifically “1 blank track”), it sounds to us like the 360 audio file that you are using is a purely object-based audio file rendered to a 7.1.4 setup by default.

Yes it is true.

correct

> The parameter files that you are referring to are not created by the decoder; rather, they are inputs from the user to the decoder. The implementations are based on the MPEG-H codec specifications. If you are looking for binaural rendered audio (headphones), then use -brir: and pass the bitstream with binaural filter coefficients data (the bitstream is expected to be compliant with the MPEG-H specification).

You said that the parameter file was not generated by the decoder. Then how can I get "binaural filter coefficients data"?

SakethSathuvalli commented 2 years ago

> You said that the parameter file was not generated by the decoder. Then how can I get "binaural filter coefficients data"?

The binaural filter coefficients data is usually measured for specific use cases by the end user of the decoder, and hence the command line switch -brir: is provided.

If you just want to render the output as 2-channel data, you can use the command line option -cicp:2.

bluewidy commented 2 years ago

@SakethSathuvalli

Thank you very much. The puzzle pieces are slowly coming together. The last strange thing I found is that when I pass -cicp:13, I expect an output of 12 channels (since the original has 12 channels), but a 24-channel audio track is output. And when I look at it in Audacity, 7 tracks are blank. What happened?

SakethSathuvalli commented 2 years ago

> @SakethSathuvalli
>
> Thank you very much. The puzzle pieces are slowly coming together. The last strange thing I found is that when I pass -cicp:13, I expect an output of 12 channels (since the original has 12 channels), but a 24-channel audio track is output. And when I look at it in Audacity, 7 tracks are blank. What happened?

CICP value 13 stands for 22.2 layout. The CICP value for 7.1.4 is 19.
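To make the index-versus-channel-count distinction concrete, the values mentioned in this thread can be collected into a small lookup. This is a partial, illustrative table only: it contains just the four indices discussed here, and the names `CICP_LAYOUTS`, `describe_cicp` and `cicp_for_layout` are hypothetical, not decoder APIs.

```python
# Partial table: only the CICP indices mentioned in this thread.
# index -> (layout name, number of output channels)
CICP_LAYOUTS = {
    2: ("stereo", 2),
    6: ("5.1", 6),
    13: ("22.2", 24),
    19: ("7.1.4", 12),
}


def describe_cicp(index):
    """Describe what -cicp:<index> selects, for indices in the partial table."""
    try:
        name, channels = CICP_LAYOUTS[index]
    except KeyError:
        raise KeyError(f"CICP index {index} is not in this partial table") from None
    return f"-cicp:{index} -> {name} layout, {channels} output channels"


def cicp_for_layout(name):
    """Return the CICP index for a layout name such as '7.1.4'."""
    for index, (layout, _channels) in CICP_LAYOUTS.items():
        if layout == name:
            return index
    raise KeyError(f"layout {name!r} is not in this partial table")


if __name__ == "__main__":
    for idx in sorted(CICP_LAYOUTS):
        print(describe_cicp(idx))
```

Looking up index 13 shows why the output had 24 channels: the index selects the 22.2 layout, and the 12-channel 7.1.4 layout is index 19.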

We will update our workspace with this information on CICP mapping at the earliest.

Thanks!

bluewidy commented 2 years ago

I thought that cicp:13 means 13 channels, cicp:14 means 14 channels, cicp:15 means 15 channels, etc. I'm stupid lol. Thank you! The question has been resolved, so I'm closing the issue.

bluewidy commented 2 years ago

> We will update our workspace with this information on CICP mapping at the earliest.

Hmm... It's been quite some time since you wrote that comment. Maybe you forgot about it?

SakethSathuvalli commented 2 years ago

> We will update our workspace with this information on CICP mapping at the earliest.
>
> Hmm... It's been quite some time since you wrote that comment. Maybe you forgot about it?

https://github.com/ittiam-systems/libmpegh/tree/cli_info_updates

The changes are currently on this branch. They will be merged to main after review.

Thanks!

bluewidy commented 2 years ago

> The changes are currently on this branch. They will be merged to main after review.
>
> Thanks!

Thank you. I have a favor to ask. I want to read the MPEG-H specification document to learn the bitstream syntax. Can you provide a link, please?

SakethSathuvalli commented 2 years ago

> The changes are currently on this branch. They will be merged to main after review. Thanks!
>
> Thank you. I have a favor to ask. I want to read the MPEG-H specification document to learn the bitstream syntax. Can you provide a link, please?

The standard document can be procured from the ISO website. Here's the link: https://www.iso.org/standard/74430.html

bluewidy commented 2 years ago

Thank you XD