Guess filter format and settings for convolution filter file

JWahle commented 3 years ago

@HEnquist I would like to make it easier for users to add different types of convolution filter files to their pipelines.

I think, it should work similar to EqualizerAPO, where you just tell it the filename and the rest is set up automatically.

Formats, I would like to have supported:

.txt
.wav
.dbl (maybe? IIRC this is used by Acourate, but it can also generate .wav files, so this is not necessary)
anything else?

Also, multichannel IR should either be supported, or the user should get a warning. One way to support multichannel IRs would be to let the user for each filter in CDSP select the corresponding channel of the IR.

I would like to have CamillaDSP select defaults when filter type is Convolution and File. The other parameters should then just be optional. This way, not just users of the GUI benefit from this - and I have less to implement :) Another option might be, to just implement this in the GUI and backend and hide the remaining options, if the GUI thinks, it has sufficient defaults.

HEnquist commented 3 years ago

This is probably quite tricky to do well. The file ending probably tells if it's text or raw, but then separating the different formats or raw gets tricky. ".dbl" is likely f64, but ".raw" can be anything. Looking at the contents may give some clues, but is challenging.

Do you have some ideas for how to do this?

JWahle commented 3 years ago

The parameters, that we need to figure out are: format, skip_bytes_lines, read_bytes_lines and possibly channel_count - if you choose to support multichannel IRs. I guess, read_bytes_lines could be left to 0 for all formats.

I just looked this up and I think, .txt, .raw, .dbl should just be distinguished by extension, as they are raw formats without header - according to what I found. Here is a userguide for Adobe Auditions, that seems useful.

Here is, what I would expect for each file format:

.wav
- I think the header should contain all information, so nothing should be required of the user
- The first 4 bytes of a .wav file are RIFF, so this format could even be recognized, if incorrectly named
.txt
- is there any consensus on how a .txt audio file should look like? If not, I would let the user specify everything
.raw
- I would let the user specify everything except for skip_bytes_lines, which should probably always be 0
.dbl
- I didn't find an official spec, but what I found suggests, they are always float64
- skip_bytes_lines should be 0, since it's raw audio data
- so only channel count would be needed by the user (if at all)

For .raw and .dbl I would set the defaults and hide them by default - so the user can still overwrite, if necessary. For .wav I would set or hide all values and maybe show a warning, if the file is not supported. If the .wav handling is implemented in CDSP, it would provide better usability, if the GUI is not used. On the other hand, if it is implemented in the GUI, it would be easy to show a warning, if a file is not supported.

I'm not sure, if multichannel IRs should be supported, since they don't match with the configuration style of CDSP and it would only make things more complicated to additionally specify channel number and count. I would probably only do this, if we know of any room correction software, that can not create multiple single channel IRs. Until this is implemented, in the filter file selection dialog I would show a message, that only single channel IRs are supported.

HEnquist commented 3 years ago

I have considered support for multichannel raw files. I decided not to add that, since raw files are already quite (too?) difficult to use. Putting multiple channels in there makes it even worse.

txt: there are probably many formats out there. CamillaDSP supports only a simple format with one value per line, and with the option to skip a number of header lines and specifying the number of lines to read. I don't think this is used very much.
raw, pcm, dat etc: This can be any sample format, but most likely doesn't have any header to skip. And the whole file should probably be read.
dbl: not any official standard, but it's a pretty good guess that it's f64, no header, and that the whole file should be read.
wav: The current method is to analyze the wav with a python script (https://github.com/HEnquist/camilladsp/blob/master/testscripts/analyze_wav.py) to get the offset and length, and then loading the data chunk as raw. This only works for mono files. This could be improved by letting CamillaDSP parse the header itself.

For reading wav, there are somewhat limited choices. The most popular library is hound: https://github.com/ruuda/hound, but the development status doesn't look good. Other libraries are missing essential features. Of course there is also the option for adding the functionality directly in camilladsp. This might be the better way, since only the parsing of the header is missing.

There is also the option for letting the gui parse the wav and setting up the raw parameters automatically. Starting from this maybe: https://github.com/ffdead/wav.js/blob/master/wav.js

JWahle commented 3 years ago

I'll try to build your analyze_wav.py python script into the backend and provide the defaults to the UI from there. Then the GUI can also warn, if the .wav has an unsupported format.

JWahle commented 3 years ago

@HEnquist I pushed an implementation for this to the develop branch for the gui and backend. On filter file selection/upload, defaults are queried from the backend.

I am not sure, whether we should generally set read_bytes_lines = 0 for wave-files. I assume, if you configure multiple wav-files for different sample rates, some might be larger/have more coefficients than others. The assumption, that they have the same header size is probably correct, if both files were generated with the same program. Is that correct? According to the your python script, this wave-file (broken.wav.zip) has skip_bytes_lines = 36 and read_bytes_lines = 65536, but the file size is 65732 bytes. Also, the plot looks different, when I change read_bytes_lines from 65536 to 0. Is the file broken, or is read_bytes_lines = 0 invalid? If so, why?

HEnquist commented 3 years ago

I'll take a look! Wav-files can have more metadata after the data chunk, so read_bytes_lines should be set to the right length. Setting it to 0 probably works sometimes, but when it doesn't the user will get very strange results for no apparent reason.

I would guess that wav-files saved from the same program will likely have the same header length yes. But same as the previous point, if that isn't correct then the results won't be great.

Looking at broken.wav, I get this:

DEBUG:root:Found chunk of type fmt , length 16
DEBUG:root:Found chunk of type data, length 65536
DEBUG:root:Found chunk of type LIST, length 50
DEBUG:root:Found chunk of type cue , length 28
DEBUG:root:Found chunk of type LIST, length 50

The file size is 65732 bytes, and each chunk has 8 bytes of header that isn't included above. There is also a fixed 12-byte header at the start of the every wav-file.

65732 - (16+8) - (65536+8) - (50+8) - (28+8) - (50+8) - 12 = 0

The file seems just fine. But it actually has more metadata after the audio data, and then read_bytes_lines = 0 will read that as audio. For this one, it should be read_bytes_lines = 65536 and skip_bytes_lines = 36.

HEnquist commented 3 years ago

I left a few comments on the commit here: https://github.com/HEnquist/camillagui-backend/commit/59cf0c066da63b46faeb225843e4958b7c01f57c

HEnquist commented 3 years ago

I spent a little time today on reading the wav header in camilladsp. It's rather easy, give me a few days and it will be ready. Then the gui just need an update to handle the new config options. At the moment there is "File" and "Values". I'll add "Wav" to that, and an optional setting to select which channel to use (with 0 as default).

I'll also rename "File" to something more suitable, probably "Raw". And add an alias so configs with "File" still works as before.

HEnquist commented 3 years ago

The develop branch now has somewhat untested support for reading coeffiecients from wav.

JWahle commented 3 years ago

I left a few comments on the commit here: HEnquist/camillagui-backend@59cf0c0

The code changes are in develop, and the file split is in its own PR.

The develop branch now has somewhat untested support for reading coefficients from wav.

I'll implement that in the GUI. (Edit: ... nevermind - I'll create a PR for the README.md changes)

HEnquist commented 3 years ago

Dont bother about the readme! I alreafdy changed that, just haven't pushed the latest changes yet. That include the rename of "File" to "Raw". I'll do that tonight

HEnquist commented 3 years ago

I have pushed the new version now. What remains is to make the filter evaluator (in pycamilladsp-plot) also able to read wav. Reasonably easy since all the bits already exist, I just need to put them together.

HEnquist commented 3 years ago

I just pushed a new version of pycamilladsp_plot here: https://github.com/HEnquist/pycamilladsp-plot/tree/validation It can now read wav-files directly. The validation part seems to be working fine, and now gives lists of both errors and warnings. The errors are the things that would prevent the config from working. One neat thing here is that it can list many errors, not just the first one it finds like camilladsp itself does.

HEnquist commented 3 years ago

I'm closing this since this is all implemented.

HEnquist / camillagui

Guess filter format and settings for convolution filter file #40