Open mauriciovmc opened 1 year ago
Very impressive and useful work - Thank you for sharing the results!
I mostly read through https://www.dafx.de/paper-archive/2023/DAFx23_paper_13.pdf. Could you clarify the intention with this added functionality? As far as I understand:
1 - You are adding 2 new project export parameters:
    bins_per_oct_decim: IntProperty(
        name="Bins/octave",
        # Trailing space added so the concatenated strings do not run together.
        description=("Value for distributing the frequencies in log scale, "
                     "in bins/octave."),
        default=18,
        min=0,
        max=36,
    )
    num_octaves_decim: IntProperty(
        name="Num. of octaves",
        description=("Number of higher octaves in which the frequencies will "
                     "be distributed in log scale."),
        default=2,
        min=0,
        max=6,
    )
2 - The default values will give about a 3x performance gain in simulations up to 22.05 kHz at the cost of <1 dB (or just 0.3 dB?) precision loss. More aggressive settings can give over a 6x performance gain with still acceptable quality loss.
3 - The included post-processing will automatically interpolate the log-sampled frequencies back onto the usual linear grid, so most of the code (HRIR) is not affected?
4 - To bypass this new optimization and run a normal, full-quality Mesh2HRTF simulation, the user has to set num_octaves_decim = 0?
5 - This optimization reduces computation time, but does not alter the RAM memory requirements.
We would need agreement from the team on the new export options and their defaults. Then the tests likely need to be updated. And I could try to update the tutorials to include new options when we have the code ready for testing.
Note, in the conclusions of the paper there is: "Besides, it will be explored the use of upsampling procedures that could correct or restore spectral detail lost when simulating HRTFs using low spectral resolution." I wonder if the focus should be more on smoothing/filtering high frequencies of the HRTF instead of restoring detail, because at high frequencies headphone re-seating, 3D simulation validity and sample-to-sample variances are likely causing more robustness problems than any extra high-frequency HRTF details can add. See some thoughts about post-processing in https://sourceforge.net/p/mesh2hrtf-tools/wiki/Mesh2HRTF%20options/
Dear Sergejs,
Very impressive and useful work - Thank you for sharing the results!
First of all, thank you for your interest in our research. I am glad you took the time to go through the paper and the code.
Could you clarify the intention with this added functionality? As far as I understand: 1 - you are adding 2 new project export parameters:
    bins_per_oct_decim: IntProperty(
        name="Bins/octave",
        description=("Value for distributing the frequencies in log scale, "
                     "in bins/octave."),
        default=18,
        min=0,
        max=36,
    )
    num_octaves_decim: IntProperty(
        name="Num. of octaves",
        description=("Number of higher octaves in which the frequencies will "
                     "be distributed in log scale."),
        default=2,
        min=0,
        max=6,
    )
Perfect, that's it.
2 - The default values will give about a 3x performance gain in simulations up to 22.05 kHz at the cost of <1 dB (or just 0.3 dB?) precision loss. More aggressive settings can give over a 6x performance gain with still acceptable quality loss.
Exactly. I am not sure what the best default values would be; 18 bins/octave provides very good resolution, and much cheaper simulations should give similarly good results, as you mentioned. The precision loss depends on what you measure. The overall log-spectral distortion was 0.9 dB; after smoothing the curves with third-octave filters, it dropped to 0.3 dB. Bin by bin, the maximum average distortion is around 3 dB without smoothing and 0.8 dB with smoothing, in both cases at the very top of the spectrum (see Fig. 9 of the paper).
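For readers unfamiliar with the metric, here is a minimal sketch of one common definition of log-spectral distortion: the RMS, over frequency bins, of the magnitude difference in dB. The function name `log_spectral_distortion` is hypothetical, and the exact definition and averaging used in the paper may differ.

```python
import numpy as np


def log_spectral_distortion(h_ref, h_test):
    """RMS of the dB magnitude difference over the last (frequency) axis.

    Hypothetical helper; assumes both spectra share the same frequency grid
    and contain no zero-magnitude bins.
    """
    diff_db = 20.0 * np.log10(np.abs(h_ref) / np.abs(h_test))
    return np.sqrt(np.mean(diff_db ** 2, axis=-1))


# Example: a constant 1 dB offset in every bin gives an LSD of 1 dB.
reference = np.ones(8, dtype=complex)
offset = reference * 10 ** (1.0 / 20.0)
lsd = log_spectral_distortion(offset, reference)
```

Averaging the result over ears and source positions would give a single overall figure comparable in spirit to the numbers quoted above.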
3 - The included post-processing will automatically interpolate the log-sampled frequencies back onto the usual linear grid, so most of the code (HRIR) is not affected?
Yes, this is correct. I implemented other features in the code I have here, e.g. smoothing and correction for frequency ranges in which the simulations did not converge. However, they have nothing to do with the method itself, so I removed all of them for this pull request.
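As an illustration of the interpolation step, the sketch below resamples a complex spectrum from a non-uniform frequency grid onto a linear one by interpolating magnitude and unwrapped phase separately with `np.interp`. This is an assumption for illustration only: the thread does not state which interpolation scheme the pull request uses, and `to_linear_grid` is a hypothetical name.

```python
import numpy as np


def to_linear_grid(freqs_hybrid, spectrum, freqs_linear):
    """Interpolate a complex spectrum onto a linear frequency grid.

    Hypothetical helper: linear interpolation of magnitude and unwrapped
    phase. freqs_hybrid must be strictly increasing.
    """
    magnitude = np.interp(freqs_linear, freqs_hybrid, np.abs(spectrum))
    phase = np.interp(freqs_linear, freqs_hybrid,
                      np.unwrap(np.angle(spectrum)))
    return magnitude * np.exp(1j * phase)


def toy_response(f):
    """Toy spectrum: gentle magnitude slope plus a pure delay of 0.1 ms."""
    return (1.0 + f / 2000.0) * np.exp(-2j * np.pi * f * 1e-4)


# Linear grid, and a coarser grid that is log-spaced in the top octave.
f_lin = np.arange(100.0, 2001.0, 100.0)
f_hyb = np.concatenate([f_lin[:10], np.geomspace(1000.0, 2000.0, 6)[1:]])

restored = to_linear_grid(f_hyb, toy_response(f_hyb), f_lin)
```

Because the toy response has linear magnitude and linear phase, the linear interpolation recovers it exactly; real HRTFs with sharp notches would show some error in the decimated octaves.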
4 - To bypass this new optimization and run a normal, full-quality Mesh2HRTF simulation, the user has to set num_octaves_decim = 0?
Exactly. Setting either 0 octaves or 0 bins/octave runs the simulation at full resolution.
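To make the role of the two parameters concrete, here is a minimal sketch of a hybrid linear-logarithmic frequency grid, including the bypass case. It is not the pull request's implementation: the helper name `hybrid_frequencies`, the 150 Hz step, and the choice to start the log section at the last linear bin are all assumptions for illustration.

```python
import numpy as np


def hybrid_frequencies(f_max=22050.0, f_step=150.0,
                       bins_per_oct=18, num_octaves=2):
    """Linear grid at low frequencies, log-spaced grid in the top octaves.

    Hypothetical helper. Setting bins_per_oct or num_octaves to 0 returns
    the full linear grid (bypass).
    """
    linear = np.arange(f_step, f_max + f_step / 2, f_step)
    if bins_per_oct == 0 or num_octaves == 0:
        return linear

    # Keep linear bins up to the start of the top `num_octaves` octaves.
    low = linear[linear <= f_max / 2 ** num_octaves]
    # Log-spaced bins from the last linear bin (excluded) up to f_max.
    n_log = bins_per_oct * num_octaves
    high = np.geomspace(low[-1], f_max, n_log + 1)[1:]
    return np.concatenate([low, high])


grid = hybrid_frequencies()               # default: 18 bins/octave, 2 octaves
full = hybrid_frequencies(num_octaves=0)  # bypass: full linear resolution
```

With these assumed settings the hybrid grid has 72 frequencies instead of 147. The bins that are dropped are the high-frequency ones, which are the most expensive to simulate, so the saving in computation time is larger than the reduction in bin count alone suggests.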
5 - This optimization reduces computation time, but does not alter the RAM memory requirements.
Yes, but far fewer "expensive" bins have to be simulated. I am not sure, but since the average memory requirement per bin drops, more bins could end up being simulated in parallel more often. With proportionally more "cheap" bins available, memory usage could also stay closer to the maximum capacity more often.
We would need agreement from the team on the new export options and their defaults. Then the tests likely need to be updated. And I could try to update the tutorials to include new options when we have the code ready for testing.
Fantastic, let me know if I can be of any help. I am attaching the code I adapted from yours to export projects from Blender without using the GUI. I will also forward you an email I sent to Fabian a few days ago with examples of SOFA files generated at different resolutions, in case you want to have a look.
Note, in the conclusions of the paper there is: "Besides, it will be explored the use of upsampling procedures that could correct or restore spectral detail lost when simulating HRTFs using low spectral resolution." I wonder if the focus should be more on smoothing/filtering high frequencies of the HRTF instead of restoring detail, because at high frequencies headphone re-seating, 3D simulation validity and sample-to-sample variances are likely causing more robustness problems than any extra high-frequency HRTF details can add. See some thoughts about post-processing in https://sourceforge.net/p/mesh2hrtf-tools/wiki/Mesh2HRTF%20options/
Thank you for the comment. Indeed, I think that is the case.
Thanks for the clarifications! Let's wait for feedback from the core team.
I find what you mentioned very useful: "I implemented other functionalities to the code I have here, e.g. smoothing and correction for frequency ranges in which the simulations did not converge." Hopefully this can be added in a separate pull request, or, if it really is "smoothing", it could be included as a separate function, a bit like these (which also use the mesh2hrtf Python module): https://sourceforge.net/p/mesh2hrtf-tools/wiki/Common_HRTF_conversion_and_analysis/
This version implements a low-cost approximation using a hybrid linear-logarithmic sampling scale in frequency. For details, see the following publication: da Costa, M. V. M., Biscainho, L. W. P., & Oehler, M. (2023). Low-cost Numerical Approximation of HRTFs: a Non-Linear Frequency Sampling Approach. Proceedings of the 26th International Conference on Digital Audio Effects (DAFx23), Copenhagen, Denmark, 327-334.