Closed. liudger closed this 1 year ago
There are some fixes coming. Please don't review yet.
It should be good to go @hecomi
We could still optimise the array copies. The code currently iterates over the arrays element by element; in the future we could use an unsafe memcpy instead.
Thank you for your pull request. I'm curious about the effectiveness of adding the Delta part to the MFCC. At present, we just compare pre-acquired MFCCs with the current ones using simple distance calculations or cosine similarity. With this, I'd like to investigate how much these additional 12 dimensions actually contribute to the accuracy. If it proves beneficial, I'd be keen on accepting your PR. However, if the impact is minimal, I would prefer to keep the project simple and may consider putting it on hold for the time being. I'd appreciate your thoughts on this.
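For context, here is a minimal Python sketch of the two comparisons mentioned above (distance and cosine similarity). It is not the project's actual code: the function names are hypothetical, and the vectors are plain lists of floats. The same functions work whether a vector holds 12 MFCC values or 24 values (12 MFCC plus 12 Delta), as long as both vectors have the same length.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def with_delta(mfcc, delta):
    """Concatenate MFCC and Delta values into one longer feature vector."""
    return list(mfcc) + list(delta)
```

Comparing `with_delta(...)` vectors instead of the raw MFCC vectors is one way to check whether the extra dimensions change which pre-acquired sample is closest.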
I haven't run a proper scientific test to measure how much it improves accuracy, but from just using it the results looked promising. I guess we could do a comparison by taking an audio file, creating samples from it with both calculations (the Delta calculation is optional), and seeing which is more accurate. I'm currently on another project, so this testing would have to be done later.
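A rough sketch of how such a comparison could be scored, written in Python rather than the project's own code. Everything here is an assumption for illustration: the helper names, the use of Euclidean distance, and the idea of one template vector per phoneme label. Running `accuracy` once with MFCC-only vectors and once with MFCC plus Delta vectors from the same audio file would give two numbers to compare.

```python
import math


def nearest_label(templates, features):
    """Return the label whose template is closest (Euclidean) to `features`.

    `templates` maps a phoneme label to its pre-acquired feature vector.
    """
    return min(templates, key=lambda label: math.dist(templates[label], features))


def accuracy(templates, labeled_samples):
    """Fraction of (label, features) pairs classified as their own label."""
    if not labeled_samples:
        raise ValueError("need at least one labeled sample")
    hits = sum(
        1
        for label, features in labeled_samples
        if nearest_label(templates, features) == label
    )
    return hits / len(labeled_samples)


# Toy 2-dimensional data, made up purely to show the call shape.
toy_templates = {"A": [1.0, 0.0], "I": [0.0, 1.0]}
toy_samples = [("A", [0.9, 0.1]), ("I", [0.2, 0.8]), ("A", [0.4, 0.6])]
toy_score = accuracy(toy_templates, toy_samples)
```

The toy values say nothing about real MFCC data; they only exercise the functions.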
Apologies for the delayed response. I have now reviewed your code.
There are changes that I would like to accept aside from the Delta calculation. It would be helpful if such changes could be provided in separate PRs in the future 🙇
As for the Delta calculation, I am considering accepting it as an experimental feature.
Please understand that I will be modifying your contributions to fit my coding style:
Thank you for your contributions and understanding.
I've read through the code, and it seems to be computing a weighted average instead of performing differentiation for the calculation of Delta MFCC. Is this code working as intended? Please correct me if I'm wrong.
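To illustrate the distinction, here is a minimal Python sketch of the regression-style delta commonly used for speech features, where each delta value is a weighted sum of differences between later and earlier frames. This is not the code from the PR. The window size `N` and the edge handling (clamping indices to the first and last frame) are assumptions for the example.

```python
def delta(frames, N=2):
    """Regression-style delta over a sequence of coefficient frames.

    frames: list of frames, each a list of coefficients (e.g. 12 MFCCs).
    For each frame t and coefficient k:
        d[t][k] = sum_{n=1..N} n * (c[t+n][k] - c[t-n][k]) / (2 * sum_{n=1..N} n^2)
    Indices outside the sequence are clamped to the nearest valid frame
    (an assumption made for this sketch).
    """
    if N < 1:
        raise ValueError("N must be at least 1")
    T = len(frames)
    denom = 2 * sum(n * n for n in range(1, N + 1))
    result = []
    for t in range(T):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, N + 1):
                ahead = frames[min(t + n, T - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)  # a difference, not a sum
            row.append(acc / denom)
        result.append(row)
    return result
```

A quick sanity check for a differentiation-style delta is that a constant input yields zeros and a linear ramp yields its slope away from the edges. A weighted average with positive weights would instead return values close to the input itself.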
Let me check the calculation again 🙈
I'll make a separate pull request for the fixes, and later a new pull request for the Delta MFCCs.
Add support for Delta calculation #25. Greatly improves the accuracy of phoneme identification.
Interface update and new length of mfcc array