Open Akz47 opened 9 months ago
Hi @Akz47,
Thank you for bringing this matter to our attention. It is currently not possible to retrain the Audio Classifier using our Model Maker. We acknowledge this as a feature request and will share it with our team. For any other inquiries you may have, we will assign this issue to the appropriate owner for further assistance.
Thank you!!
Hi @kuaashish,
Thanks for your reply.
Would it work if we used TFLite's Model Maker to train the custom audio classification model, then import that model into MediaPipe?
Reference: https://www.tensorflow.org/lite/models/modify/model_maker/audio_classification
It would be very helpful if you could please recommend some good, compatible approaches.
Hi @joezoug,
Could you please provide any pointers here? Thank you!!
Hi @kuaashish,
In MediaPipe's AudioClassifier documentation, the AudioClassifierOptions doesn't seem to allow for the customization of hop duration.
We are trying to classify shorter sound events, and a 1-second hop / window might be inaccurate or overlook these events.
Based on a discussion we found online, it seems that YAMNet's PATCH_HOP_SECONDS can be customized: https://groups.google.com/g/audioset-users/c/pRDX6AkaM1s
Is there a way to set the PATCH_HOP_SECONDS parameter within your Classifier options, or directly within the source code?
Thank you.
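To make the hop question concrete, here is a minimal sketch of what a shorter hop would mean if the windowing were done by hand before classification. It is not MediaPipe or YAMNet code: `iter_windows` is a hypothetical helper, and the 0.25-second hop and 16 kHz sample rate are illustrative assumptions. Each yielded window could then be passed to whatever classifier is in use.

```python
def iter_windows(samples, sample_rate, window_seconds=1.0, hop_seconds=0.25):
    """Yield (start_time_seconds, window) pairs of overlapping audio windows.

    Hypothetical helper for illustration only; it is not part of MediaPipe.
    `samples` is any sliceable sequence of mono audio samples.
    """
    window = int(round(window_seconds * sample_rate))
    hop = int(round(hop_seconds * sample_rate))
    if window <= 0 or hop <= 0:
        raise ValueError("window_seconds and hop_seconds must be positive")

    # Step through the signal in hop-sized strides; a trailing partial
    # window is dropped rather than padded.
    for start in range(0, len(samples) - window + 1, hop):
        yield start / sample_rate, samples[start:start + window]


# Two seconds of placeholder audio at an assumed 16 kHz sample rate.
sample_rate = 16000
audio = [0.0] * (2 * sample_rate)

for start_time, chunk in iter_windows(audio, sample_rate):
    # Each chunk would be handed to the classifier here.
    print(f"window starting at {start_time:.2f}s, {len(chunk)} samples")
```

With a hop shorter than the window, consecutive windows overlap, so a short sound event is more likely to fall fully inside at least one window. The trade-off is more classifier invocations per second of audio.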
MediaPipe Solution (you are using)
Audio Classifier
Programming language
Python
Are you willing to contribute it
None
Describe the feature and the current behaviour/state
We would like to train a custom audio classifier using Model Maker, but the module appears to support only image and text classifiers. What is the best way to do audio transfer learning with MediaPipe?
Will this change the current API? How?
No response
Who will benefit with this feature?
Users training custom audio classes / sound events
Please specify the use cases for this feature
Detect and classify custom sounds instead of the default YAMNet model's classes
Any Other info
No response