Closed bofenghuang closed 1 year ago
Good idea! Happy to include it directly in the `__call__` method of the `PunctuationRestorer` :-)
I think any of `librosa`, `scipy` (https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.resample.html), or `torchaudio` as an optional dependency, as explained here: https://github.com/huggingface/speechbox/blob/main/CONTRIBUTING.md#philosophy, would make a lot of sense (checking the sampling rate exactly like you're doing in the example above :-))
I would lean slightly towards `scipy`, as it's pretty lightweight.
Would you like to open a PR for it? :-)
Sure. Thanks for the hints :)
Hi @patrickvonplaten 👋,
Thanks for this project!
I'm thinking we should offer optional audio resampling, since `WhisperFeatureExtractor` doesn't resample internally. Below is an updated example. But it might be better to have it inside `PunctuationRestorer` to make it an out-of-the-box solution. What's your opinion? Willing to make a PR if necessary :)
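
For illustration, here is a minimal sketch of the kind of sampling-rate check discussed in this thread. It is not the original example from the issue, and it is not how speechbox implements it. It assumes the audio is a 1-D NumPy array and uses `scipy.signal.resample`, the function linked in the thread. The helper name `maybe_resample` and the constant `TARGET_SAMPLING_RATE` are hypothetical; 16 kHz is the rate Whisper's feature extractor expects.

```python
import numpy as np
from scipy.signal import resample

# Whisper's feature extractor expects 16 kHz audio.
TARGET_SAMPLING_RATE = 16_000


def maybe_resample(
    audio: np.ndarray,
    sampling_rate: int,
    target_rate: int = TARGET_SAMPLING_RATE,
) -> np.ndarray:
    """Hypothetical helper: return `audio` at `target_rate`.

    The input is returned unchanged when the rates already match.
    """
    if sampling_rate == target_rate:
        return audio
    # Number of samples the signal should have at the target rate.
    num_samples = int(round(len(audio) * target_rate / sampling_rate))
    # FFT-based resampling along the (only) time axis.
    return resample(audio, num_samples)
```

Something like `audio = maybe_resample(audio, sampling_rate)` could then run before the audio is passed on, whether in user code or inside `__call__`; where exactly it belongs is the design question raised above.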