abdeladim-s / pywhispercpp

Python bindings for whisper.cpp
https://abdeladim-s.github.io/pywhispercpp/
MIT License

re-add "unnecessary" normalization to fix UTF-8 errors. #50

Closed UsernamesLame closed 2 months ago

UsernamesLame commented 2 months ago

I forgot to remove this line yesterday. I'm not sure why referencing a nonexistent variable samples isn't causing any issues, but based on the transcription testing I've done so far, normalization didn't seem to be required.

abdeladim-s commented 2 months ago

Oh, I didn't notice yesterday, and I didn't run any tests. Next time please run the tests in the test directory before committing!

As for this, I think normalization is important; without it I sometimes get UTF-8 decode problems. Please change the code to arr /= np.iinfo(np.int16).max and commit it so I can merge the fix ASAP.
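For reference, here is a minimal sketch of what that normalization does, assuming the audio arrives as 16-bit PCM samples in a NumPy array. The helper name pcm16_to_float32 and the sample values are made up for illustration and are not taken from the pywhispercpp source. The array is cast to float32 first, because an in-place division on an int16 array raises a casting error in NumPy.

```python
import numpy as np


def pcm16_to_float32(samples: np.ndarray) -> np.ndarray:
    """Scale 16-bit PCM samples to floats in roughly [-1.0, 1.0].

    Hypothetical helper for illustration only; not part of pywhispercpp.
    """
    # Cast to float32 first: `/=` on an int16 array cannot store
    # the floating-point result in place.
    arr = samples.astype(np.float32)
    # The normalization line discussed in this thread.
    arr /= np.iinfo(np.int16).max  # divides by 32767
    return arr


# Made-up sample values covering zero, mid-range, and both extremes.
pcm = np.array([0, 16384, 32767, -32768], dtype=np.int16)
normalized = pcm16_to_float32(pcm)
```

Because the divisor is 32767, the most negative sample (-32768) maps to a value very slightly below -1.0; dividing by 32768 instead would keep everything within [-1.0, 1.0), if that matters.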

Thanks!

UsernamesLame commented 2 months ago

Sorry about that. Running the tests slipped my mind. Thankfully it didn't seem to actually break anything. That's the craziest part.

UsernamesLame commented 2 months ago

@abdeladim-s I re-added the normalization. I haven't had a chance to run the test suite. For now I'm relying on the CI pipeline to catch anything.

I re-added arr /= np.iinfo(np.int16).max

abdeladim-s commented 2 months ago

It's okay, no worries!
I do indeed need to include the tests in CI; for now I am just running them locally :smile:

I'll merge this for now. Thanks a lot!