argmaxinc / WhisperKit

On-device Speech Recognition for Apple Silicon
https://takeargmax.com/blog/whisperkit
MIT License
3.17k stars, 268 forks

Added implementation for SuppressTokensFilter #14

Closed: jkrukowski closed this 7 months ago

jkrukowski commented 7 months ago

If this PR looks good, I can try to implement other filters 🫡
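
For readers unfamiliar with the idea, a suppress-tokens filter can be sketched as below. This is a minimal Python illustration of the general technique (forcing the logits of selected token ids to negative infinity before sampling), not the Swift code from this PR. The function name `suppress_tokens`, the plain-list logits, and the example ids are all assumptions made for illustration.

```python
import math


def suppress_tokens(logits, suppressed_ids):
    """Return a copy of `logits` where the given token ids can never be picked.

    Hypothetical helper: `logits` is a plain list of floats indexed by token id.
    """
    filtered = list(logits)  # copy so the caller's list is left untouched
    for token_id in suppressed_ids:
        # Ignore ids outside the vocabulary instead of raising.
        if 0 <= token_id < len(filtered):
            filtered[token_id] = -math.inf
    return filtered


# Toy vocabulary of four tokens; ids 1 and 3 are suppressed.
logits = [0.1, 2.5, 0.3, 1.7]
filtered = suppress_tokens(logits, suppressed_ids=[1, 3])

# Greedy pick over the filtered logits: only ids 0 and 2 remain eligible.
best = max(range(len(filtered)), key=filtered.__getitem__)
```

Because the suppressed entries become negative infinity, neither greedy selection nor softmax-based sampling can choose them, while the remaining logits keep their relative order.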

ZachNagengast commented 7 months ago

Awesome contribution! Reviewing this shortly.

ZachNagengast commented 7 months ago

Would love to see your take on the other filters if you're up for it! FYI, I'm going to be looking at word-level timestamps next on my side, so we'll definitely need timestamp filtering, i.e. our equivalent of ApplyTimestampRules. Keep in mind that we pulled some of this logic into the SegmentSeeking protocol.

jkrukowski commented 7 months ago

Sure, happy to do it. I'll try to find some time this week or next.

jkrukowski commented 7 months ago

Hi @ZachNagengast, I'm not sure this is the right place to ask, but you mentioned that some of this logic was pulled into the SegmentSeeking protocol. Could you elaborate a bit? (I'm looking into implementing TimestampRulesFilter based on ApplyTimestampRules.)
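
For context, one rule from the reference Whisper `ApplyTimestampRules` filter, the requirement that timestamp tokens appear in pairs, can be sketched roughly as follows. This is a simplified Python illustration of that single rule only. The function name `apply_pairing_rule`, the plain-list logits, and the toy token ids are assumptions, and the sketch says nothing about which parts WhisperKit handles in SegmentSeeking.

```python
import math


def apply_pairing_rule(logits, sampled_tokens, timestamp_begin, eot):
    """Mask logits so that timestamp tokens can only appear in pairs.

    Hypothetical helper. Token ids below `eot` are ordinary text tokens and
    ids at or above `timestamp_begin` are timestamp tokens.
    """
    filtered = list(logits)
    last_was_ts = len(sampled_tokens) >= 1 and sampled_tokens[-1] >= timestamp_begin
    # With fewer than two sampled tokens, treat the penultimate one as a timestamp.
    penultimate_was_ts = len(sampled_tokens) < 2 or sampled_tokens[-2] >= timestamp_begin

    if last_was_ts:
        if penultimate_was_ts:
            # Two timestamps in a row (or a lone first timestamp):
            # the next token must not be a timestamp.
            for i in range(timestamp_begin, len(filtered)):
                filtered[i] = -math.inf
        else:
            # Text followed by a single timestamp:
            # the next token cannot be ordinary text.
            for i in range(eot):
                filtered[i] = -math.inf
    return filtered


# Toy vocabulary: ids 0-2 are text, 3 is end-of-text, 4-6 are timestamps.
logits = [1.0] * 7
after_open_ts = apply_pairing_rule(logits, [4, 0, 5], timestamp_begin=4, eot=3)
after_closed_pair = apply_pairing_rule(logits, [4, 0, 5, 5], timestamp_begin=4, eot=3)
```

In the first call the sequence ends with text followed by one timestamp, so the text ids are masked. In the second call the sequence ends with two timestamps, so the timestamp ids are masked instead.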