daltonfury42 / truecase

A Python truecasing utility that restores case information for text
Apache License 2.0

Lambda function for out_of_vocabulary_token_option #24

Open keshprad opened 3 years ago

keshprad commented 3 years ago

This is a solution for the lambda function feature request in #23.

  1. Created out_of_vocabulary_handler
    • Added the ability to handle a lambda function
      • The lambda function receives token_og_case, the token with its original casing
    • I realized that the current release implicitly uses .lower() if an invalid option for out_of_vocabulary_token_option is passed (e.g. out_of_vocabulary_token_option = 'upper')
      • This is due to L128.
      • I'm now passing the token with its original casing to the handler function. If an invalid option is passed, the handler uses .title()
  2. Added a test case for the lambda function.
  3. Updated the README for the out_of_vocab options.
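
The handler behavior described in item 1 can be sketched roughly as follows. This is a minimal standalone illustration, not the code in this PR: the function body, the default value, and the assumption that the string options are 'title', 'lower' and 'as-is' are mine. Only the names out_of_vocabulary_handler, token_og_case and out_of_vocabulary_token_option, and the fallback to .title() for invalid options, come from the description above.

```python
def out_of_vocabulary_handler(token_og_case, out_of_vocabulary_token_option="title"):
    """Hypothetical sketch of an out-of-vocabulary token handler.

    token_og_case: the token with its original casing.
    out_of_vocabulary_token_option: a callable (e.g. a lambda) or a string
    option. The string options 'title', 'lower' and 'as-is' are assumed here.
    """
    # A callable gets the token with its original casing and decides the result.
    if callable(out_of_vocabulary_token_option):
        return out_of_vocabulary_token_option(token_og_case)
    if out_of_vocabulary_token_option == "lower":
        return token_og_case.lower()
    if out_of_vocabulary_token_option == "as-is":
        return token_og_case
    # 'title' and any unrecognized option (e.g. 'upper') fall back to .title().
    return token_og_case.title()


# Usage: the same token under different options.
with_lambda = out_of_vocabulary_handler("mcDonald", lambda tok: tok.upper())
with_lower = out_of_vocabulary_handler("mcDonald", "lower")
with_as_is = out_of_vocabulary_handler("mcDonald", "as-is")
with_invalid = out_of_vocabulary_handler("mcDonald", "upper")
```

Because the handler receives the original casing, a lambda can implement options the library does not ship, such as upper-casing, while an invalid string option produces title case instead of silently lower-casing.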

keshprad commented 3 years ago

This PR isn't ready to merge yet. I'm unsure how to address this, but it should be a quick fix after we discuss.