MycroftAI / lingua-franca

Mycroft's multilingual text parsing and formatting library
Apache License 2.0
73 stars 77 forks source link

Fix Farsi anchor date and normalizer #219

Closed krisgesling closed 2 years ago

krisgesling commented 2 years ago

Description

  1. Fixes the determination of current datetime by using the now_local function rather than datetime.now that relies on the system clock.
  2. Remove the Farsi Normalizer - it has not yet been implemented. The Normalizer had been copy-pasted from English. There does not seem to be a need for a Farsi Normalizer as yet.
  3. Clean up of imports and formatting

Extension from commit to Chatterboxes Lingua Nostra: https://github.com/HelloChatterbox/lingua-nostra/pull/32/commits/45a567c6b070f79477522febd97328daa8f98d1c

Type of PR

Testing

Have your system clock set to a different location than your Mycroft-core instance and use Farsi.

CLA

krisgesling commented 2 years ago

Hey @HKalbasi - wondering if you could take a look at this proposed bugfix?

HKalbasi commented 2 years ago

Sure.

The changes looks good, but I don't know where the normalizer is called exactly and what should be tested.

krisgesling commented 2 years ago

It's one of the localized functions so is executed when normalize() gets called and Farsi is the active language. A bit more on that here: https://github.com/MycroftAI/lingua-franca/#calling-localized-functions and here: https://github.com/MycroftAI/lingua-franca/blob/master/project-structure.md#localizing-functions

Users of the Lingua Franca library can use it to clean up an input string based upon common rules. In Mycroft-core we normalize utterances. For example if someone asks either "what's the time" or "what is the time", they will be normalized into the extended version "what is the time" and intent parsers only have to handle that extended form.

In it's simplest format the Normalizer will follow the rules outlined in: lingua_franca/res/text/fa-ir/normalize.json though looking at this file it is currently just a copy of the English json file. So this change won't do anything actually...

If there's not a localized Normalizer available I'm going to suggest we remove it for now.

HKalbasi commented 2 years ago

If there's not a localized Normalizer available I'm going to suggest we remove it for now.

I agree. There is nothing like I'd in Farsi. And there isn't article. We can add numbers but I think it is unnecessary?

krisgesling commented 2 years ago

Yeah agreed. Going to merge this in then :)

krisgesling commented 2 years ago

@chrisveilleux just looking for a rubber stamp on this one :)