MycroftAI / lingua-franca

Mycroft's multilingual text parsing and formatting library
Apache License 2.0

Move runtime config (default language, localization settings, etc) to a stateful object #128

Open ChanceNCounter opened 4 years ago

ChanceNCounter commented 4 years ago

Hear me out 😛

The module is (or will soon be) thoroughly respectful of a given language's default rules (for instance, short vs. long scale numbers). We already have some support for full localization (en_US, en_GB, etc.), and will obviously extend it.

But what if the user only wants some of their language's "default" rules? It's easy to look at this as an implementation detail, but the values in question are effectively default parameters to our functions. If we read this information onto a stateful object, the implementation (such as Mycroft) could pick and choose overrides on the fly from end-user config files.

For instance, Italian uses the long scale, but it's entirely possible that an Italian speaker will prefer the short scale, especially if they're working bilingually. They want to override just the one default parameter. Should it really be the implementation's problem to repeatedly and explicitly pass that information to each parser and formatter? It's particularly problematic for skill authors, who are invoking LF functions by lang code, and relying on Mycroft to know the details, which then relies on LF.

krisgesling commented 4 years ago

Short scale is only for insecure rich people who need to feel even richer than they already are. It should be abolished! **Ducks from all the Americans**

But back to the real issue. Would an example be en_AU where technically the long scale is officially correct, but many people use the short scale because of TV (or something)?

Would this need to be changed on the fly, or only when a language is loaded? Presumably this would override the normal default for each method, but if a value is intentionally passed in a method call, then the Skill (or code) that called it can override the user-defined config? E.g., with later entries overriding earlier ones:

  1. lang default
  2. user config
  3. method call?
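
The three-level precedence above can be sketched as a layered lookup. This is only an illustration, not Lingua Franca's API: `LANG_DEFAULTS`, `USER_CONFIG`, `resolve_settings`, and the values in them are hypothetical names and placeholder data.

```python
from collections import ChainMap

# Hypothetical built-in defaults per language (placeholder values only).
LANG_DEFAULTS = {
    "en_AU": {"number_scale": "long"},
    "it": {"number_scale": "long"},
}

# Hypothetical user overrides, e.g. loaded from an end-user config file.
USER_CONFIG = {"en_AU": {"number_scale": "short"}}


def resolve_settings(lang, **call_overrides):
    """Merge settings so that later layers win: call > user > language."""
    # Drop keyword arguments left at None so they do not mask lower layers.
    explicit = {k: v for k, v in call_overrides.items() if v is not None}
    # ChainMap searches its maps left to right and returns the first hit.
    return ChainMap(
        explicit,
        USER_CONFIG.get(lang, {}),
        LANG_DEFAULTS.get(lang, {}),
    )


# The user config overrides the language default...
print(resolve_settings("en_AU")["number_scale"])
# ...and an explicit argument in the call overrides the user config.
print(resolve_settings("en_AU", number_scale="long")["number_scale"])
```

With this ordering, a Skill that passes nothing gets the user's preference, and a Skill that passes a value explicitly always gets what it asked for.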

ChanceNCounter commented 4 years ago

en_AU and the number scale is a perfect example!

I think the goal is to make it changeable at runtime, but we don't need to get fancy with the data structures. Just provide some getters and setters so that the implementation can fix things up from its config files.

lingua_franca.config.set_number_scale('en_AU', 'short')

In practice, I figure (for example) that Mycroft would call those functions when it imports LF, shortly after loading languages. Some kind of for loop over the relevant config values.

I saw another FOSS assistant project, with some familiar faces ❤️, eyeing Lingua Franca. I don't know if they actually adopted us, but, if so, their config files will surely look different from Mycroft's; as long as we stick to getters and setters, that won't matter to LF!

ChanceNCounter commented 4 years ago

And, probably goes without saying, if Mycroft has an intent to fine-tune settings like that, it can just call the same get/setters. After the refactor, depending how we implement the logic, this might, at most, require a quick LF refresh. Probably not.

My thinking throughout all of this is that, no matter how bloated the resource files become, LF should at worst become part of the same progress bar that loads and unloads language modules in a complex project. Switching your Name Brand voice assistant to another language takes up to a few minutes, between the voice models, hotwords, offline vocab, and other stuff that has to load. I'd like implementations to be able to stick us right in that loop without thinking about it.

JarbasAl commented 4 years ago

I think this is a great idea.

ChanceNCounter commented 4 years ago

On reflection, probably lingua_franca.config.set('en_AU', 'number_scale', 'short'), or else something dict-based, so that it can be read from JSON or YAML in a sane fashion. It might be burdensome to have a function per config value.

krisgesling commented 4 years ago

Yeah, good plan. Personally I'd opt for the dict-based version:

lingua_franca.config.set('en_AU', {'number_scale': 'short'})

Then any number of attributes can be passed in. I also like that the relationship between key and value is explicit rather than implicit.
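
A minimal sketch of what a dict-based setter might look like, assuming a plain module-level store. Nothing here exists in Lingua Franca: `set_config` and `get_config` are hypothetical names (chosen to avoid shadowing Python's built-in `set`), and the JSON is an invented example of an implementation's config file.

```python
import json

# Hypothetical store: {lang_code: {setting_name: value}}
_settings = {}


def set_config(lang, values):
    """Merge a dict of settings into the overrides for one language."""
    _settings.setdefault(lang, {}).update(values)


def get_config(lang, key, default=None):
    """Return the override for `key`, or `default` if none was set."""
    return _settings.get(lang, {}).get(key, default)


# An implementation could apply its own config file in a single loop,
# shortly after loading languages.
raw = '{"en_AU": {"number_scale": "short"}, "it": {"number_scale": "short"}}'
for lang, values in json.loads(raw).items():
    set_config(lang, values)

print(get_config("en_AU", "number_scale"))
```

Because the setter takes a mapping, the same call works whether the values come from JSON, YAML, or an intent handler changing one setting at runtime.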

ChanceNCounter commented 4 years ago

Discovered: Mycroft is currently wrapping extract_datetime() to pass the user's configured time zone. That's one setting that ought to be exposed so it can be set here.

I think one good metric here might be replacing downstream wrappers with downstream .set() calls.
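
To illustrate that metric, here is a rough before-and-after sketch. It does not use the real `extract_datetime()`; `anchor_time`, `set_timezone`, and `_config` are hypothetical stand-ins for a parser that needs the user's time zone.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical library-level setting; UTC is an arbitrary placeholder default.
_config = {"timezone": timezone.utc}


def set_timezone(tz):
    """Set the default time zone once, instead of passing it on every call."""
    _config["timezone"] = tz


def anchor_time(tz=None):
    """Stand-in for a parser's notion of 'now'; an explicit tz still wins."""
    effective = tz if tz is not None else _config["timezone"]
    return datetime.now(effective)


user_tz = timezone(timedelta(hours=10))

# Before: a downstream wrapper injects the time zone on every call.
def wrapped_anchor_time():
    return anchor_time(tz=user_tz)

# After: one call at startup replaces the wrapper.
set_timezone(user_tz)
print(anchor_time().utcoffset())
```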

ChanceNCounter commented 3 years ago

Recently identified candidates for config settings:

ChanceNCounter commented 3 years ago

Consider a decorator enabling e.g. skill authors to ensure certain defaults for snippets of code, then return LF settings to their original state.

That is, decorate a function to specify decimal numbers, and LF is reconfigured to use decimal numbers by default for the duration of the decorated function. Use a context manager (or a similar saved context) inside the decorator to ensure that LF settings go back to their initial state afterward. This way, skill authors will be able to fine-tune output even when LF has been abstracted away.