Closed maennchen closed 1 year ago
@maennchen how do other Gettext implementations (for other languages) tackle MO files?
@whatyouhide Most (C-based) implementations I’ve used so far only read the .mo for runtime translations. The .po / .pot is only used for extraction / to help with merge problems.
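As an illustration of why runtime loaders can work from the .mo alone, here is a minimal sketch in Python (not Elixir, and not the API of Expo or Gettext). It hand-assembles a tiny catalog following the GNU MO layout and reads it back with Python's standard `gettext` module. The helper name `build_mo` and the sample strings are made up for this example.

```python
import gettext
import io
import struct


def build_mo(messages: dict) -> bytes:
    """Assemble a little-endian GNU MO catalog from msgid -> msgstr pairs (no hash table)."""
    # The entry with the empty msgid carries the catalog metadata (charset etc.).
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n", **messages}
    # MO files keep msgids sorted so that readers may binary-search them.
    pairs = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in entries.items())

    count = len(pairs)
    ids_table = 7 * 4                     # the header is seven 32-bit fields
    strs_table = ids_table + count * 8    # one (length, offset) pair per msgid
    data_start = strs_table + count * 8   # string data follows both tables

    ids_index, strs_index, blob = [], [], b""
    for msgid, _ in pairs:
        ids_index.append((len(msgid), data_start + len(blob)))
        blob += msgid + b"\0"
    for _, msgstr in pairs:
        strs_index.append((len(msgstr), data_start + len(blob)))
        blob += msgstr + b"\0"

    # magic, format revision, string count, msgid table, msgstr table, hash size, hash offset
    header = struct.pack("<7I", 0x950412DE, 0, count, ids_table, strs_table, 0, 0)
    tables = b"".join(struct.pack("<2I", *entry) for entry in ids_index + strs_index)
    return header + tables + blob


# Reading needs no PO parser: the loader only follows offsets into the buffer.
catalog = gettext.GNUTranslations(io.BytesIO(build_mo({"Hello": "Hallo"})))
print(catalog.gettext("Hello"))
```

A reader only has to unpack fixed-width integers and slice byte ranges, which is the property that makes the format cheap to load compared with parsing PO text.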
@maennchen can we close this now that Expo supports MO files?
@whatyouhide That was the next issue I wanted to tackle:
Support .mo in Gettext itself. I think (we have to benchmark it) it makes compilation faster.
@maennchen got it, makes sense. This would require a slightly different workflow for Gettext entirely, right? We'd have to dump POs and MOs, and read MOs if present, falling back to PO? Do you have an exact workflow in mind? I ask because I have some cycles I can dedicate to Gettext 😉
@whatyouhide I wanted to make the file handling strategy configurable (at least at the start to prevent breaking changes)
Do you envision MOs being committed in version control? Is this the flow used by GNU Gettext, if you know?
@whatyouhide I intend to commit them.
Gettext itself has no opinion about .mo files in VCS, as far as I’m aware.
I know from the PHP ecosystem that in most cases .mo files are committed. I have also come across the opinion that they should not be committed and should only be generated on demand / for releases.
Speaking for myself: I would commit them and would not be concerned about conflicts in .mo files since you can always regenerate them from merged .po files.
Because there seem to be different opinions about this, I wanted to implement it as a configurable strategy so that people can decide how they want to handle it.
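A configurable strategy along these lines could look roughly like the following sketch. It is written in Python for illustration only; the option names `PO_ONLY` and `PREFER_MO` and the function `catalog_to_load` are hypothetical, since the thread does not settle on names or on an actual Gettext configuration key.

```python
from enum import Enum
from pathlib import Path


class FileStrategy(Enum):
    """Hypothetical option values; not an existing Gettext setting."""
    PO_ONLY = "po_only"
    PREFER_MO = "prefer_mo"


def catalog_to_load(po_path: Path, strategy: FileStrategy) -> Path:
    """Return the file to load for one locale/domain under the given strategy."""
    mo_path = po_path.with_suffix(".mo")
    if strategy is FileStrategy.PREFER_MO and mo_path.exists():
        return mo_path
    # PO_ONLY, or no compiled file present: fall back to the PO source.
    return po_path
```

Defaulting to the PO-only behaviour would keep existing projects unchanged, while projects that commit or generate .mo files could opt in.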
(I closed this by accident, sorry about that!)
My guess would be that these files should not be committed, as essentially they're a duplicated "cache" of PO files anyways. I’m ok with configuration, but I'd like to keep simplicity as much as possible. For example, before diving into this, I'd ask: does Gettext compilation take significant time today? Are we sure introducing MO files, which increases complexity, is worth it?
@whatyouhide
In a bigger application like https://github.com/jshmrtn/hygeia, the parsing of the .po file takes around 0.2s per language on my machine. If the performance comparison of https://github.com/elixir-gettext/expo/issues/21 is still more or less accurate, potentially around 75% of the time could be saved. (~ 0.8s)
I think the generating of the functions inside the backend takes longer though compared to the actual parsing. So maybe having a look at that performance would make a bigger difference.
An even bigger impact is the compile time dependency of all the modules using the gettext backend. Changing one translation currently means recompiling most of the applications it is used in.
I think committing .mo files is ok. Most people are also committing .pot files even though they're technically just cached extractions. Depending on the project, where to draw the line on how much we want to "cache" can differ. In bigger applications I might want to make the trade-off, and in a quick demo project not.
I also don't seem to be alone with this opinion. There are currently over 132 million checked-in .mo files on GitHub: https://github.com/search?l=&q=extension%3Amo&type=code
> An even bigger impact is the compile time dependency of all the modules using the gettext backend. Changing one translation currently means recompiling most of the applications it is used in.
Yeah, especially because changing a translation does not change the code generated at compile-time.
Considering the added complexity of supporting MO files, I'd definitely shift our focus on the compile-time dependencies and function generation, yeah.
Discussion moved to #330
Great, thanks @maennchen. I will close this for now then, and we can reopen in case this comes up again. Thanks! 💟
Expo supports parsing / writing .mo files, which are a lot faster to read since it is a simple binary format. I would like to support it here as well.
Proposed changes:
- […] .po files or if they want to use the .mo.
- […] the .mo file is also written.
- […] the .mo file is loaded instead of the .po.
- If the .po file is newer than the .mo file, I would log a warning for the user to update the .mo since they likely edited the .po by hand and forgot to call merge again.
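The staleness warning in the last point could be sketched as below. This is Python for illustration, not Gettext code; the function name `warn_if_stale` and the logger name are hypothetical, and comparing modification times is an assumption about how "newer" would be determined.

```python
import logging
import os
from pathlib import Path

logger = logging.getLogger("mo_staleness_sketch")  # hypothetical logger name


def warn_if_stale(po_path: Path) -> bool:
    """Warn and return True when the .po was modified after its sibling .mo."""
    mo_path = po_path.with_suffix(".mo")
    if not mo_path.exists():
        # Nothing compiled yet, so there is nothing to be out of date.
        return False
    stale = po_path.stat().st_mtime > mo_path.stat().st_mtime
    if stale:
        logger.warning(
            "%s is newer than %s; regenerate the .mo (for example by running merge again)",
            po_path,
            mo_path,
        )
    return stale
```

One caveat with this approach: Git does not preserve modification times on checkout, so if .mo files are committed, a timestamp comparison may misfire and a content-based check might be more robust.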