Open sarayourfriend opened 1 week ago
@WordPress/openverse-frontend what do y'all think of this proposed change?
I admittedly don't have a strong opinion about this change. I use a VS Code extension called i18n Ally that allows for many features, but the key ones are:
I did check and this plugin works fine with the flattened keys :+1:
That looks like a nice extension! I suppose for core maintainers using VS Code, that extension solves the issue of searching back and forth. Whether something similar exists for JetBrains IDEs (which I understand to be the other editor some core maintainers use), I don't know. It doesn't solve the issue for new contributors, though, even those who use VS Code: they are less likely to have an extension like that installed, and some people would rather not install yet another extension just for finding things (especially one that may only be relevant to Openverse).
This is a great idea to help frontend contributors. It will also simplify the code to convert json to pot and vice versa.
I tried searching for problems that could be caused by having a period in a JSON key, and the only thing I could find is if we have both the "a.b" key (`{"a.b": value1 }`), and a nested object like `{a: { b: value2 } }`[^1]. So, it's important to only have the non-nested properties.
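To make the collision described above concrete, here is a minimal sketch of a check that reports dotted paths reachable both as a literal flat key and through nesting. The names `Messages` and `findAmbiguousPaths` are hypothetical and not part of Openverse or vue-i18n; the sketch only covers the case where two leaf messages end up at the same path.

```typescript
// Hypothetical shape for a messages object: strings at the leaves,
// optionally grouped under nested objects.
type Messages = { [key: string]: string | Messages };

// Counts how many leaf messages map to each dotted path and returns
// the paths that are defined more than once.
function findAmbiguousPaths(messages: Messages): string[] {
  const counts = new Map<string, number>();

  const walk = (node: Messages, prefix: string): void => {
    for (const [key, value] of Object.entries(node)) {
      const path = prefix === "" ? key : `${prefix}.${key}`;
      if (typeof value === "string") {
        counts.set(path, (counts.get(path) ?? 0) + 1);
      } else {
        walk(value, path);
      }
    }
  };

  walk(messages, "");
  return Array.from(counts.entries())
    .filter(([, count]) => count > 1)
    .map(([path]) => path);
}

// Mixing both styles for the same path is ambiguous.
const mixed: Messages = { "a.b": "value1", a: { b: "value2" } };
// Using only flat keys is not.
const flatOnly: Messages = { "a.b": "value1", "a.c": "value3" };

console.log(findAmbiguousPaths(mixed));
console.log(findAmbiguousPaths(flatOnly));
```

A check along these lines could run in CI or a lint step to make sure the messages file never mixes the two styles for the same key.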
Problem
Our current vue/i18n messages files utilise nested JSON objects to discriminate keys. A sample from our `en.json5` looks like this:

Code that references these keys does so in a dot-delimited path format. For example, `hero.disclaimer.content` references the key at that path in the nested object.

While nested objects may provide a slight advantage to authorship of the messages file, they present a severe disadvantage when trying to find the message from a key in the code referencing it. Using the example above, if you wanted to see the message associated with the key `hero.disclaimer.content`, and tried to search the codebase for that literal string, you would not find it. Instead, you would have to know to navigate to `en.json5`, and then find the `hero` key, find its `disclaimer` key, and then finally the `content` key. Sometimes you can shortcut this by searching for just the final part of the key, in this case `content`. However, we have 31 keys that end with the segment `.content`, so searching `content:` in the file would still require looking through individual instances to find it. The example above also uses relatively small and shallow nesting and does not illustrate the additional difficulty of navigating deeply nested keys in large collections of messages, like `sensitive.designations.userReported.title.description.a`.

Description
Instead, I propose we "flatten" the messages objects so that each key is the full path to its message. In other words, remove all nesting.
For example, the messages excerpt above would turn into this, instead:
This format is backwards compatible with our existing messages objects. If you replace the 404 and hero objects with the flattened version above and run your local frontend, everything works without issue.
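One way to picture why both shapes can serve the same dotted key is a lookup that first tries the full key as a literal property and then falls back to walking nested segments. This is only an illustrative sketch: `Messages` and `resolveMessage` are hypothetical names, and vue-i18n's actual resolution logic may differ.

```typescript
// Hypothetical shape for a messages object.
type Messages = { [key: string]: string | Messages };

// Resolves a dotted key against either a flat or a nested messages object.
function resolveMessage(messages: Messages, key: string): string | undefined {
  // Flat format: the whole dotted key is a literal property name.
  const direct = messages[key];
  if (typeof direct === "string") {
    return direct;
  }

  // Nested format: walk one path segment at a time.
  let node: string | Messages | undefined = messages;
  for (const segment of key.split(".")) {
    if (node === undefined || typeof node === "string") {
      return undefined;
    }
    node = node[segment];
  }
  return typeof node === "string" ? node : undefined;
}

const nestedMessages: Messages = {
  hero: { disclaimer: { license: "Creative Commons license" } },
};
const flatMessages: Messages = {
  "hero.disclaimer.license": "Creative Commons license",
};

// The calling code uses the same key in both cases.
console.log(resolveMessage(nestedMessages, "hero.disclaimer.license"));
console.log(resolveMessage(flatMessages, "hero.disclaimer.license"));
```

With the flat format, the string passed by the calling code is exactly the string that appears in the messages file, which is what makes a plain text search work in both directions.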
The benefit of this approach is that it is easily searchable in both directions. From the messages file, it is easier to find uses of the keys. From runtime code, it is easier to find the content of the translation string. Additionally, our `json-to-pot` script could be simplified, as right now it has to collapse keys when converting to POT, because POT is a flat format.

Our POT-to-JSON conversion re-explodes the keys into the nested object. We should retain this behaviour in the final output messages files. A meaningful downside to the flat format is an increase in the total character size of the keys, due to the repeated strings.
The nested example minifies to 509 characters:

```json
{"404":{"title":"The content you’re looking for seems to have disappeared.","main":"Go to {link} or search for something similar from the field below."},"hero":{"subtitle":"Explore more than 800 million creative works","description":"An extensive library of free stock photos, images, and audio, available for free use.","search":{"placeholder":"Search for content",},"disclaimer":{"content":"All {openverse} content is under a {license} or is in the public domain.","license":"Creative Commons license"}}}
```

The flattened example minifies to 527 characters:

```json
{"404.title":"The content you’re looking for seems to have disappeared.","404.main":"Go to {link} or search for something similar from the field below.","hero.subtitle":"Explore more than 800 million creative works","hero.description":"An extensive library of free stock photos, images, and audio, available for free use.","hero.search.placeholder":"Search for content","hero.disclaimer.content":"All {openverse} content is under a {license} or is in the public domain.","hero.disclaimer.license":"Creative Commons license"}
```

Because this change is proposed to improve authorship (not transport or anything else relevant to the final generated files), and because it is backwards compatible with the nested format, we should retain the nested format for the produced JSON files.
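Producing nested output from flat authored keys amounts to re-exploding each dotted key into its segments. The following is a minimal sketch of that step, not the existing POT-to-JSON code; `Messages` and `unflatten` are hypothetical names, and the sketch assumes no message key contains a literal period within a segment.

```typescript
// Hypothetical shape for a nested messages object.
type Messages = { [key: string]: string | Messages };

// Expands flat dotted keys into a nested object.
// Throws if one path is used both as a message and as a group.
function unflatten(flat: { [key: string]: string }): Messages {
  const root: Messages = {};

  for (const [key, value] of Object.entries(flat)) {
    const segments = key.split(".");
    // split() always returns at least one element, so pop() is defined.
    const last = segments.pop() as string;

    let node = root;
    for (const segment of segments) {
      const existing: string | Messages | undefined = node[segment];
      if (typeof existing === "string") {
        throw new Error(`"${segment}" in "${key}" is both a message and a group`);
      }
      const child: Messages = existing === undefined ? {} : existing;
      node[segment] = child;
      node = child;
    }

    if (node[last] !== undefined) {
      throw new Error(`"${key}" is defined more than once`);
    }
    node[last] = value;
  }

  return root;
}

const nestedOutput = unflatten({
  "hero.subtitle": "Explore more than 800 million creative works",
  "hero.search.placeholder": "Search for content",
});
console.log(JSON.stringify(nestedOutput));
```

Because the expansion is mechanical, authors can work with the flat file while the generated files keep the more compact nested shape.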
To implement this change, we will need to flatten the keys in `en.json5`. This can be done by hand, or using something like this online JSON flattening tool. However, that tool and others like it strip comments and make other unwanted transformations, so the result would still require manual changes. Because `json-to-pot` already has to flatten the keys for use in the POT files, it shouldn't be too hard to adapt our existing `json-to-pot` script into a `json-to-flattened-json` script that preserves comments, single-quoted strings (which we use to avoid needing to escape double quotes in some strings), etc. I would recommend that approach; it seems like the least tedious option to me!
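The core key-collapsing step can be sketched as a small recursive function. This is not the existing `json-to-pot` code; `Messages` and `flatten` are hypothetical names. Note that the sketch works on an already-parsed object, so it does not preserve comments or quote style. A real `json-to-flattened-json` script would need to operate on the source text, or on a parser that retains comments, to meet the requirements above.

```typescript
// Hypothetical shape for a nested messages object.
type Messages = { [key: string]: string | Messages };

// Collapses nested groups into flat, dot-delimited keys.
function flatten(messages: Messages, prefix = ""): { [key: string]: string } {
  const out: { [key: string]: string } = {};

  for (const [key, value] of Object.entries(messages)) {
    const path = prefix === "" ? key : `${prefix}.${key}`;
    if (typeof value === "string") {
      out[path] = value;
    } else {
      // Recurse into the group, carrying the path so far as the prefix.
      Object.assign(out, flatten(value, path));
    }
  }

  return out;
}

const flattened = flatten({
  "404": { title: "The content you’re looking for seems to have disappeared." },
  hero: {
    search: { placeholder: "Search for content" },
    disclaimer: { license: "Creative Commons license" },
  },
});
console.log(Object.keys(flattened));
```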