taoensso / tempura

Simple text localization library for Clojure/Script
https://www.taoensso.com/tempura
Eclipse Public License 1.0

Finding missing translations #36

Closed NoahTheDuke closed 2 months ago

NoahTheDuke commented 3 months ago

Problem Statement

Translation dicts can get out of sync, where each language dictionary has an overlapping but not consistent set of keys. This can lead to annoying inconsistencies when updating translation files.

Question

Given:

(def translation-dictionary
  {
   :en
   {:missing ":en missing text"
    :side
    {:corp "Corp"
     :runner "Runner"
     ; :any-side "Any Side"
     :all "All"}}

   :fr
   {:missing ":fr texte manquant"
    :side
    {:corp "Corpo"
     :runner "Runner"
     :any-side "N'importe"
     ; :all "Tous"
     }}})

I'd like to be able to see that [:en :side.any-side] doesn't exist and [:fr :side.all] doesn't exist. This would allow me to get translations fixed faster than waiting on bug reports.

Thanks so much!

ptaoussanis commented 3 months ago

@NoahTheDuke Hi Noah!

Just to confirm my understanding, you're saying that you'd like a little util that:

  1. Identifies the full set of keys present in any dictionary
  2. Identifies which languages may be missing any of those keys

I.e. for your example the util would return:

{:en {:side {:any-side nil}},
 :fr {:side {:all nil}}}

Is that right?
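
For illustration, the two steps above could be sketched in plain Clojure roughly as follows. This is a minimal sketch, not something Tempura ships: `key-paths`, `missing-keys`, and `example-dict` are hypothetical names, and every non-map value is assumed to be a translation leaf.

```clojure
(require '[clojure.set :as set])

(defn key-paths
  "Returns the set of key paths to every non-map leaf in nested map `m`.
  Hypothetical helper, not part of Tempura."
  [m]
  (letfn [(walk [prefix node]
            (if (map? node)
              (mapcat (fn [[k v]] (walk (conj prefix k) v)) node)
              [prefix]))]
    (set (walk [] m))))

(defn missing-keys
  "For each locale in `dict`, returns a nested map of the key paths that
  exist in at least one locale but not in this one. Locales with nothing
  missing are omitted. Hypothetical helper, not part of Tempura."
  [dict]
  (let [paths-by-locale (into {} (map (fn [[locale m]] [locale (key-paths m)])) dict)
        ;; Step 1: the full set of key paths present in any dictionary.
        all-paths       (apply set/union (vals paths-by-locale))]
    ;; Step 2: per locale, record every path it lacks with a nil value.
    (reduce-kv
      (fn [acc locale paths]
        (reduce (fn [acc path] (assoc-in acc (into [locale] path) nil))
                acc
                (set/difference all-paths paths)))
      {}
      paths-by-locale)))

;; Example data mirroring the dictionary from the question.
(def example-dict
  {:en {:missing ":en missing text"
        :side {:corp "Corp" :runner "Runner" :all "All"}}
   :fr {:missing ":fr texte manquant"
        :side {:corp "Corpo" :runner "Runner" :any-side "N'importe"}}})

(missing-keys example-dict)
;; => {:en {:side {:any-side nil}}, :fr {:side {:all nil}}}
```

Because the comparison is against the union of all locales, no single language is treated as authoritative here.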

If so, that sounds reasonable. I'll get one in next time I'm on batched Tempura work.

In the meantime, is this something you feel comfortable writing yourself - or would you like a hand?

NoahTheDuke commented 3 months ago

I could hack something together for sure, I was more double checking my read of the code that such a utility doesn't exist. It's very funny/annoying to spend time writing something only to find out it already exists and is better lmao.

For my use-case, the :en is our "single source of truth" so being able to see the missing keys of other languages would allow me to periodically check in with the affected communities and say "here's what's missing".

I can give it a swing, there's no rush. We've lasted roughly 6 years without this. 😉

ptaoussanis commented 3 months ago

> It's very funny/annoying to spend time writing something only to find out it already exists and is better lmao.

Indeed :-) 👍

> so being able to see the missing keys of other languages would allow me to periodically check in with the affected communities and say "here's what's missing".

Makes sense 👍

> I can give it a swing, there's no rush.

My first inclination would be to reach for encore/node-paths (undocumented) since that'll help deal with the nesting.

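(require '[taoensso.encore :refer [node-paths]])

;; node-paths returns each path with its leaf value at the end,
;; so butlast drops the value and keeps only the keys.
(into #{} (map butlast)
  (node-paths
    (:en ; Source of truth
     {:en
      {:missing ":en missing text"
       :side
       {:corp "Corp"
        :runner "Runner"
        :all "All"}}})))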

;; => #{(:missing) (:side :runner) (:side :corp) (:side :all)}

i.e. that gives you key paths for everything in :en, then you could do the same for your non-English dictionaries and see what's missing with set/difference, etc.
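
A rough sketch of that set/difference step, with :en as the source of truth, might look like the following. To keep the block dependency-free it uses a hand-rolled `key-paths` as a stand-in for `(map butlast (node-paths m))`; that helper, along with `dotted`, `missing-vs-source`, and `example-dict`, is a hypothetical name and not part of Tempura or encore. It also assumes unqualified keyword keys and that any non-map value is a leaf.

```clojure
(require '[clojure.set :as set]
         '[clojure.string :as str])

(defn key-paths
  "Set of key paths to each non-map leaf of nested map `m`.
  Hypothetical stand-in for `(map butlast (node-paths m))`."
  [m]
  (letfn [(walk [prefix node]
            (if (map? node)
              (mapcat (fn [[k v]] (walk (conj prefix k) v)) node)
              [prefix]))]
    (set (walk [] m))))

(defn dotted
  "Joins a key path such as [:side :all] into a keyword like :side.all."
  [path]
  (keyword (str/join "." (map name path))))

(defn missing-vs-source
  "Maps each non-source locale to a sorted vector of the dotted keys that
  the `source` locale has but that locale lacks. Locales missing nothing
  are omitted."
  [dict source]
  (let [source-paths (key-paths (get dict source))]
    (into {}
          (keep (fn [[locale m]]
                  (let [missing (set/difference source-paths (key-paths m))]
                    (when (seq missing)
                      [locale (vec (sort (map dotted missing)))]))))
          (dissoc dict source))))

;; Example data mirroring the dictionary from the question.
(def example-dict
  {:en {:missing ":en missing text"
        :side {:corp "Corp" :runner "Runner" :all "All"}}
   :fr {:missing ":fr texte manquant"
        :side {:corp "Corpo" :runner "Runner" :any-side "N'importe"}}})

(missing-vs-source example-dict :en)
;; => {:fr [:side.all]}
```

Keys that exist only in a non-source locale (here :side.any-side in :fr) are not reported by this direction of the difference; swapping the source locale, as in `(missing-vs-source example-dict :fr)`, would surface them.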

Feel free to ping if you run into any trouble!

NoahTheDuke commented 3 months ago

Oh that's clever! I'll check that out, thank you.

NoahTheDuke commented 2 months ago

It worked out perfectly, as seen in the linked PR. Thanks for the help!

ptaoussanis commented 2 months ago

@NoahTheDuke Happy to hear it, thanks for the confirmation!

And super cool that you're involved with a Netrunner project :-) It looks awesome!