start-again / spookyBot

🎃 A spooky Discord bot
MIT License

Decouple emojis from individual language files to avoid merging overhead #29

Open Mr1BitL8r opened 3 years ago

Mr1BitL8r commented 3 years ago

Feature Request

Is your feature request related to a problem? Please describe.

One thing that might become a problem in combination with #10 and #12 is that contributors working on different languages add new emojis to different language files and in a different order. For example, https://github.com/LucasCtrl/spookyBot/blob/main/lang/en.js is missing the "Muerte" emoji from https://github.com/LucasCtrl/spookyBot/blob/main/lang/es.js

--> The English template file does not contain every Halloween-themed emoji used across the language files, which makes life harder for people who just want to add translations and would rather not keep track of, or search for, new emojis. The complexity grows as more languages and emojis are added. While there are only a few items, the person accepting a pull request can probably merge the changes back into the English template file, but in the meantime other people might have added a new language based on an outdated template.

Describe the solution you'd like

So maybe it is a good idea to decouple the emojis from the language files and use those files only for translations and maybe synonyms.

For example, somebody could first collect a lot of Halloween-themed emojis, add them to a template, and give each one a unique word/term that can be referenced from the language files. Alternatively, the term could be used directly as the filename for synonyms/translations inside one folder per language, e.g. "lang/en" for English synonyms, "lang/fr" for French translations, "lang/de" for German, and so on.

A file lang/emojis.js could then store the emojis plus a unique word that describes each emoji and the currently possible languages. A tool could generate empty template files (e.g. in JSON format) for each possible language, or there could simply be a folder with empty templates that can be copied directly. For the "zombie" emoji, for example, there would be a file called "zombie.json" in each of the language folders, so the structure would look like this:

lang/
lang/_languagefiles_templates/candy.json
lang/_languagefiles_templates/zombie.json
lang/de/candy.json
lang/de/zombie.json
lang/en/candy.json
lang/en/zombie.json
lang/fr/candy.json
lang/fr/zombie.json
lang/nl/candy.json
lang/nl/zombie.json
[...]

People would then only need to fill in translations or, for English, synonyms. If a template file is not filled with data, the application does not use it for that language. (My assumption here is that there are fewer Halloween emojis to use than languages in which the Discord bot will be adopted.) If this is programmed dynamically based on the folder structure, adding a new language could be as easy as adding a new folder with those files.
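A minimal sketch of how such folder-based loading could work. It assumes each <term>.json file holds a plain JSON array of trigger words (the file format is not specified above) and that folders starting with an underscore hold templates. The function name loadLanguages is hypothetical and not part of the bot.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical loader: every sub-folder of langRoot is treated as one
// language, except folders starting with "_" (e.g. the template folder).
function loadLanguages(langRoot) {
  const languages = {};
  for (const entry of fs.readdirSync(langRoot, { withFileTypes: true })) {
    if (!entry.isDirectory() || entry.name.startsWith('_')) continue;
    const terms = {};
    const langDir = path.join(langRoot, entry.name);
    for (const file of fs.readdirSync(langDir)) {
      if (path.extname(file) !== '.json') continue;
      // Assumed file format: a JSON array of words, e.g. ["zombie", "undead"].
      const raw = fs.readFileSync(path.join(langDir, file), 'utf8').trim();
      const words = raw === '' ? [] : JSON.parse(raw);
      // An unfilled template contributes nothing for this language.
      if (Array.isArray(words) && words.length > 0) {
        terms[path.basename(file, '.json')] = words;
      }
    }
    languages[entry.name] = terms;
  }
  return languages;
}
```

Calling loadLanguages('lang') would return an object keyed by folder name, so a new folder such as lang/nl would be picked up without any code change.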

Describe alternatives you've considered

  1. Instead of the solution above with separate folders and individual files, it might be easier to use one translation file per language (a sort of mini dictionary mapping each unique term to its synonyms and translations).

  2. An alternative that avoids a lot of manual merging work is to create an issue in which somebody pre-fills the English template with a lot of potential Halloween emojis, so that people start with (nearly) all the emojis right away and do not have to worry about merging overhead in the future.
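Both alternatives could be supported by a small helper, roughly sketched here. The emoji list, the dictionaries, and the function names createTemplate and missingTerms are all hypothetical, and the emoji characters are placeholders chosen for illustration.

```javascript
// Hypothetical central emoji list: unique term -> emoji.
const emojis = { zombie: '🧟', candy: '🍬' };

// Hypothetical per-language mini dictionaries: unique term -> trigger words.
const dictionaries = {
  en: { zombie: ['zombie', 'undead'], candy: ['candy', 'sweets'] },
  de: { zombie: ['Zombie', 'Untoter'] }, // "candy" is not translated yet
};

// Alternative 2: a pre-filled, empty template that lists every known term.
function createTemplate(emojiList) {
  const template = {};
  for (const term of Object.keys(emojiList)) template[term] = [];
  return template;
}

// Reports which terms still have no words in a given dictionary.
function missingTerms(emojiList, dictionary) {
  return Object.keys(emojiList).filter(
    (term) => !Array.isArray(dictionary[term]) || dictionary[term].length === 0
  );
}
```

With this, missingTerms(emojis, dictionaries.de) would list 'candy', which gives translators a to-do list without having to compare files by hand.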

Teachability, Documentation, Adoption, Migration Strategy

If you can, explain how users will be able to use this and possibly write out a version of the docs. Maybe a screenshot or design?

Mr1BitL8r commented 3 years ago

Thinking further about this issue: why don't we base the names in the emoji template file directly on the naming scheme used at https://emojipedia.org/halloween/

That way, the English language file could define the actual text that triggers each emoji, while we get a more or less universal mapping of emojis that is searchable by text via the Emojipedia website. --> Easier to extend, and people who want to translate but do not understand the meaning of an existing emoji could read more about it on that website as well. :)

tmttn commented 3 years ago

Another idea: get rid of the lang files in favor of one big reactions.js file...

Example:

module.exports = {
  reactions: [
    {
      triggers: [
        {
          lang: 'en',
          terms: ['ghost', 'phantasm', 'spirit'],
        },
        {
          lang: 'nl',
          terms: ['spook', 'geest'],
        },
        // ...
      ],
      emoji: '761602615326146590',
    },
    // ...
  ],
};

This way we have 1 'dictionary' for reactions.
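A rough sketch of how the bot might query such a dictionary. The function name findReactions and the word-splitting rule are assumptions for illustration; the data is the example entry from above.

```javascript
const reactions = [
  {
    triggers: [
      { lang: 'en', terms: ['ghost', 'phantasm', 'spirit'] },
      { lang: 'nl', terms: ['spook', 'geest'] },
    ],
    emoji: '761602615326146590',
  },
];

// Returns the emoji of every reaction that has a term for `lang`
// appearing as a whole word in `text`. Multi-word terms are not handled.
function findReactions(text, lang) {
  // Split on anything that is not a letter or digit, so "ghost!" matches "ghost".
  const words = new Set(
    text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean)
  );
  const matches = [];
  for (const reaction of reactions) {
    const trigger = reaction.triggers.find((t) => t.lang === lang);
    if (trigger && trigger.terms.some((term) => words.has(term.toLowerCase()))) {
      matches.push(reaction.emoji);
    }
  }
  return matches;
}
```

For example, findReactions('A ghost appeared!', 'en') would find the entry above, while the same text with lang 'nl' would find nothing.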

Mr1BitL8r commented 3 years ago

True. That would decrease the complexity, and people who do not speak English could use the other translations as a reference when adding their language.

Depending on how many languages and words/phrases get added, the list will become very long, but at least it is all in one place and people do not need to understand the system I proposed (which might be over-engineered and not user-friendly enough, since the files need to be created in a certain manner and people cannot directly see the connection to an emoji).

Mr1BitL8r commented 3 years ago

Thinking further about the limited number of existing emojis (probably all covered now for English/German) and the maximum number of languages, maybe a combined approach building on the reactions.js suggestion is an idea.

So we might use the language abbreviation as a key to decrease the complexity of the data, and list the existing languages in an array.

module.exports = {
  lang: ['ar', 'ca', 'de', 'en', 'es', 'fr' /* , ... */],
  words: [
    {
      // Please do not delete this one 🙏
      ar: ['مرعب'],
      de: ['gruselig'],
      en: ['spooky'],
      fr: ['spooky'],
      uniqueName: '_custom_spooky',
      emoji: '761602615326146590',
    },
    {
      de: ['alien', 'Außerirdischer'],
      en: ['alien'],
      fr: ['alien', 'extraterrestre'],
      uniqueName: 'Alien',
      emoji: '👽',
    },
    // ...
  ],
};

And so on.

This approach has several advantages:

  1. People can easily use different languages as input for their translation.
  2. The emojis are unique and complete.
  3. It might also be easier to load more than one language.

Basically, we could use either the emoji (or uniqueName) field as the key of a hash map, with the value being an array of words from all the selected active languages for that emoji (related to #12). If a language was not specified for an entry, there is simply no text to look up for it.
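The hash map idea could look roughly like this. It is only a sketch: buildIndex is a hypothetical name, the data is a shortened copy of the example above, and lowercasing and de-duplicating the words is my own assumption.

```javascript
const data = {
  lang: ['de', 'en', 'fr'],
  words: [
    {
      de: ['gruselig'],
      en: ['spooky'],
      fr: ['spooky'],
      uniqueName: '_custom_spooky',
      emoji: '761602615326146590',
    },
    {
      de: ['alien', 'Außerirdischer'],
      en: ['alien'],
      fr: ['alien', 'extraterrestre'],
      uniqueName: 'Alien',
      emoji: '👽',
    },
  ],
};

// Builds a Map keyed by emoji whose value is the list of trigger words
// across all active languages. Entries without words are left out.
function buildIndex(source, activeLanguages) {
  const index = new Map();
  for (const entry of source.words) {
    const words = new Set();
    for (const lang of activeLanguages) {
      // A language missing from an entry simply contributes no words.
      for (const word of entry[lang] || []) words.add(word.toLowerCase());
    }
    if (words.size > 0) index.set(entry.emoji, [...words]);
  }
  return index;
}
```

Keying by uniqueName instead would only require swapping entry.emoji for entry.uniqueName in the index.set call.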

Mr1BitL8r commented 3 years ago

So that we are not bound to a certain solution, as a preparation I created a Libre/Open Office ODS spreadsheet file in PR #58 with all current emojis and translations. Based on this data we could write a macro (VBA, Python or whatever), get a nice overview of all emojis and languages, and transform the data into our target format.

Mr1BitL8r commented 3 years ago

With PR #59, the Libre/Open Office ODS spreadsheet file with all current emojis and translations was massively improved, and a Basic macro was added that creates the *.js files in a unified way.