opral / monorepo

lix (change control system) && inlang (globalization ecosystem for software built on lix)
https://opral.com
Apache License 2.0
1.25k stars 111 forks source link

[bug] plugin-inlang-json and plugin-inlang-i18next compile namespaces to invalid js #1577

Closed ZerdoX-x closed 7 months ago

ZerdoX-x commented 1 year ago

Problem

Quote from: https://github.com/inlang/monorepo/discussions/1464#discussioncomment-7448011

Yes. You can use any storage format you'd like. For example, the JSON storage plugin https://inlang.com/m/ig84ng0o/plugin-inlang-json, which also reduces the clutter of the inlang message format plugin/is it easier to write translations manually.

The issue is that plugin compiles messages to invalid javascript:

export const global:hello = () => {
    const variants = {
      "en": `world`,
  "ru": `мир`,
  "zh": `世界`
    }
    return variants[languageTag()] ?? "global:hello"
}

Expected behavior

At least valid code generation :)

Reproduction

project.inlang.json

{
    "$schema": "https://inlang.com/schema/project-settings",
    "sourceLanguageTag": "en",
    "languageTags": ["en", "ru", "zh"],
    "modules": [
        "https://cdn.jsdelivr.net/npm/@inlang/plugin-json@4/dist/index.js"
    ],
    "plugin.inlang.json": {
        "pathPattern": {
            "global": "./static/t10s/global/{languageTag}.json"
        },
        "variableReferencePattern": ["{", "}"]
    }
}

static/t10s/global/en.json

{
    "hello": "world"
}
  1. Scaffold project
  2. npx @inlang/cli machine translate
  3. Compile t10s messages, open compiled messages.js

Other information

ZerdoX-x commented 1 year ago

Also seems like lint rules don't play good with these plugins:

> inlang lint

                                                                                                       10:51:51 PM
🚨 Lint Report                                                                                         10:51:51 PM
┌────────────┬───────────────────────────────────┬──────────────────────────────────────────────────┐  10:51:51 PM
│ Level      │ Lint Rule                         │ Message                                          │
├────────────┼───────────────────────────────────┼──────────────────────────────────────────────────┤
│ Warning    │ messageLintRule.inlang.identical… │ Identical content found in language 'ru' with    │
│            │                                   │ message ID 'global_available_language_en_tag'.   │
├────────────┼───────────────────────────────────┼──────────────────────────────────────────────────┤
│ Warning    │ messageLintRule.inlang.identical… │ Identical content found in language 'zh' with    │
│            │                                   │ message ID 'global_available_language_en_tag'.   │
├────────────┼───────────────────────────────────┼──────────────────────────────────────────────────┤

Using the same setup as above. Do you want me to open separate issue for this?

samuelstroschein commented 1 year ago

Yes. You can use any storage format you'd like. For example, the JSON storage plugin https://inlang.com/m/ig84ng0o/plugin-inlang-json, which also reduces the clutter of the inlang message format plugin/is it easier to write translations manually.

Ha. Good observation. Namespaces can't be used for paraglide as there is no need for them and namespace ids are invalid JS. I recommend installing the snake_case_id to lint invalid ids https://inlang.com/m/gkerinvo/messageLintRule-inlang-snakeCaseId

Also seems like lint rules don't play good with these plugins:

That seems to be a bug. I opened https://github.com/inlang/monorepo/issues/1578

samuelstroschein commented 1 year ago

Regarding this issue, you can't use IDs with invalid JS. That is a limitation that is outside of our control. I can improve the error message however. A suggestion on how we can explain the must be valid JS id limitation?

ZerdoX-x commented 1 year ago

Namespaces can't be used for paraglide as there is no need for them and namespace ids are invalid JS

What? :D

I didn't really get what you mean, but.. The original problem here is that me and or any other user would like to split t10s message file somehow, for example just so it could be easier to navigate through t10s and edit them manually.

Here is my thinking process:

  1. inlang-message-format is an official storage plugin which is probably ok to call "default one". By looking at the configuration properties of the plugin you can see that it only supports storing all of your t10s messages in a single file. This is a critical limitation for me, because for now I edit all of my translations manually, using code editor.
  2. In the feedback thread plugin-inlang-json was recommended to me so I can break up messages.json file. So solution to the problem is to just use another storage plugin. Fine, I take a look at configuration settings and see: "you can split messages by language", "you can split messages using namespaces". Nice!
  3. I configure storage plugin but it does not work. Now you say that "Namespaces can't be used for paraglide", then.. why does this setting exists? :thinking: You can somehow ditch paraglide and use other sdk to manage translations? That makes no sense for me.

as there is no need for them

I understand that there is a huge take about web editor and vscode extension, but I though you agreed that t10s files should still be editable by hand with no huge tradeoffs or problems.

and namespace ids are invalid JS

That's just about how paraglide code generates (compiles) message files, isn't it? When I set up the project and tried to compile with namespaces, I was expecting to see something like:

export const global = {
    hello() {
        const variants = {
            "en": `world`,
            "ru": `мир`,
            "zh": `世界`
        }
        return variants[languageTag()] ?? "global.hello"
}

I definitely had to put this into "Expected behavior" :thinking:


Regarding this issue, you can't use IDs with invalid JS

Of course, this is the reason why I opened this issue :)

That is a limitation that is outside of our control

Not sure how is this out of control. There is a setting which allows to use namespaces, however they just don't work. I guess this was caused by breaking changes in paraglide and it's behavior or concept. How this feature worked before?


I just started working on t10s for my project, and here is my WIP en.json which I use with json storage plugin:

en.json ``` { "global__available_language__en_tag": "En", "global__available_language__ru_tag": "Ru", "global__available_language__zh_tag": "中", "global__company__name": "Tonna", "header__sign_in": "Sign in", "header__sign_up": "Sign up", "core_ui__icon__tonna_logo__label": "Logotype of Tonna company", "core_ui__component__select__general_label_default": "Select an option", "core_ui__component__select__trigger__label_default": "Option", "core_ui__component__select__trigger__label_unselected_label": "Select an option", "core_ui__component__carousel__role_description": "carousel", "core_ui__component__carousel__slide__role_description": "slide", "core_ui__component__carousel__slide__label": "%s of %s", "core_ui__component__carousel__action__next_slide": "Next slide", "core_ui__component__carousel__action__prev_slide": "Previous slide", "core_ui__component__carousel__action__first_slide": "Go to first slide", "core_ui__component__carousel__action__last_slide": "Go to last slide", "core_ui__component__carousel__action__start_autoplay": "Start autoplay", "core_ui__component__carousel__action__stop_autoplay": "Stop autoplay", "core_ui__component__carousel__pagination__go_to_slide": "Go to slide %s", "core_ui__component__carousel__pagination__go_to_page": "Go to page %s", "core_ui__component__carousel__pagination__select_slide": "Select a slide to show", "page__company_profile__component__edit__label": "Edit", "page__company_profile__section_overview__profile_picture__label": "Profile picture of {company_name}", "page__company_profile__section_overview__profile_picture__upload": "Upload new photo", "page__company_profile__section_skills__rating_label": "Tonna rating", "page__company_profile__section_skills__industry_list_heading": "Industry", "page__company_profile__section_skills__competencies_list_heading": "Competencies", "page__company_profile__section_skills__capacities_list_heading": "Competencies", "page__company_profile__section_skills__video__upload": "Upload", "page__company_profile__section_skills__video__button_edit_label": "Upload a new video", "page__company_profile__section_products__heading": "Products / services", "page__company_profile__section_products__contact_block__heading": "Contact the company", "page__company_profile__section_products__contact_block__profile_picture__label": "Profile picture of {company_name}", "page__company_profile__section_products__contact_block__button_contact_chat": "Message to chat", "page__company_profile__section_products__contact_block__button_contact_request": "Send a request", "page__company_profile__section_documents__heading": "Certificates", "page__company_profile__section_documents__button_upload_document": "Upload document", "page__company_profile__section_documents__document__preview_label": "Preview of {document_name}", "page__company_profile__section_documents__document__download": "Download document", "page__company_profile__section_documents__document__delete": "Delete document", "page__company_profile__section_contacts__heading": "Contacts", "page__company_profile__section_contacts__button_contact": "Message to chat", "page__company_profile__section_contacts__socials__list_heading": "Socials", "page__company_profile__section_contacts__socials__social__whatsapp": "WhatsApp", "page__company_profile__section_contacts__socials__social__viber": "Viber", "page__company_profile__section_contacts__socials__social__telegram": "Telegram", "page__company_profile__section_contacts__socials__social__weechat": "WeeChat", "page__company_profile__section_contacts__socials__social__vkontakte": "VKontakte", "page__company_profile__section_contacts__socials__social__facebook": "Facebook", "page__company_profile__section_contacts__socials__social__linkedin": "LinkedIn", "page__company_profile__section_contacts__socials__social__reddit": "Reddit" } ```

Here you can see double "_", it may look strange, but this is how I see leveraging namespaces + deep keys. First part before first "__" is namespace and other is just deep keys. For example, in i18next and AFAIR other svelte i18n libraries you can use it like this. Take a look at i18next json format, or react-i18next#358 to get a better understanding of way I would like to story my translations.

Now it may sound like I am requesting another feature "deep keys", but honestly I would just be satisfied to be able to use namespaces, as it would reduce nesting by one level + break up one en.json into multiple files, one for each namespace, which will reduce amount of lines contained in one file.


Also I am not sure, but seems like "storage plugin" is also responsible for "message format". If I understood right, in future this may cause huge tradeoffs for me making this choice now. For example in feature support like "plurals", ICU support and all that i18n logic related stuff which at first glance should not be related to the way you store your messages. I thought that "storage plugin" will be only responsible in storing translations in a way that you like, but not limiting feature support. Shouldn't this be two separate types of plugins? For example I could combo fluent-message-format + yaml-storage and get a "perfect setup"? In my head this is a huge mess now, and I don't understand why would I choose plugin-inlang-json over plugin-inlang-message-format when.. they both json? In my head the only difference is that in first one I don't have huge amount of fields which I don't know what they are responsible for (but maybe I am losing features?) and I can split single messages.json into separate en.json and zh.json files, which is good for me for manual editing and navigating through them.

Also while typing this I think that it would probably affect web editor and ide extension really much and a huge amount of work should be done in order to support deep keys or just namespaces. Now I don't understand anything even more, if I'll implement my own storage plugin, web editor won't be able to access files?

Also, just for contrast, this is how we stored t10s in my previous project We didn't had any professional translators and whole team members were just editing messages manually, with no web UIs. But AFAIR we could even translate this amount of files through i18n-ally vscode extension, if member used vscode. ![photo_2023-11-03_02-20-58](https://github.com/inlang/monorepo/assets/49815452/b6c7ae8d-2ae4-4bef-9273-4324b4bdcc3a) Yes, this is manual implementation of namespaces and deep keys :D I mean deep keys were supported out of the box, we just wanted to store them in separate files.

I described my experience and to sum up I just would love to see namespaces support (as they should work I guess? there is a setting for this). That's all for now and I don't want to take too much attention from the main work, such as first adapters for paraglide, or paraglide itself.

And sorry for THAT amount of text, but I hope you can understand me. I just want to give as much useful feedback as I can. Also sorry if maybe I used some weird expressions/phrases or made mistakes, English is my secondary language yet :)

Thank you for your attention :heart:

ZerdoX-x commented 1 year ago

I just started working on t10s for my project, and here is my WIP en.json which I use with json storage plugin

Maybe I am misusing or misunderstand how devs should or should not organize keys (ids) for their translations? In one team I saw t10s files each file for each language contained 1k+ lines with contents like:

{
  "Roadmap": {
    "firstParagraph": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.",
    "secondParagraph": "Blah-blah-blah",
    .......
    "thirtyThirdParagraph": "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Morbi tristique senectus et netus. Sed cras ornare arcu dui vivamus arcu felis bibendum. Sagittis orci a scelerisque purus. Blandit aliquam etiam erat velit. Sit amet massa vitae tortor condimentum lacinia. Varius morbi enim nunc faucibus a pellentesque sit amet. Viverra orci sagittis eu volutpat odio. Ipsum dolor sit amet consectetur adipiscing elit. Nec sagittis aliquam malesuada bibendum."
  }
}

And no one really cared. What do you think about all of this?

ZerdoX-x commented 1 year ago

Oh... I now got what you meant by

Regarding this issue, you can't use IDs with invalid JS

Could you please take a look at "reproduction" once again? I am not using any invalid js, just plain valid json. But plugin and paraglide-js generate invalid js. The issue is just about valid input, valid configuration of plugins => leads to invalid compiled .js

So there was no sense in writing that huge texts. You just misunderstood me and/or the issue :D

samuelstroschein commented 1 year ago

@ZerdoX-x Thanks again for the elaborate messages. I am in the moving to NYC process. Excuse my longer-than-usual reply times.

It is key to understand that a) inlang is an ecosystem of products, plugins, and tools that work together but b) each product is "its own product". The inlang SDK/platform has no concept of namespaces and neither does paraglide-js (which uses the inlang SDK to compile messages). The plugin-json, which provides the messages for the SDK, however, does have the concept of namespaces, unbeknown to the SDK and Paraglide-JS. The plugin JSON resolves namespaces with . in the message ids. Usually, that is not an issue. Due to paraglide's compiled nature, every message id needs to be a valid JS variable name. The namespace resolution of the plugin json breaks the valid JS pattern.

In short, the way the plugin json (unbeknown to paraglide-js) resolves namespaces, breaks JS and therefore paraglide-js.

We take the currently strong stance that namespacing is a redundant concept that exists for legacy manual optimization reasons. Introducing namespacing in the SDK to be used by paraglide-js would lead to a lot of additional complexity which has no benefit other than developers splitting their translation files manually. Due to tree-shaking this splitting is not required. In large projects, developers don't even touch the translation files.

For example in feature support like "plurals", ICU support and all that i18n logic related stuff which at first glance should not be related to the way you store your messages. I thought that "storage plugin" will be only responsible in storing translations in a way that you like, but not limiting feature support.

All of those feature are related to how they are stored :) Without storing features like variants, pluralization, etc, those features can't be implemented. We have a discussion going to rename "storage plugin" to import and export plugin. I created an issue https://github.com/inlang/monorepo/issues/1585. Your feedback is appreciated whether the proposal makes it easier to understand why the namespace exports are not working.

I definitely had to put this into "Expected behavior" 🤔

The moment you nest (namespace) messages in an object, you break tree-shaking and will end up with a potentially large bundle size. Hence, the following leads to unexpected behavior.

export const global = {
    hello() {
        const variants = {
            "en": `world`,
            "ru": `мир`,
            "zh": `世界`
        }
        return variants[languageTag()] ?? "global.hello"
}

Maybe I am misusing or misunderstand how devs should or should not organize keys (ids) for their translations? In one team I saw t10s files each file for each language contained 1k+ lines with contents like:

Don't organize them ;)

The requirement of organizing messages in different files is usually to lower the bundle sizes. A flat key-value storage schema would lead to massive bundle sizes. The opposite is true for paraglide-js. A flat file that exports JS functions is tree-shakable by a bundler, auto-optimizing the bundle size in a manner that no manual namespacing could.

In short, I strongly recommend a flat snake_case storage structure, and you should be good to go.

ZerdoX-x commented 1 year ago

Hmm.. Okay. This made it clearer, thank you for the explanation.

So.. why this setting even exists in plugin json if it doesn't applicable to inlang and paraglide?

samuelstroschein commented 1 year ago

I updated the inlang message format (v2) to by human editable, hopefully resolving this issue because the need for the json and i18next plugin does not exist anymore. See https://inlang.com/m/reootnfj/plugin-inlang-messageFormat

ZerdoX-x commented 12 months ago

I can see namespaces options have been removed in both of mentioned plugins. I guess, issue can be closed. It's no longer possible to cause invalid code generation using regular settings :)

samuelstroschein commented 12 months ago

@ZerdoX-x namespace should still be supported. We are waiting for #1585 to be completed before we close this issue

herr-ahman commented 9 months ago

From my perspective, this suggestion primarily aims to improve the DX. Utilizing a single .json file that contains hundreds of keys significantly deteriorates DX, especially for those who do not use Visual Studio Code along with the Inlang plugin. I wonder if it would be feasible to introduce namespacing by directories within the pathPattern directory, as shown below:

"plugin.inlang.messageFormat": {
    "pathPattern": "./src/json/{languageTag}.json"
}

the directory then becomes the namespace within the compiled message?

So for example having a "title" key in this file: ./src/json/auth/en.json

turns into .src/json/en.json:

...
"auth_title": "Authentication setup"
...

in the root en.json which then compiles into messages.js? Just a thought that you guys most probably already considered, and I guess there's tons of other reasons why this wouldn't work...

samuelstroschein commented 9 months ago

@herr-ahman what you describe can be implemented by importer/exporter plugins with #1585.

But, we won't recommend it, and likely won't implement in plugins maintained by us. Naming keys is too flaky and will break at several points, no matter how good inlang tries to mitigate the problems. We will introduce #1892 which hopefully puts an end to (excuse me haha) requests like yours that stem from naming stuff. You can read the docs about 1892 here.