robisim74 / qwik-speak

Translate your Qwik apps into any language
https://robisim74.gitbook.io/qwik-speak/
MIT License
131 stars 15 forks source link

Feat: autoassign-keys #110

Closed genox closed 6 months ago

genox commented 8 months ago

Hi there,

When I add qwik-speak to my project (when I add the vite plugin), I get the following error during build:

✓ 2640 modules transformed.

Qwik Speak Inline warn
There are missing values or dynamic keys: see ./qwik-speak-inline.log
Cannot read properties of null (reading 'module')
error during build:
TypeError: Cannot read properties of null (reading 'module')
    at setAlternativeExporterIfCyclic (file:////node_modules/rollup/dist/es/shared/node-entry.js:13528:18)
    at Module.getVariableForExportName (file:////node_modules/rollup/dist/es/shared/node-entry.js:12944:17)
    at getVariableForExportNameRecursive (file:////node_modules/rollup/dist/es/shared/node-entry.js:12594:19)
    at Module.traceVariable (file:///node_modules/rollup/dist/es/shared/node-entry.js:13194:35)
    at ModuleScope.findVariable (file:////node_modules/rollup/dist/es/shared/node-entry.js:11613:39)
    at ReturnValueScope.findVariable (file:////node_modules/rollup/dist/es/shared/node-entry.js:5913:38)
    at FunctionBodyScope.findVariable (file:////node_modules/rollup/dist/es/shared/node-entry.js:5913:38)
    at Identifier.bind (file:////node_modules/rollup/dist/es/shared/node-entry.js:7195:40)
    at CallExpression.bind (file:////kilometro/node_modules/rollup/dist/es/shared/node-entry.js:4641:23)
    at CallExpression.bind (file:///kilometro/node_modules/rollup/dist/es/shared/node-entry.js:8832:15)

The error disappears when I comment out the qwikSpeakInline vite plugin.

I tried to rollback all the libs to the one used in this repo's root package.json and from node 20 back to node 18 just to rule this out but that gives the same result. Hard to figure out what's going on, the project should be well set-up and prod build runs flawless without this plugin.

The qwik-speak-inline.log is never written to disk, it looks like node bails before the stream can be written.

When I check out this repo and run a build, it runs through flawlessly as well, so it is probably my setup and code.

Do you have an idea what might cause this?

Edit: Just to be precise, I configured the project according to the guides provided in this repo and running qwik in dev mode works fine, strings are displayed and language switching works.

genox commented 8 months ago

Just a shot in the dark, but could it be that keys with dashes cause issues? like app.some-key-with-dashes@@string

robisim74 commented 8 months ago

@genox Yes, keys with dashes are not fully supported with dynamic imports, you should use valid variable names, or switch to raw import: see https://github.com/robisim74/qwik-speak/blob/main/docs/translation-functions.md#dynamic-import

But this should not be the cause of the error you are getting.

From what you report, yes it seems the log fails before the stream can be written: https://github.com/robisim74/qwik-speak/blob/main/packages/qwik-speak/tools/inline/plugin.ts#L229

To isolate the problem, you could run the build command for the server bundles only: npm run build.preview

If the command generates server bundles correctly, it could just be a problem writing the log to the root of the folder: some permission problem? what OS do you use?

genox commented 8 months ago

@robisim74 I tried running the build steps on their own, the server build runs fine, the error occurs in build.client.

I'm on macOS 14.2.1 with node 20.10.0, yarn 4.0.2 and workspaces. File permissions seem fine, can't find any odd permission.

I noticed that yarn installs a newer rollup version than the one in this repo's lock file so I locked down that version in resolutions to try if that might be the issue but this didn't work either. I did the same to all other dependencies as well but no luck.

Anything I can add to your plugin's .js that might be helpful in debugging? I had a look at the source and the minified js and was trying to find out where it bugs out but I realised there are no direct rollup calls, correct? so it probably is a return of some hook that is not what vite, or rollup, expects. 🤔

robisim74 commented 8 months ago

Anything I can add to your plugin's .js that might be helpful in debugging?

yes, try comment the whole method async closeBundle() and let me know

robisim74 commented 8 months ago

Please, also try to build only client chunks: npm run build.client and check in dist/build folder if chunks are generated (one for each language)

genox commented 8 months ago

Oh, I think I found the issue.

While I was migrating back from compiled-i18n (long story.. 😭 ) I was making a mistake while burning the midnight oil. I used t() outside of qwik context in a few component files for form validation etc. with valibot... I was rushing massively for a demo today.. The component below the t() tried to reference the out-of-scope t() .. that apparently causes the module handler to bail.

Ah, man .. 😵‍💫

On a sidenote, one reason I was switching to compiled-i18n was that I ended up with a huge chaos in terms of keys or sharing keys and a substantial mental overhead in trying to remember which key I can reuse and wether or not that should be in a common namespace etc. At about 200 keys things get very chaotic. I was used to the i18next approach where it was common to use the text as key, which has multiple advantages from my point of view. No need to remember things, easily parseable into typescript for autocompletion, etc.

I switched back because I had build issues with compiler-i18n that broke qwik and I needed a working solution, no matter what. I wrote a script that automatically assigns UUIDs to strings and handles duplicates as well, which is actually quite convenient to work with. I just add a string with '@@' prefix and the rest is handled by the build process. No need to remember namespaces etc.

Not sure, I guess everyone has their own strategy but for me, this seems like an interesting approach. Duplicates are automatically picked up since I parse all existing keys before changing the source. I can share the script, let me know if this even could be an extractor feature.

Thank you for your help.

robisim74 commented 8 months ago

Glad to hear you solved the issue!

Always open to new interesting features if they can improve the developer's work. It would be great if you would share your script, and illustrate with a small example what it does. And it could of course become an option for the extractor

genox commented 8 months ago

I ended up doing it like this on my current project and it immediately remedied my pet peeve, which is assigning keys, tracking namespaces and duplicate strings.

1) I use qwik-speak normally, but instead of adding keys in a new component, I add t('Some string').

2) stage all git changes

3) run my script that performs these actions:

4) run qwik-speak extraction

Some obvious issues with this approach:

I will share the code later.

robisim74 commented 8 months ago

Maybe it will be possible to use the library's existing parser, and resolve the points 1, 3 and 4

genox commented 8 months ago

Yes, this is true. I was considering that but for the sake of making a poc quickly, I ended up starting from scratch (probably spending more time than reusing your parser in the end, but it's the journey, not the destination, lol). I will checkout qwik-speak this weekend if I have the opportunity and see if I can integrate the core essence of my idea into a cli command next to the extractor that reuses the existing parser. something like qwik-speak-autoassign-keys. I would also like to match existing strings to existing uuids in the translation files to avoid invalidating entire translation sets.

I think that it should be possible to mix manual keys with autogenerated ones, which would probably solve the dynamic keys issue as well.

robisim74 commented 8 months ago

Maybe you can avoid creating another command, and add an option to the extraction tool like: autoassignKeys: if the option is true, you could extract the keys in the pipeline of extraction tool after readSourceFiles: https://github.com/robisim74/qwik-speak/blob/main/packages/qwik-speak/tools/extract/index.ts#L275

In parseSourceFile you can see how the parser is used: https://github.com/robisim74/qwik-speak/blob/main/packages/qwik-speak/tools/extract/index.ts#L84

And here the parser (that I built specifically for this library, but it's the journey, not the destination, lol) https://github.com/robisim74/qwik-speak/blob/main/packages/qwik-speak/tools/core/parser.ts

In this way it would be enough to launch a single command and obtain both the self-assignment of the keys and the extraction

robisim74 commented 8 months ago

Even better: if the method for generating the key was based on the string value, the same option could be added to the Inline tool, with the following advantages:

genox commented 8 months ago

That's a great idea. I was initially not trying to stray away too much from the given path with how qwik-speak handles things now, to lower the risk of breaking something, but considering extending the inline translate hook as well opens up this option.

The reason why I stuck with (the first fragment of) a UUIDv4 was the size of the key. I was considering an MD5 hash via crypto but that adds 32bytes per key. Not sure how this turns out performance wise if people have large sets of translations and they download them asynchronously. Might also be a case of premature optimisation and not worth the hassle.. if the json is gzipped (which it hopefully is), this should amount to nothing.

robisim74 commented 8 months ago

The size is not important if we do not add it to the source code: the json files are loaded only the first time server side and then memoized, while client side the translations are inlined and only the runtimeAssets are loaded (with manual keys).

What we need is a simple and fast method for generating keys, with not mutch code and without third-party dependencies, because then we should also add it to the runtime translation used server side and for dynamic keys: https://github.com/robisim74/qwik-speak/blob/main/packages/qwik-speak/src/core.ts#L109

genox commented 8 months ago

MD5 is readily available in the crypto module on all recent node versions (and in the browser afaik) and decently fast. I'm not an edge function enthusiast so I have no idea about the availability in runtimes other than node. I also have no way of testing that other than for this feature. But such a basic module certainly should have been adopted.

The only other option is to make an own hash function.

Are you in the loop on Vercel edge etc.?

robisim74 commented 8 months ago

I vote for our function. To begin with I asked chat gpt: link

Ok, we should make it smaller, or put it in a separate module so that it is tree shaked from the runtime library when the new feature is not used (for tools it is not a problem however)

genox commented 8 months ago

That looks like jshashes https://github.com/h2non/jshashes so instead of reinventing the wheel i would simply include that as a dependency.

On the other hand, every major edge function provider has a crypto module implementation (the web crypto api as defined by w3c) or at least a sensible subset of it. I did some minor research and it looks fine. I would just go for it. We can still consider a fallback if it comes up and it is probably faster than a plain js implementation in any case.

I'd vote for native crypto and if this is shaky, fallback to jshashes.

edit. or maybe not: https://stackoverflow.com/questions/76503154/hashing-passwords-with-vercel-edge-adapter-in-qwik-city

😕

robisim74 commented 8 months ago

The point is this: in the tools (extract and inline) we can use whatever we want, because they are executed at compile time.

In the qwi-speak runtime library we can perform key generation only server side because in the client side chunks the translations are inlined (using isServer the generation code would be removed from the client side chunks).

Native crypto maybe will work, and should be generally supported: https://developer.mozilla.org/en-US/docs/Web/API/Crypto?retiredLocale=it#browser_compatibility also by Vercel Edge: https://vercel.com/templates/next.js/edge-functions-crypto

We can try: in the meantime we could try a custom function for generating uuids to see if the process works, and then try to replace it with crypto

robisim74 commented 8 months ago

My bad, why didn't I think of this before? We can completely avoid key generation in qwik-speak and just use the tools:

This way we can use what we want to generate the keys (native crypto is ok for me) and we don't have problems with size and library compatibility

genox commented 8 months ago

I will give this a shot today. Currently wrapping my head around the library, setting things up and trying to get a mental map of what's going on. I will also have a look at outdated dependencies while I'm at it.

Did you consider using prettier or .editorconfig? Looks like my default IDE formatter is already starting to mess things up.

robisim74 commented 8 months ago

Don't waste time with dependencies (I usually update them as needed before each release).

Uhm... I use VS Code... and I don't like prettier... However, the only rules to be respected are indent space 2 and the typescript and eslint rules

Style is not important, the most important rule is clean code

genox commented 7 months ago

just a quick update. I managed to extend the extraction and the inline translation to work fine (with the test page, anyways). It is backwards compatible and it tries to be smart about which keys are meant to be generated and which not and I have a few ideas to improve this further. I tried different approaches and refactored it 10 times but I guess I have a decent understanding of how qwik-speak works now.

In order to handle plurals, I extended the plural function with defaultValues (something that makes sense in conjunction with auto keys, in my opinion). this works basically, up until extraction and the first render. but once a value changes, I end up with an undefined key because the hash function isn't run again and it is opaque as to what actually happens if a value increased from 0 to 1. getValue seems to not run again, yet the previous key is lost. Existing plural keys (the ones from the demo/test page plural test) are not affected. I don't understand what's happening.. or not happening. How is are the strings referenced, once getValue ran?

I pushed my WIP here: https://github.com/genox/qwik-speak/tree/auto-keys

I also tried using native crypto but somehow this makes it into the client bundle and after trying multiple ways of handling this in an isomorphic way I ended up using a standard SHA1 hash gen function .. instead of MD5. It is a fraction of the size, works just as well for this scope and it's good enough.

I will be travelling starting next week for 4 weeks, so I hope I can figure this out beforehand.

edit: oh, I forgot to mention something that confuses me: Somehow the vite import.meta.glob in speak-function.ts behaves not as I would expect. Some keys were missing on the top-level of the imported object and but I found all of them tucked away under the default property, including the top-level keys. Not sure how vite handles all of this or why it decides to do what it does.. but I had to resort to set { import: 'default'}. With this set, things work fine. Without it, half the keys are always missing and I don't know why.

robisim74 commented 7 months ago

Hey, did you read this comment? https://github.com/robisim74/qwik-speak/issues/110#issuecomment-1901974329

We should avoid generating the keys in the runtime library, to avoid increasing the size of the client-side bundles. Instead we should add the auto-generated key automatically in the server-side bundle. This can be done simply from the inline tool, here: https://github.com/genox/qwik-speak/blob/auto-keys/packages/qwik-speak/tools/inline/plugin.ts#L174

Just add:

if (options.autoKeys && target === 'ssr') {
  /* update ssr code with generated key, like `transformTranslate` &&  `transformPlural` */
}

Regarding the inlinePlural, I have to investigate what happens.

About import.meta.glob, keys must be valid variable names so you should change autoKey-${sha1(key)} to autoKey_${sha1(key)} and maybe you could avoid { import: 'default'}

genox commented 7 months ago

We should avoid generating the keys in the runtime library, to avoid increasing the size of the client-side bundles. Instead we should add the auto-generated key automatically in the server-side bundle. This can be done simply from the inline tool, here: https://github.com/genox/qwik-speak/blob/auto-keys/packages/qwik-speak/tools/inline/plugin.ts#L174

Just add:

if (options.autoKeys && target === 'ssr') {
  /* update ssr code with generated key, like `transformTranslate` &&  `transformPlural` */
}

No, sorry, I missed this information but it should be easy to move it there. This might solve the inlinePlural undefined key as well as the keys will theoretically be defined already at this stage.

I will change the variable name and see what happens. I'm not aware that those variable names would be invalid and I can't find any reference that would say otherwise. If it is indeed invalid, this limitation is imposed by Vite's importer. I had a case where one autoKey-... was indeed accessible from the top-level of the object and others weren't.

If this doesn't work, I have to dig deeper.

genox commented 7 months ago

Ok, stupid mistake. Valid JSON does not necessarily translate into valid JS.. sigh.

I had a look at how to integrate auto keys into transformTranslate(). While this removes the auto key handling from inlineTranslate(), it removes the option of having both regular keys and auto generated keys in the same project, unless I miss something. There is also no way of dynamically checking for the existence of a translation or being smart about it.

I found the idea of being backwards compatible and just handling keys automatically where necessary oddly appealing but I can also see the point of trying to do too much and increasing complexity for a limited number of edge cases. I guess that option would be nice, but not necessarily good.

So basically, one has to make a decision at the start of a project whether or not automatic key handling should be enabled or disabled and there is no option to mix and match. It's either this or that.

I will move everything into the plugin and revert the rest.

robisim74 commented 7 months ago

You can use the same logic you used in inlineTranslate into the plugin. You have there the same elements: translation data from json, and every key (extracted from the parser). Therefore you should get the same behavior as you have now

genox commented 7 months ago

Yes that is correct. Thank you for your guidance.

robisim74 commented 7 months ago

Thanks to you, you are doing a great job! if you don't have time to finish it before your trip, don't worry: do the PR and I'll complete it (with any fixes, examples, tests and docs)

genox commented 7 months ago

Ok :)

I have an issue at the moment with the availability of the strings after the vite plugins transform has been called. In the first render, the strings are rendered using the defaultValue from the original markup in the components. There is no replacement happening (at least not in the chunk that is sent to the browser).

Once a re-render is triggered, suddenly the strings show up. Not entirely clear to me what is going on there.

On a page load I can log the replacements server side, auto keys etc, as I would expect but it looks like those code replacements don't make it to the browser. Could this be a qwik streaming thing?

I also noticed that the translation var which is supposed to contain translation sets is empty during build, so any checks for existing keys won't work in the transform method during build. I am not sure I understand the order of which things run fully yet.

I will create a PR tomorrow and I would be glad if you could finalise it.

robisim74 commented 7 months ago

To load translation data also for ssr, you have to add target === 'ssr' here: https://github.com/robisim74/qwik-speak/blob/main/packages/qwik-speak/tools/inline/plugin.ts#L159 or better yet remove the conditions completely (the assets will now be needed on both the client and the server).

Keep in mind that Qwik makes two builds: one with target='ssr', with which it builds the server-side bundle and in which you have to add the autogenerated keys, and another build with target='client' with which it builds the chunks sent to the browser (and in which the the current transform method replaces inlineTranslate with translations) that you don't need to change.

No problem: PR where you're at (even if something's wrong) and I'll finish.

genox commented 7 months ago

Unfortunately I can't get it to work. I created a PR with my current WIP.

I can get the JS to include the correct strings but the transpiled index.html doesn't, which explains why there is the wrong data on first page load.

Honestly I think I don't have enough insight into the inner workings, it's quite complex. And I couldn't work on the plural FN default values either but there is no point spending more time on this for now as I devolved into hours and hours of trial and error which is very inefficient and it probably takes you one hour to get this thing finally flying.. ;-)

The extractor seems to work fine so far though.

robisim74 commented 7 months ago

No problem. Thank you for sharing your idea, and for the work done. I will complete it in the next few days.

genox commented 7 months ago

Thank you!

robisim74 commented 7 months ago

Hi Oliver,

I completed your PR, with a couple of changes:

      <p class="counter">{p(
        zebras.value,
        '{"one": "{{ value }} {{ color }} zebra","other": "{{ value }} {{ color }} zebras"}',
        {
          color: t('black and white')
        }
      )}</p>

I'm sorry you wasted a lot of time, I understand that it wasn't easy to modify the plugin and maybe I wasn't good at explaining it to you: however now you can see what I meant here: https://github.com/robisim74/qwik-speak/blob/main/packages/qwik-speak/tools/inline/plugin.ts#L174

Two new functions are called (both during the server-side bundle build and during the client-side build) which (like the transform ones) simply add the auto-generated keys. After all, everything happens as before.

Thanks again for your contribution!

genox commented 7 months ago

Thank you for fixing my confused code. ;)

The plural feature is much cleaner like that and there is no need to null or undefined the key, which was a bit of an eye sore to me as well. This is much cleaner. Thank you.

No worries. I went into this a little bit overconfident and quickly realised that there is more going on than I can try to understand in the timeframe available to me, without deeply understanding everything. It is not just the extractor and replacing some keys, but also vite plugin architecture plus qwik architecture and I am really just a frontend person building on other people's solutions. I ended up being not very confident in what I did and my efforts devolved into trial and error, so that't the point in time to let someone take over who actually understands what is going on.

I don't think any time is wasted trying to understand something one doesn't understand yet and this is the first time I dug into build tools other than hacking a few scripts for myself, so this was certainly worth the exploration.

I will gladly test a beta when you prefer to release one. Let me know.

robisim74 commented 7 months ago

Released in v0.20.0: https://github.com/robisim74/qwik-speak/releases/tag/v0.20.0