lingui / js-lingui

🌍 📖 A readable, automated, and optimized (3 kb) internationalization for JavaScript
https://lingui.dev
MIT License

`lingui extract-experimental` bugs #1797

Open yunsii opened 8 months ago

yunsii commented 8 months ago

Describe the bug

image

To Reproduce

https://github.com/yunsii/lingui-examples-nextjs-swc-1797

Expected behavior


thekip commented 8 months ago

Well, I've found the reason why some files are not processed by the extractor. Unfortunately, it's not easy to fix, because it requires changing literally everything about how this is processed now.

A simple explanation of the current process:

  1. Files are bundled together with esbuild; JSX is preserved
  2. The regular JS extractor is applied to the bundles
    1. The macro is applied to the whole bundle
    2. The extractor extracts from all usages as if they were written without the macro

The problem is that esbuild does a lot of work on module imports while bundling: it renames them, deduplicates them, and so on. The macro is able to understand renamed imports, but the macro is also responsible for turning macro imports into runtime imports, and this is where the problems appear. The macro is not smart enough to analyze which imports are already inserted in the file and in what order (esbuild can generate very complicated setups, because that is what bundlers actually do).

So after macro expansion some imports got deleted, and the extractor did not recognize these nodes as belonging to Lingui. Here is the output after the intermediate steps:

Esbuild bundle

```js
// src/pages/index.page.tsx
import { Plural as Plural2, t as t2, Trans as Trans3 } from "@lingui/macro";
import path from "path";
import Head from "next/head";

// src/components/AboutText.tsx
import { Trans } from "@lingui/macro";
function AboutText() {
  // JSX markup not preserved in this copy; the surviving text is:
  //   Hello, world
  //   Next.js is an open-source React front-end development web framework that enables functionality such as server-side rendering and generating static websites for React based web applications. It is a production-ready framework that allows developers to quickly create static and dynamic JAMstack websites and is used widely by many large companies.
  return /* JSX markup not preserved */;
}

// src/components/Developers.tsx
import { useState } from "react";
import { Trans as Trans2, Plural } from "@lingui/macro";
function Developers() {
  const [selected, setSelected] = useState("1");
  // JSX markup not preserved in this copy; the surviving text is:
  //   Plural Test: How many developers?
  return /* JSX markup not preserved */;
}

// src/components/Switcher.tsx
import { useRouter } from "next/router";
import { useState as useState2 } from "react";
import { msg } from "@lingui/macro";
import { useLingui } from "@lingui/react";
var languages = {
  en: msg`English`,
  sr: msg`Serbian`,
  es: msg`Spanish`
};
function Switcher() {
  const router = useRouter();
  const { i18n: i18n2 } = useLingui();
  const [locale, setLocale] = useState2(
    router.locale.split("-")[0]
  );
  function handleChange(event) {
    const locale2 = event.target.value;
    setLocale(locale2);
    router.push(router.pathname, router.pathname, { locale: locale2 });
  }
  return /* JSX markup not preserved */;
}

// src/pages/index.page.tsx
import styles from "../styles/Index.module.css";

// src/utils.ts
import { i18n } from "@lingui/core";
import { useRouter as useRouter2 } from "next/router";
import { useEffect, useState as useState3 } from "react";
async function loadCatalog(locale, pathname) {
  if (pathname === "_error") {
    return {};
  }
  const catalog = await import(`@lingui/loader!./locales/src/pages/${pathname}.page/${locale}.po`);
  return catalog.messages;
}

// src/pages/index.page.tsx
import { useLingui as useLingui2 } from "@lingui/react";
var getStaticProps = async (ctx) => {
  const fileName = __filename;
  const cwd = process.cwd();
  const { locale } = ctx;
  const pathname = path.relative(cwd, fileName).replace(".next/server/pages/", "").replace(".js", "");
  const translation = await loadCatalog(locale || "en", pathname);
  return {
    props: {
      translation
    }
  };
};
var Index = () => {
  useLingui2();
  // JSX markup not preserved in this copy; the surviving fragments, in order:
  //   { /* The Next Head component is not being rendered in the React component tree and React Context is not being passed down to the components placed in the <Head>. That means we cannot use the <Trans> component here and instead have to use `t` macro. */ }
  //   {t2`Translation Demo`}
  //   {"Welcome to "} Next.js!
  //   Plain text
  //   {t2`Plain text`}
  //   Next.js {" say hi."}
  //   {"Wonderful framework "} Next.js {" say hi."}
  //   {"Wonderful framework "} Next.js {" say hi. And "} Next.js {" say hi."}
  return /* JSX markup not preserved */;
};
var index_page_default = Index;
export {
  index_page_default as default,
  getStaticProps
};
```

Bundle after Macro

```js
import path from "path";
import Head from "next/head";

// src/components/Developers.tsx
import { useState } from "react";
function Developers() {
  const [selected, setSelected] = useState("1");
  return /* JSX markup not preserved in this copy */;
}

// src/components/Switcher.tsx
import { useRouter } from "next/router";
import { useState as useState2 } from "react";
import { useLingui, Trans } from "@lingui/react";
var languages = {
  en: /*i18n*/ { id: "lYGfRP", message: "English" },
  sr: /*i18n*/ { id: "9aBtdW", message: "Serbian" },
  es: /*i18n*/ { id: "65A04M", message: "Spanish" }
};
function Switcher() {
  const router = useRouter();
  const { i18n: i18n2 } = useLingui();
  const [locale, setLocale] = useState2(router.locale.split("-")[0]);
  function handleChange(event) {
    const locale2 = event.target.value;
    setLocale(locale2);
    router.push(router.pathname, router.pathname, { locale: locale2 });
  }
  return /* JSX markup not preserved in this copy */;
}

// src/pages/index.page.tsx
import styles from "../styles/Index.module.css";

// src/utils.ts
import { i18n } from "@lingui/core";
import { useRouter as useRouter2 } from "next/router";
import { useEffect, useState as useState3 } from "react";
async function loadCatalog(locale, pathname) {
  if (pathname === "_error") {
    return {};
  }
  const catalog = await import(`@lingui/loader!./locales/src/pages/${pathname}.page/${locale}.po`);
  return catalog.messages;
}

// src/pages/index.page.tsx
import { useLingui as useLingui2 } from "@lingui/react";
var getStaticProps = async ctx => {
  const fileName = __filename;
  const cwd = process.cwd();
  const { locale } = ctx;
  const pathname = path.relative(cwd, fileName).replace(".next/server/pages/", "").replace(".js", "");
  const translation = await loadCatalog(locale || "en", pathname);
  return {
    props: {
      translation
    }
  };
};
var Index = () => {
  useLingui2();
  // JSX markup not preserved in this copy; the surviving fragments, in order:
  //   {/* The Next Head component is not being rendered in the React component tree and React Context is not being passed down to the components placed in the <Head>. That means we cannot use the <Trans> component here and instead have to use `t` macro. */}
  //   {i18n._( /*i18n*/ { id: "HjmF2U", message: "Translation Demo" })}
  //   Next.js!"} components={{ 0: }} />
  //   {i18n._( /*i18n*/ { id: "pOo4Aa", message: "Plain text" })}
  //   Next.jssay hi."} components={{ 0: }} />
  //   Next.jssay hi."} components={{ 0: }} />
  //   Next.jssay hi. And <1>Next.jssay hi."} components={{ 0: , 1: }} />
  return /* JSX markup not preserved in this copy */;
};
var index_page_default = Index;
export {
  index_page_default as default,
  getStaticProps
};
```

Note how the code of src/components/Developers.tsx appears without the actual runtime imports. The imports are somewhere below in the listing, but not in that scope of visibility. That's the reason why the extractor didn't pick them up.

The solution I'm actually thinking of is to change the order of processing in the following way:

  1. Write a plugin for esbuild which would invoke SWC + the Rust version of the macro for each file before it gets processed by esbuild.
  2. Get the bundles.
  3. Feed them to the JS extractor (no macro processing happens on the final bundles).

That makes the process safer and more future-proof, because the macro would process untouched files. On the other hand, this will definitely increase the time needed for processing.
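
The reordering proposed above could be sketched roughly as follows. This is only an illustration, not the project's implementation: the types are local stand-ins that mirror the `onLoad` part of esbuild's plugin API, and `readFile`, `expandMacro`, `loaderFor` and the plugin name are hypothetical placeholders for whatever Babel or SWC based macro step would actually be used.

```typescript
// Local structural types mirroring the small part of esbuild's plugin API used here.
type OnLoadArgs = { path: string };
type SourceLoader = "js" | "jsx" | "ts" | "tsx";
type OnLoadResult = { contents: string; loader: SourceLoader };
type PluginBuildLike = {
  onLoad(
    options: { filter: RegExp },
    callback: (args: OnLoadArgs) => Promise<OnLoadResult>
  ): void;
};
type EsbuildLikePlugin = { name: string; setup(build: PluginBuildLike): void };

// Hypothetical dependencies, injected so the sketch does not depend on Babel or SWC.
type MacroDeps = {
  readFile(path: string): Promise<string>;
  expandMacro(source: string, filename: string): Promise<string>;
};

// Pick a loader that leaves JSX and TypeScript syntax for esbuild to handle.
function loaderFor(path: string): SourceLoader {
  if (/\.tsx$/.test(path)) return "tsx";
  if (/\.ts$/.test(path)) return "ts";
  if (/\.jsx$/.test(path)) return "jsx";
  return "js";
}

function linguiMacroPlugin(deps: MacroDeps): EsbuildLikePlugin {
  return {
    name: "lingui-macro-per-file", // hypothetical plugin name
    setup(build) {
      // Expand the macro in each source file before esbuild renames or merges imports.
      build.onLoad({ filter: /\.(jsx?|tsx?|mjs)$/ }, async (args) => {
        const source = await deps.readFile(args.path);
        const contents = await deps.expandMacro(source, args.path);
        return { contents, loader: loaderFor(args.path) };
      });
    },
  };
}
```

With this shape, the bundle that reaches the extractor already contains runtime calls, so no macro pass over the final bundle would be needed.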

Another option could be to implement a separate AST transformation that hoists all Lingui imports to the very top of the file.

thekip commented 8 months ago

Continuing to develop the idea:

The SWC plugin unfortunately was not designed to work in the extractor pipeline. It always works in "production" mode, meaning it always removes the non-essential properties that the extractor actually uses. So some additional work would be required here. But to prove the theory and make a PoC, the Babel version could be used. I expect it to be quite slow, unfortunately.

About hoisting imports: this could fix this exact case but break in some others, so I would rather skip this option.

yunsii commented 8 months ago

This seems hard to fix, but we haven't run into this problem in our big project. I only found it when I created the reproduction 😂 So using extract-experimental in production doesn't seem like a good idea for now.

On the other hand, how about the first question: a space is lost when plain text follows a JSX element under Trans.

BTW, the code formatting in the "Esbuild bundle" section is broken.

thekip commented 8 months ago

> So using extract-experimental in production doesn't seem like a good idea for now.

Yes, it was created as a PoC, and there hasn't actually been much attention to it, so I could not get enough feedback to make it production-ready.

> On the other hand, how about the first question: a space is lost when plain text follows a JSX element under Trans.

I didn't investigate it, but I think the problem is similar: when code is processed by esbuild it gets reprinted, and the space might be dropped.

yunsii commented 8 months ago

@thekip Is there any plan to refactor the PoC to make it fully usable?

yunsii commented 8 months ago

> A space is lost when plain text follows a JSX element under Trans.

The issue also happens with lingui extract in some places, it's weird.

thekip commented 8 months ago

Actually, some spaces are deleted intentionally. You can read the unit tests to understand the cases. It would only be an issue if the final string differed between what the macro leaves in the code and what is extracted to the catalog.

yunsii commented 7 months ago

> This seems hard to fix, but we haven't run into this problem in our big project. I only found it when I created the reproduction 😂 So using extract-experimental in production doesn't seem like a good idea for now.
>
> On the other hand, how about the first question: a space is lost when plain text follows a JSX element under Trans.
>
> BTW, the code formatting in the "Esbuild bundle" section is broken.

How about making the extractor extract ids directly, regardless of module imports?

yunsii commented 7 months ago

@thekip I found another solution: use dependency-tree to get all of the entries' dependencies, then extract from all of the dependencies via AST, no esbuild required.

How about this idea?

thekip commented 7 months ago

I considered that solution in the beginning, and even looked into that exact library. I like the esbuild solution more because:

  1. It does tree shaking, so there is a better chance that the catalog will not contain unnecessary strings. If your project uses tons of barrel files (index.{ts,js} with re-exports), dependency tree resolution becomes useless without tree shaking.
  2. It's much faster.
  3. It's much easier to use, set up and maintain.

I still think we need to continue with esbuild. The problems mentioned in this issue are fixable, I just don't have time for that currently.
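
To make the barrel-file point concrete, here is a small sketch of plain dependency resolution without tree shaking. The module graph and file names are hypothetical, and `collectDependencies` is an illustrative traversal, not the API of dependency-tree or esbuild.

```typescript
// Hypothetical module graph: each file maps to the files it imports.
type ModuleGraph = Record<string, string[]>;

// Walk the graph from an entry and return every reachable file (no tree shaking).
function collectDependencies(moduleGraph: ModuleGraph, entry: string): string[] {
  const seen: Record<string, boolean> = {};
  const stack: string[] = [entry];
  while (stack.length > 0) {
    const file = stack.pop() as string;
    if (seen[file]) continue;
    seen[file] = true;
    for (const dep of moduleGraph[file] || []) stack.push(dep);
  }
  return Object.keys(seen).sort();
}

// The page only needs Button, but it imports it through the barrel file,
// which re-exports every component.
const exampleGraph: ModuleGraph = {
  "pages/index.tsx": ["components/index.ts"],
  "components/index.ts": [
    "components/Button.tsx",
    "components/Chart.tsx",
    "components/Modal.tsx",
  ],
  "components/Button.tsx": [],
  "components/Chart.tsx": [],
  "components/Modal.tsx": [],
};

const reachableFiles = collectDependencies(exampleGraph, "pages/index.tsx");
console.log(reachableFiles);
```

Every component behind the barrel ends up in the list, so strings from Chart and Modal would land in the page catalog even though the page never renders them. A bundler with tree shaking can drop the unused re-exports before extraction.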

semoal commented 5 months ago

@thekip, how far are we from turning this feature into something stable? If you detail which tasks are pending, I can probably help in my free time :) We noticed that our catalogs are getting large, and this splitting by page would be amazing.

thekip commented 5 months ago

Hi @semoal, I see the following tasks:

  1. Move the macro step so that it runs on each file before the file is processed by esbuild. (Right now Babel with the macro is applied at the last step, together with the extractor plugin, on the whole bundle.) This could be done quite easily by writing an esbuild plugin or taking an existing one.
  2. [Optional] Instead of applying Babel and Babel's macro version to each file, it's better to apply SWC + lingui-plugin. I expect a noticeable time overhead if point 1 is implemented, and using SWC instead of Babel should cut the time. Unfortunately, this requires changes in the lingui-swc plugin, since it doesn't create the /*i18n*/ comments which are required by the extractor.
  3. In some Discord discussions someone reported that esbuild tries to resolve scss files. I don't remember how it is implemented now, but I suppose we should mark every non-JS/JS-like file as "external" for simplicity.

I think these 3 are essential to at least unblock further adoption of the extractor.
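
For the third task, one possible shape is a helper that adds wildcard patterns to esbuild's `external` option, for example from the `resolveEsbuildOptions` hook of the experimental extractor. This is a sketch under assumptions: `EsbuildOptionsSubset` is a local stand-in for esbuild's `BuildOptions`, and both the extension list and the `markNonJsAsExternal` name are hypothetical.

```typescript
// Local stand-in for the one field of esbuild's BuildOptions used here.
type EsbuildOptionsSubset = { external?: string[] };

// Hypothetical list of non-JS extensions; adjust it to the project.
const NON_JS_EXTENSIONS = ["css", "scss", "sass", "less", "svg", "png"];

// Append a "*.ext" wildcard for each extension so esbuild does not try to bundle those files.
function markNonJsAsExternal(options: EsbuildOptionsSubset): EsbuildOptionsSubset {
  const patterns = NON_JS_EXTENSIONS.map((ext) => "*." + ext);
  options.external = (options.external || []).concat(patterns);
  return options;
}

console.log(markNonJsAsExternal({ external: ["react"] }).external);
```

Existing `external` entries are kept, so the helper can be combined with other adjustments made in the same hook.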

From my side, I can help with the SWC Rust plugin and can explain how the extractor works now.

After the basic functionality becomes stable, we can do performance optimizations:

  1. [Multithreading] Run the extraction process for each bundle in a separate worker
  2. Move the new extractor to a separate package (so that people don't download and install esbuild & swc if they don't use it)

thekip commented 4 months ago

An update here:

> Extracted message id is incorrect when Trans children have plain text following a JSX element: the space is lost

I created a separate discussion, please add your ideas there.

> Dependency tree related files (AboutText.tsx/Developers.tsx) are not extracted in the reproduction, but Switcher.tsx is

This one is fixed in https://github.com/lingui/js-lingui/pull/1867; a separate test case was added.

thekip commented 4 months ago

> Extracted message id is incorrect when Trans children have plain text following a JSX element: the space is lost

Fixed: https://github.com/lingui/js-lingui/pull/1882

yunsii commented 4 months ago

@thekip Thanks for your great work. I will try lingui extract-experimental tomorrow, and I'm also waiting for https://github.com/lingui/js-lingui/pull/1882

yunsii commented 4 months ago

@thekip Unfortunately, with 4.8.0-next.1, lingui extract-experimental does not work when calling a curried function directly, like below:

  const result = curringFoo()()
  console.log('curringFoo', result)

Throw error:

SyntaxError: /tmp/js-lingui-extract-cEKHmj/src/pages/index.page.jsx: Unsupported macro usage. Please check the examples at https://lingui.dev/ref/macro#examples-of-js-macros.
 If you think this is a bug, fill in an issue at https://github.com/lingui/js-lingui/issues

 Error: Cannot read properties of undefined (reading 'isExpression')
  101 | var Index = () => {
  102 |   useLingui2();
> 103 |   const result = curringFoo()();
      |                  ^^^^^^^^^^^^^^
  104 |   console.log("curringFoo", result);
  105 |   return <div className={styles.container}>
  106 |     <Head>
    at File.buildCodeFrameError (/home/my/projects/lingui-examples-nextjs-swc-1797/node_modules/.pnpm/@babel+core@7.23.2/node_modules/@babel/core/lib/transformation/file/file.js:205:12)
    at NodePath.buildCodeFrameError (/home/my/projects/lingui-examples-nextjs-swc-1797/node_modules/.pnpm/@babel+traverse@7.23.2/node_modules/@babel/traverse/lib/path/index.js:101:21)

Reproduction: https://github.com/yunsii/lingui-examples-nextjs-swc-1797/blob/0949ea46481695df7d24056efe98c98dbef6e691/src/pages/index.page.tsx#L39-L40

But it works when I call the curried function step by step, like this:

  const curryingBar = curringFoo()
  const result = curryingBar()
  console.log('curringFoo', result)

thekip commented 4 months ago

Interesting, thanks for the report. That is probably related to useLingui and this refactoring. I will take a look.

chrischen commented 4 months ago

I use .mjs files, but in imports I omit the file extension, such as import foo from "./bar";, and extract-experimental gives 'Could not resolve "./bar"' errors.

thekip commented 3 months ago

https://esbuild.github.io/api/#resolve-extensions

// lingui.config.ts

 experimental: {
    extractor: {
      /// ...
      resolveEsbuildOptions: (options: import('esbuild').BuildOptions) => {
        options.resolveExtensions = ['.ts', '.js', '.jsx', '.tsx', '.mjs'];
        return options;
      }
    },
  },

chrischen commented 3 months ago

It's extracting messages from some of the detected pages, but for some reason it fails on this one. There is no Event.jsx (just an Event.mjs file). It's the same as the other files which are being extracted. Getting rid of the Event.mjs file removes the error and the other files still extract fine, so it must be something within my Event.mjs file (changing the name doesn't do anything).

Cannot process file ../../../../var/folders/x3/yd9k6w0x2b1b9z5j6grkmll80000gp/T/js-lingui-extract-91fcwL/src/components/pages/Event.jsx /var/folders/x3/yd9k6w0x2b1b9z5j6grkmll80000gp/T/js-lingui-extract-91fcwL/src/components/pages/Event.jsx: Cannot read properties of undefined (reading 'name')
TypeError: /var/folders/x3/yd9k6w0x2b1b9z5j6grkmll80000gp/T/js-lingui-extract-91fcwL/src/components/pages/Event.jsx: Cannot read properties of undefined (reading 'name')

I also would like to take the opportunity to propose an alternate strategy to build-time resolution of dependencies. Currently I use Relay, and I have to compose data requirements from smaller components up until the final page component where it collects all the data requirements. But Relay doesn't auto-detect the data requirements at the page level, but rather has the developer statically define the data dependencies of the sub-components for every level of composed component.

For example, a UserProfilePage component may be composed of UserName, UserEmail, etc. UserName and UserEmail each define the data requirements they need, and in UserProfilePage you define (via GraphQL fragments) the sum of the data requirements of its constituent components.

I attempted a similar strategy with Lingui where each component file defines its own loadMessages function which will load the message bundle given the locale, and then call the lingui api to load it and merge the messages, and return this as a promise. So for example UserName and UserEmail indicate that they need to load some ../../locales/en/User[Name/Email].ts file and merge the messages into the Lingui context. The UserProfilePage component then composes the requirements of these two child components, and then when the page route is loaded it executes all its loadMessage functions (provided as an array of Promises). The UserProfilePage render function then uses React Suspense to wait for the message bundles to load first.
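
The composition described here could look roughly like the sketch below. It is only an illustration: the loader arrays, message ids and `loadPageMessages` are hypothetical, and the inline message objects stand in for the dynamic import of a per-component catalog file and the Lingui call that merges it.

```typescript
type MessageBundle = Record<string, string>;
type LoadMessages = (locale: string) => Promise<MessageBundle>;

// Each leaf component declares the loaders it needs. In a real project each loader
// would dynamically import that component's compiled catalog for the given locale.
const userNameLoaders: LoadMessages[] = [
  async (locale) => ({ "UserName.label": "Name [" + locale + "]" }),
];
const userEmailLoaders: LoadMessages[] = [
  async (locale) => ({ "UserEmail.label": "Email [" + locale + "]" }),
];

// The page statically composes the requirements of its children,
// much like a Relay page spreads the fragments of its child components.
const userProfilePageLoaders: LoadMessages[] = userNameLoaders.concat(userEmailLoaders);

// Run all loaders for a route and merge the bundles into one set of messages.
async function loadPageMessages(
  loaders: LoadMessages[],
  locale: string
): Promise<MessageBundle> {
  const bundles = await Promise.all(loaders.map((load) => load(locale)));
  const merged: MessageBundle = {};
  for (const bundle of bundles) {
    Object.keys(bundle).forEach((id) => {
      const text = bundle[id];
      if (text !== undefined) merged[id] = text;
    });
  }
  return merged;
}

loadPageMessages(userProfilePageLoaders, "en").then((messages) => {
  console.log(Object.keys(messages));
});
```

The weak spot of this design is that nothing checks the composition: if a page forgets to include a child's loaders, the code still compiles and the translation is simply missing at runtime.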

This actually works fine with current Lingui as is. I just tell the Lingui extractor to split the message bundles into one file per component. The only problem is that there are duplicate translations in the .po files. I don't think it's an issue that there are duplicate entries in the compiled .ts files, but duplicate entries in .po files means it's tough to translate without deduplication.

Is there a way to have it generate a merged .po file for a translator to use, and then during compile separate that back into individual per-component .ts files? If not, what do you think of supporting this feature to support component-based bundle loading?

thekip commented 3 months ago

> Getting rid of the Event.mjs file removes the error and the other files still extract fine, so it must be something within my Event.mjs file (changing the name doesn't do anything).

There is something in the bundle itself. If you can share it (maybe privately), I will fix it.

Regarding your strategy: that sounds interesting, and I think it could be an option for some people. It would be nice if you could share it with the community.

I also use Relay in my project, and I understand what you're talking about. However, I'm not sure this approach would suit all developers. Relay, thanks to the GraphQL language and fragment masking (and types, if you use TS), keeps the developer experience at a very high level; unfortunately, achieving the same experience with Lingui would be problematic. Without that support, working with your strategy would lead to a lot of user errors in subsequent iterations. What if you forget to add a dependency to a parent component? (In Relay you realize this quickly, thanks to types and masking.) What if you delete a component but forget to delete it from the dependencies? (Relay has an ESLint plugin that analyzes the GraphQL fragments and the code and shows an error in this case.)

> I don't think it's an issue that there are duplicate entries in the compiled .ts files, but duplicate entries in .po files means it's tough to translate without deduplication.

Any TMS will help you with this. Deduplication should happen at the level of the TMS, not at the level of individual files.

The same challenge would exist with the experimental extractor as well: messages might be duplicated between pages.

chrischen commented 3 months ago

Yes, I agree the problem would arise where we forget to call some loadMessages() of a child component; in the best case the translation never shows, and in the worst case it doesn't render and permanently suspends.

Actually here is the file. It is generated by the ReScript compiler so it's not hand written JS.

// Generated by ReScript, PLEASE EDIT WITH CARE

import * as RelayEnv from "../../entry/RelayEnv.mjs";
import * as EventRsvps from "../organisms/EventRsvps.mjs";
import * as Core__Option from "@rescript/core/src/Core__Option.mjs";
import * as Core from "@linaria/core";
import * as RelayRuntime from "relay-runtime";
import * as ViewerRsvpStatus from "../organisms/ViewerRsvpStatus.mjs";
import * as ReactRouterDom from "react-router-dom";
import * as JsxRuntime from "react/jsx-runtime";
import * as EventQuery_graphql from "../../__generated__/EventQuery_graphql.mjs";
import * as RescriptRelay_Query from "rescript-relay/src/RescriptRelay_Query.mjs";
import * as AppContext from "../layouts/appContext";
import * as RescriptRelay_Mutation from "rescript-relay/src/RescriptRelay_Mutation.mjs";
import * as EventJoinMutation_graphql from "../../__generated__/EventJoinMutation_graphql.mjs";
import * as EventLeaveMutation_graphql from "../../__generated__/EventLeaveMutation_graphql.mjs";

import { css, cx } from '@linaria/core'
;

import { t } from '@lingui/macro'
;

var convertVariables = EventQuery_graphql.Internal.convertVariables;

var convertResponse = EventQuery_graphql.Internal.convertResponse;

var convertWrapRawResponse = EventQuery_graphql.Internal.convertWrapRawResponse;

var use = RescriptRelay_Query.useQuery(convertVariables, EventQuery_graphql.node, convertResponse);

var useLoader = RescriptRelay_Query.useLoader(convertVariables, EventQuery_graphql.node, (function (prim) {
        return prim;
      }));

var usePreloaded = RescriptRelay_Query.usePreloaded(EventQuery_graphql.node, convertResponse, (function (prim) {
        return prim;
      }));

var $$fetch = RescriptRelay_Query.$$fetch(EventQuery_graphql.node, convertResponse, convertVariables);

var fetchPromised = RescriptRelay_Query.fetchPromised(EventQuery_graphql.node, convertResponse, convertVariables);

var retain = RescriptRelay_Query.retain(EventQuery_graphql.node, convertVariables);

var EventQuery = {
  Operation: undefined,
  Types: undefined,
  convertVariables: convertVariables,
  convertResponse: convertResponse,
  convertWrapRawResponse: convertWrapRawResponse,
  use: use,
  useLoader: useLoader,
  usePreloaded: usePreloaded,
  $$fetch: $$fetch,
  fetchPromised: fetchPromised,
  retain: retain
};

var convertVariables$1 = EventJoinMutation_graphql.Internal.convertVariables;

var convertResponse$1 = EventJoinMutation_graphql.Internal.convertResponse;

var convertWrapRawResponse$1 = EventJoinMutation_graphql.Internal.convertWrapRawResponse;

var commitMutation = RescriptRelay_Mutation.commitMutation(convertVariables$1, EventJoinMutation_graphql.node, convertResponse$1, convertWrapRawResponse$1);

var use$1 = RescriptRelay_Mutation.useMutation(convertVariables$1, EventJoinMutation_graphql.node, convertResponse$1, convertWrapRawResponse$1);

var EventJoinMutation = {
  Operation: undefined,
  Types: undefined,
  convertVariables: convertVariables$1,
  convertResponse: convertResponse$1,
  convertWrapRawResponse: convertWrapRawResponse$1,
  commitMutation: commitMutation,
  use: use$1
};

var convertVariables$2 = EventLeaveMutation_graphql.Internal.convertVariables;

var convertResponse$2 = EventLeaveMutation_graphql.Internal.convertResponse;

var convertWrapRawResponse$2 = EventLeaveMutation_graphql.Internal.convertWrapRawResponse;

var commitMutation$1 = RescriptRelay_Mutation.commitMutation(convertVariables$2, EventLeaveMutation_graphql.node, convertResponse$2, convertWrapRawResponse$2);

var use$2 = RescriptRelay_Mutation.useMutation(convertVariables$2, EventLeaveMutation_graphql.node, convertResponse$2, convertWrapRawResponse$2);

var EventLeaveMutation = {
  Operation: undefined,
  Types: undefined,
  convertVariables: convertVariables$2,
  convertResponse: convertResponse$2,
  convertWrapRawResponse: convertWrapRawResponse$2,
  commitMutation: commitMutation$1,
  use: use$2
};

var sessionContext = AppContext.SessionContext;

function $$Event(props) {
  var query = ReactRouterDom.useLoaderData();
  var match = usePreloaded(query.data);
  var match$1 = use$2(undefined);
  var commitMutationLeave = match$1[0];
  var match$2 = use$1(undefined);
  var commitMutationJoin = match$2[0];
  return Core__Option.getOr(Core__Option.map(match.event, (function ($$event) {
                    var __id = $$event.__id;
                    var onJoin = function (param) {
                      var connectionId = RelayRuntime.ConnectionHandler.getConnectionID(__id, "EventRsvps_event_rsvps", undefined);
                      commitMutationJoin({
                            connections: [connectionId],
                            id: __id
                          }, undefined, undefined, undefined, undefined, undefined, undefined);
                    };
                    var onLeave = function (param) {
                      var connectionId = RelayRuntime.ConnectionHandler.getConnectionID(__id, "EventRsvps_event_rsvps", undefined);
                      commitMutationLeave({
                            connections: [connectionId],
                            id: $$event.__id
                          }, undefined, undefined, undefined, undefined, undefined, undefined);
                    };
                    return JsxRuntime.jsx(ReactRouterDom.Await, {
                                children: JsxRuntime.jsxs("div", {
                                      children: [
                                        JsxRuntime.jsxs("h1", {
                                              children: [
                                                (t`Event:`),
                                                " ",
                                                Core__Option.getOr(Core__Option.map($$event.title, (function (prim) {
                                                            return prim;
                                                          })), null)
                                              ]
                                            }),
                                        JsxRuntime.jsx("div", {
                                              className: Core.cx("grid", "grid-cols-1", "gap-y-10", "sm:grid-cols-2", "gap-x-6", "lg:grid-cols-3", "xl:gap-x-8")
                                            }),
                                        JsxRuntime.jsx(ViewerRsvpStatus.make, {
                                              onJoin: onJoin,
                                              onLeave: onLeave,
                                              joined: true
                                            }),
                                        JsxRuntime.jsx(EventRsvps.make, {
                                              event: $$event.fragmentRefs
                                            })
                                      ],
                                      className: "bg-white"
                                    }),
                                resolve: query.messages,
                                errorElement: "Error loading"
                              });
                  })), JsxRuntime.jsx("div", {
                  children: "Event Doesn't Exist"
                }));
}

var LoaderArgs = {};

function loader(param) {
  var params = param.params;
  var url = new URL(param.request.url);
  Core__Option.getOr(params.lang, "en");
  var after = url.searchParams.get("after");
  var before = url.searchParams.get("before");
  return ReactRouterDom.defer({
              data: Core__Option.map(RelayEnv.getRelayEnv(param.context, import.meta.env.SSR), (function (env) {
                      return EventQuery_graphql.load(env, {
                                  after: after,
                                  before: before,
                                  eventId: params.eventId,
                                  first: 20
                                }, "store-or-network", undefined, undefined);
                    }))
            });
}

var make = $$Event;

var $$default = $$Event;

var Component = $$Event;

export {
  EventQuery ,
  EventJoinMutation ,
  EventLeaveMutation ,
  sessionContext ,
  make ,
  $$default ,
  $$default as default,
  Component ,
  LoaderArgs ,
  loader ,
}
/*  Not a pure module */

thekip commented 3 months ago

Thanks for the reply. You need to send this file: "Cannot process file ../../../../var/folders/x3/yd9k6w0x2b1b9z5j6grkmll80000gp/T/js-lingui-extract-91fcwL/src/components/pages/Event.jsx /var/folders/x3/yd9k6w0x2b1b9z5j6grkmll80000gp/T/js-lingui-extract-91fcwL/src/components/pages/Event.jsx: Cannot read properties of undefined (reading 'name')"

This would be the bundle created from Event.mjs.

chrischen commented 3 months ago

That file doesn't exist, and there are no references to .jsx or .tsx. The other files that extract fine are compiled via the same process. I can send those if you'd like. JSX is not used, and neither is TSX. I am pretty much working exclusively with non-JSX React files like that .mjs file, using function calls. That's why I am confused about why it's trying to pull the Event.jsx file, and only in that case.

thekip commented 3 months ago

It's because this file is created by the underlying esbuild and stored in a temporary folder. It's created with a .jsx extension. It does not exist because the CLI was able to clean it up even in case of an error.

Would you mind patching the CLI sources in node_modules and trying again?

In the file @lingui/cli/dist/lingui-extract-experimental.js, you need to delete this line:

    // cleanup temp directory
    await promises_1.default.rm(tempDir, { recursive: true, force: true });

thekip commented 3 months ago

@chrischen by the way, if you are trying to feed code generated by ReScript to the Lingui extractor and this code looks like what you posted, the Lingui extractor will not be able to extract strings from JSX elements such as <Trans>, because the actual JSX is already transpiled to function calls. I'm not familiar with ReScript, but if there is an option to preserve JSX, that would help.

chrischen commented 3 months ago

Hi,

I am using the t function call macro instead since Rescript indeed cannot preserve JSX.

I'll apply the patch, try again tomorrow at work, and report back. I'm also now getting some extraction errors in normal mode as well, with certain .mjs files but not others.

Cannot process file src/components/organisms/EventRsvps.mjs src/components/organisms/EventRsvps.mjs: Cannot read properties of undefined (reading 'name')
TypeError: src/components/organisms/EventRsvps.mjs: Cannot read properties of undefined (reading 'name')

Thanks


yunsii commented 3 months ago

@thekip Unfortunately, another problem. When extract-experimental is used with a TypeScript enum, the generated msgid and msgstr use the enum value rather than the variable name. Here is my repro commit: https://github.com/yunsii/lingui-examples-nextjs-swc-1797/commit/fcc181661ee075438adc2d06e2ddbabb82685fec

Code:

<div>{t`Hello ${Test.Foo}`}</div>
<div>{t`Hello ${Test.Bar}`}</div>

PO:

#: src/pages/index.page.tsx:54
msgid "Hello {bar bar}"
msgstr "Hello {bar bar}"

#: src/pages/index.page.tsx:53
msgid "Hello {foo}"
msgstr "Hello {foo}"

yunsii commented 3 months ago

@thekip Any plan to fix it?

thekip commented 3 months ago

@yunsii unfortunately I realized that the strategy I went with, bundling everything and then applying macro + extractor, will not work properly even with all the changes I've made over the past few months.

In simple cases it might work, but in more complicated cases like the one you showed it can break.

Any bundler might rename or restructure identifier references, and that would cause a mismatch between runtime code and extraction. A simple example:

import { t } from "@lingui/macro";
import { name } from "./names";

t`Hello ${name}` // extracts to "Hello {name}"

This could be bundled and changed to:

// after esbuild
import { t } from "@lingui/macro";
import _names1 from "./names";

t`Hello ${_names1.name}` // extracts to "Hello {0}"

So the only way to go is to process every file with the macro before it gets to the bundler.
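
The mismatch in the example above can be illustrated with a small sketch. It is a simplified model, not the actual Lingui implementation: it assumes that a bare identifier becomes a named placeholder and that any other expression becomes a positional one. The `toMessage` helper and its arguments are hypothetical names used only for illustration.

```js
// Simplified model of placeholder naming; NOT the real Lingui implementation.
// Assumption: a bare identifier becomes a named placeholder, anything else
// (member access, calls, ...) becomes a positional one.
const IDENTIFIER = /^[A-Za-z_$][A-Za-z0-9_$]*$/;

// `strings` are the static chunks of the template literal,
// `expressions` are the source text of each interpolated expression.
function toMessage(strings, expressions) {
  let position = 0;
  return strings.reduce((message, chunk, i) => {
    if (i >= expressions.length) return message + chunk;
    const expr = expressions[i];
    const placeholder = IDENTIFIER.test(expr) ? expr : String(position++);
    return `${message}${chunk}{${placeholder}}`;
  }, "");
}

// What the macro sees in the original source file.
const beforeBundling = toMessage(["Hello ", ""], ["name"]);
// What the macro sees after a bundler rewrote the import.
const afterBundling = toMessage(["Hello ", ""], ["_names1.name"]);

console.log(beforeBundling); // Hello {name}
console.log(afterBundling); // Hello {0}
```

The two message strings differ, so a catalog extracted from the bundle would not match what the macro produces from the original source at build time.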

Rspack now supports SWC plugins, and I'm considering replacing the underlying esbuild with Rspack plus the Lingui SWC plugin, to avoid communication between runtimes, which would be the case with esbuild (esbuild <-> js <-> babel / swc).

yunsii commented 3 months ago

Wow, it seems really hard. But why does lingui extract work well?

thekip commented 3 months ago

Because it doesn't have any preprocessing step (esbuild) and consumes the sources directly.

AndrewIngram commented 3 months ago

RsPack sounds like a good approach, I was thinking along similar lines except using Turbopack.

yunsii commented 3 months ago

I have to wait for the brand new approach 😂

AndrewIngram commented 2 months ago

@thekip if you need any extra hands on this, let me know. We're currently in the uncomfortable position of only really being able to use the tree-based approach, but buckling under the extremely slow extraction times in our codebase. We may end up trying to build our own extractor as an interim measure anyway.

thekip commented 2 months ago

@AndrewIngram yes, I do need extra hands on this. Could you measure what exactly is consuming so much time? I think the slowest parts would be parsing the bundles with babel and the subsequent extraction. With the tree-based approach, the same files could end up in many bundles and would be parsed many times, compared to the plain file-by-file approach.

I see a few options there:

AndrewIngram commented 2 months ago

Right now our setup is to have one entry point (we generate a dummy page that imports all the other pages), we do this because ultimately we think one message bundle per page will be too granular, so we're starting from the other end.

We've overridden a lot of the esbuild config to:

This got the baseline extraction times down to something reasonable, i.e. when there's only a handful of strings it's fairly quick. But as we've been instrumenting things, it's been getting progressively slower. I've dug into where the bottleneck is, and it's the babel-plugin-macro step: of the 1 minute extract, it consistently takes about 55 seconds.

One thing I tried yesterday was to use swc to perform that step, before feeding it into babel for the final string extraction. But there's no way to configure swc to preserve JSX, so it was only able to pick up non-JSX strings -- though it did seem to be faster. If the final extraction plugin was able to operate on the post-JSX version of the AST, I think this would work -- it would also mean an Rspack-based extractor might be viable, because using a 2nd babel transform for the final step doesn't appear to be a bottleneck.

thekip commented 2 months ago

and it's the babel-plugin-macro step, of the 1 minute extract, it consistently takes about 55 seconds.

How did you measure it? I just want to be sure that this is caused exactly by this plugin and not by the parsing by babel in general.

thekip commented 2 months ago

You could also try the version from the next branch. That version does not use babel-plugin-macro for the macro, and instead implements a good old plugin. I don't believe babel-plugin-macro adds significant overhead itself; rather, the transformations in the Lingui macro could be slow. But it's worth a try.

AndrewIngram commented 2 months ago

How did you measure it? I just want to be sure that this is caused exactly by this plugin and not by the parsing by babel in general.

I wasn't too scientific, I was just shoving logs in at various points in the process and this was the step that took most of the time. But I can try and get something more concrete
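
For something more concrete than ad hoc logs, a tiny timing wrapper along these lines could be dropped around each stage. This is only a sketch: `timed`, `timings` and `report` are hypothetical helpers, not part of Lingui or babel, and the wrapper as written only measures synchronous steps (for example a synchronous babel transform).

```js
// Hypothetical helper for timing pipeline stages; names are illustrative.
// Accumulates wall-clock time per label so repeated calls add up.
const timings = new Map();

// Runs a synchronous step and records how long it took, even if it throws.
function timed(label, fn) {
  const start = performance.now();
  try {
    return fn();
  } finally {
    const elapsed = performance.now() - start;
    timings.set(label, (timings.get(label) ?? 0) + elapsed);
  }
}

// Formats the accumulated timings as "label: 1.23s" lines.
function report() {
  return [...timings].map(
    ([label, ms]) => `${label}: ${(ms / 1000).toFixed(2)}s`
  );
}
```

Wrapping the parse step and the macro step under separate labels, then printing `report()` at the end, would show whether the time goes to babel parsing in general or to the macro plugin specifically.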

AndrewIngram commented 2 months ago

I hit the wall I was expecting with Rspack -- Can generate the tree-shaken bundle with the swc plugin used, but as there's no way to get swc to preserve JSX, I can't feed it into babel for the final extraction step.

thekip commented 2 months ago

Yes, Rspack is not the way to go, at least for now. The SWC plugin has to be changed in some way to work in extraction mode (it doesn't have 100% feature parity with the babel version). Also, it could be modified to pick up non-macro Trans components and transform them into message descriptors, so they would be picked up by the extractor even if the JSX is not preserved (something like this for reference)

Anyway, now that you've discovered that SWC cannot preserve JSX, I think going the Rspack way would be quite a long path, and I don't have the bandwidth for that now. Let's stick with esbuild and babel. Could you check on your project whether all this time is spent in babel parsing or in the macro plugin itself? You can alter the sources in node_modules and simply comment out the macro while leaving babel itself in place.

AndrewIngram commented 2 months ago

Without the Macro, the babel step takes ~3.5 seconds, with it, it takes ~66 seconds.

thekip commented 2 months ago

Wow, ok. Let's check then: does this change https://github.com/lingui/js-lingui/pull/1867 help? (You can build the next branch yourself by following these instructions.)

The problem is, I don't have a big enough project to play with. If you can point me to an open-source project that reproduces the performance issue, I will try to debug the performance on my own.

semoal commented 2 months ago

Wow, ok. Let's check than, does this change #1867 helps? (you can build a next branch by yourself following these instructions.

The problem is, i don't have a big enough project to play with. If you can point me to the one of the opensource projects which may reproduce the performance issue, i will try to debug performance on my own.

https://github.com/bluesky-social/social-app contains a lingui.config.js, probably is not hard to set-up

AndrewIngram commented 2 months ago

Wow, ok. Let's check than, does this change #1867 helps? (you can build a next branch by yourself following these instructions. The problem is, i don't have a big enough project to play with. If you can point me to the one of the opensource projects which may reproduce the performance issue, i will try to debug performance on my own.

https://github.com/bluesky-social/social-app contains a lingui.config.js, probably is not hard to set-up

Bluesky isn't using the experimental extractor though, so it's unlikely they have any individually huge files to transform. In my case the bundle file is an absolute beast, approx 25 MB, even though the number of instrumented strings is still relatively low (approx 500).

AndrewIngram commented 2 months ago

Wow, ok. Let's check than, does this change #1867 helps? (you can build a next branch by yourself following these instructions.

The problem is, i don't have a big enough project to play with. If you can point me to the one of the opensource projects which may reproduce the performance issue, i will try to debug performance on my own.

On 4.7.1

Catalog statistics for pages/_i18nEntry.tsx:
┌────────────────┬─────────────┬─────────┐
│ Language       │ Total count │ Missing │
├────────────────┼─────────────┼─────────┤
│ de-DE          │     472     │   472   │
│ en-US (source) │     472     │    -    │
│ en-GB          │     472     │   472   │
│ en-PL          │     472     │   472   │
│ fr-FR          │     472     │   472   │
│ ja-JP          │     472     │   15    │
└────────────────┴─────────────┴─────────┘

pnpm --filter=app-dashboard lang:extract  128.83s user 9.58s system 134% cpu 1:42.79 total

On Next

Catalog statistics for pages/_i18nEntry.tsx:
┌────────────────┬─────────────┬─────────┐
│ Language       │ Total count │ Missing │
├────────────────┼─────────────┼─────────┤
│ de-DE          │     472     │   472   │
│ en-US (source) │     472     │    -    │
│ en-GB          │     472     │   472   │
│ en-PL          │     472     │   472   │
│ fr-FR          │     472     │   472   │
│ ja-JP          │     472     │   15    │
└────────────────┴─────────────┴─────────┘

pnpm --filter=app-dashboard lang:extract  15.20s user 2.39s system 198% cpu 8.874 total

So yeah, you could call that an improvement 😅

AndrewIngram commented 2 months ago

Use esbuild just to crawl and invoke the extractor on every file esbuild processes (write a plugin). This opens up the possibility of a cache: if a file was already processed, it wouldn't be processed again. But this eliminates tree shaking, so the resulting catalogs may have more messages than you're expecting (if you use the barrel index.js approach, this may hurt you).

I'm going to explore this avenue more -- i.e an esbuild plugin that conditionally invokes babel, even if it involves a performance hit at the bundling stage, it feels like the only way to mitigate all the other issues people have raised -- short of rewriting everything for swc.
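
The caching part of the quoted option could look roughly like the sketch below. It is an illustration under assumptions, not code from the Lingui CLI: `createCachedExtractor` is a hypothetical name, and `extractFromSource` stands in for whatever per-file extraction step is used (for example a babel transform called from an esbuild `onLoad` callback).

```js
// Sketch only: `extractFromSource` stands in for the real per-file
// extraction step; it is not a Lingui API.
function createCachedExtractor(extractFromSource) {
  const cache = new Map(); // file path -> { source, messages }

  return function extract(path, source) {
    const hit = cache.get(path);
    // Reuse the previous result when the file content has not changed.
    if (hit && hit.source === source) return hit.messages;

    const messages = extractFromSource(path, source);
    cache.set(path, { source, messages });
    return messages;
  };
}
```

Keying on the path and comparing the full source keeps the sketch dependency-free; a content hash would serve the same purpose with less memory. As the quoted option notes, extracting per file this way gives up tree shaking on its own.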

AndrewIngram commented 1 month ago

Okay, in my project i've cobbled together a custom tree shaking extractor, that works as follows:

Despite adding the babel macro plugin to the esbuild step, moving it out of the final extractor results in a net perf gain of approximately 5x (on one package, the extraction time goes from 160 seconds to 35 seconds).

This also fixes many of the issues people have flagged with the experimental extractor (strings not getting extracted, being extracted incorrectly, or interpolated variables having the wrong names).

I expect to see even faster perf if I switch to the babel plugin that's not built on babel-plugin-macros

Another idea that might further boost perf is to run the extractor plugin during esbuild too, and then filter the messages to just those whose ID is found in the final generated bundle file (given the pseudorandom ID format, just grepping the file should be sufficient -- no 2nd babel pass needed).
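
A minimal sketch of that filtering step, assuming the extracted messages are held in an object keyed by generated ID and that those IDs appear verbatim in the bundle output. `filterByBundle` and the sample IDs are hypothetical.

```js
// Assumption: `messages` maps generated message IDs to catalog entries, and
// the IDs appear verbatim as string literals in the final bundle output.
function filterByBundle(messages, bundleSource) {
  return Object.fromEntries(
    Object.entries(messages).filter(([id]) => bundleSource.includes(id))
  );
}

// Made-up IDs and bundle text, for illustration only.
const extracted = {
  a1B2c3: { message: "Hello {name}" },
  z9Y8x7: { message: "Unused string" },
};
const bundle = 'i18n._({ id: "a1B2c3", values: { name } });';

console.log(filterByBundle(extracted, bundle)); // keeps only a1B2c3
```

A plain substring check can produce false positives if an ID happens to occur inside unrelated text, so matching the quoted literal would be a stricter variant.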

thekip commented 1 month ago

Yep, that is exactly what I wanted to do: move the macro to a per-file basis. Would you mind sharing your solution in a PR? I could then help you finalize it and get it merged. You can DM me on Discord if you have any questions about contributing.

I also patched @lingui/babel-plugin-extract-messages to change localTransComponentName to be a set of any seen Trans component import aliases

This is also fixed in the next branch, in the same PR that introduces the standalone macro plugin.

Another idea that might further boost perf is to run the extractor plugin during esbuild too, and then filter the messages to just those whose ID is found in the final generated bundle file (given the pseudorandom ID format, just grepping the file should be sufficient -- no 2nd babel pass needed).

That might not work for non-macro use cases, because they have to provide IDs themselves. But the idea could be expanded. For example, the extractor could annotate extracted symbols with a specific mark (say, a generated ID) and then do what you suggested with grepping,