boxblinkracer / phpunuhi

PHPUnuhi - The easy composable framework to validate and manage translations
MIT License

The translate command is not designed for large amounts of data (Shopware6) #54

Open momocode-de opened 3 months ago

momocode-de commented 3 months ago

We would like to gradually translate our Shopware 6 shop and have tested the translate command for this purpose. Unfortunately, we have discovered that the command is not designed for a large amount of data.

We have created a configuration file that contains a translation set for each entity that we want to translate. First of all, we only want to translate the translation set for the "product" entity. To do this, we use the "--set" option of the command. This is where the first problem arises:

The command calls $configLoader->load, which loads the configuration from the selected configuration file. This function also loads all existing translations of all sets, so the "--set" option passed to the command is not taken into account at this point. With a large amount of data, loading the existing translations alone takes an extremely long time, and loading everything into memory at once is of course also very bad. Apart from that, it makes no sense to me that loading the existing translations is part of the "ConfigurationLoader". The existing translations should be loaded shortly before the translation process.
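To illustrate the separation described above, here is a minimal sketch in which a set only loads its existing translations on first access, and the "--set" filter is applied before anything is loaded. This is not PHPUnuhi's actual code: `LazyTranslationSet`, `filterSets`, and the loader closures are hypothetical names made up for this example.

```php
<?php

// Hypothetical stand-in for one configured translation set. The loader
// closure represents the expensive step (e.g. reading all rows from the DB).
final class LazyTranslationSet
{
    public string $name;
    private \Closure $loader;
    private ?array $translations = null;

    public function __construct(string $name, \Closure $loader)
    {
        $this->name = $name;
        $this->loader = $loader;
    }

    public function getTranslations(): array
    {
        // Load on first access only, then reuse the cached result.
        if ($this->translations === null) {
            $this->translations = ($this->loader)();
        }

        return $this->translations;
    }
}

/**
 * Keeps only the set selected via "--set"; null or '' means "all sets".
 *
 * @param LazyTranslationSet[] $sets
 * @return LazyTranslationSet[]
 */
function filterSets(array $sets, ?string $selectedSet): array
{
    if ($selectedSet === null || $selectedSet === '') {
        return $sets;
    }

    return array_values(array_filter(
        $sets,
        fn (LazyTranslationSet $set): bool => $set->name === $selectedSet
    ));
}

// Usage: record which sets actually hit the (simulated) storage.
$loadedSets = [];
$sets = [
    new LazyTranslationSet('product', function () use (&$loadedSets): array {
        $loadedSets[] = 'product';
        return ['p1.name' => 'Tisch'];
    }),
    new LazyTranslationSet('category', function () use (&$loadedSets): array {
        $loadedSets[] = 'category';
        return ['c1.name' => 'Möbel'];
    }),
];

foreach (filterSets($sets, 'product') as $set) {
    $set->getTranslations();
}
```

With this structure, parsing the configuration stays cheap, and the cost of loading translations is only paid for the sets that are actually processed.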

After the configuration has been loaded, the command iterates over each set, and only at this point is the "--set" option taken into account. Then comes the next problem:

The translation for each item of the selected set is then fetched and kept in memory. Only when all translations have been loaded into memory are they all saved to the database with a single SQL statement. As we have over 20,000 products, this approach is very impractical. Imagine you load 20,000 translations via OpenAI, for example, and then an error occurs when saving. Then all the API requests were in vain and you may have paid a lot for them.

It would be much better if the existing translations were loaded in small batches (e.g. 50 per batch), then translated and saved directly. I know that some things have to be changed for this, but unfortunately the current procedure cannot be used with a large amount of data.
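The batch approach proposed above could look roughly like the following sketch. It is not taken from PHPUnuhi: `inBatches`, `translateInBatches`, and the two callables are hypothetical, and `strtoupper` is only a placeholder for a real translation request. The point is that each batch is saved right after it is translated, so a later failure only loses the current batch.

```php
<?php

/**
 * Yields the entries of $source in chunks of at most $batchSize items,
 * preserving keys, without holding more than one chunk at a time.
 */
function inBatches(iterable $source, int $batchSize): \Generator
{
    if ($batchSize < 1) {
        throw new \InvalidArgumentException('Batch size must be at least 1.');
    }

    $batch = [];
    foreach ($source as $key => $text) {
        $batch[$key] = $text;
        if (count($batch) === $batchSize) {
            yield $batch;
            $batch = [];
        }
    }

    // Emit the final, possibly smaller batch.
    if ($batch !== []) {
        yield $batch;
    }
}

/**
 * Translates and saves batch by batch; returns the number of saved entries.
 *
 * $translate receives one batch and returns the translated batch.
 * $save persists one translated batch (e.g. one SQL statement per batch).
 */
function translateInBatches(
    iterable $source,
    int $batchSize,
    callable $translate,
    callable $save
): int {
    $saved = 0;
    foreach (inBatches($source, $batchSize) as $batch) {
        $translated = $translate($batch);
        // Persist immediately so earlier batches survive a later failure.
        $save($translated);
        $saved += count($translated);
    }

    return $saved;
}

// Usage with an in-memory "database" and a placeholder translator.
$source = ['p1.name' => 'Tisch', 'p2.name' => 'Stuhl', 'p3.name' => 'Lampe'];
$store = [];

$count = translateInBatches(
    $source,
    2,
    fn (array $batch): array => array_map('strtoupper', $batch),
    function (array $batch) use (&$store): void {
        $store += $batch;
    }
);
```

Because `inBatches` accepts any iterable, the source could itself be a generator that reads rows from the database page by page, so the full data set would never have to be in memory at once.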

timeo-schmidt commented 3 months ago

+1 Hi Moritz! Currently also looking into PHPUnuhi for our store. Someone in the Slack channel pointed this issue out and it seems like it could also pose a problem for us... we would also like to add translations for our customized products tables, which have >300k entries. Best regards, Timeo

momocode-de commented 2 months ago

@boxblinkracer Are you working on this topic? If not, we'll have to look for an alternative or program an optimized command ourselves.

boxblinkracer commented 2 months ago

Hi there, sorry for the late reply. I'm absolutely interested in improving it. I'm on vacation now, but I'll get back to you after it :)

thanks for raising the issue :)