cakephp / twig-view

Twig View for CakePHP
MIT License

i18n extract from .twig-files #91

Open fabian-mcfly opened 1 year ago

fabian-mcfly commented 1 year ago

I'm not quite sure if this belongs here or if it should be part of the cakephp repo.

Currently, CakePHP's i18n extract command only looks for and reads files with the .php extension (https://github.com/cakephp/cakephp/blob/8b5d4b65ca63478bb525042eb95071d6749b5c30/src/Command/I18nExtractCommand.php#L839).

This leads to missing translatable strings from .twig-files when using this twig-view-package.

Simply adding the .twig extension to the lookup is not sufficient, because the _extractTokens() method of that command uses PHP's token_get_all() function (see https://github.com/cakephp/cakephp/blob/8b5d4b65ca63478bb525042eb95071d6749b5c30/src/Command/I18nExtractCommand.php#L431), which only understands PHP source.
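
To illustrate why token_get_all() cannot be pointed at a Twig file as-is, here is a minimal sketch using only PHP's built-in tokenizer. The template string is an invented example; nothing here is taken from I18nExtractCommand itself.

```php
<?php
// A Twig snippet containing a translatable string. It has no PHP open tag.
$twigSource = "{{ __('Hello') }}";

$tokens = token_get_all($twigSource);

// Without a PHP open tag, the tokenizer treats the whole input as inline HTML,
// so the call to __() never shows up as a function-name token.
foreach ($tokens as $token) {
    if (is_array($token)) {
        echo token_name($token[0]), ': ', $token[1], PHP_EOL;
    }
}
```

The whole template comes back as a single T_INLINE_HTML token, which is why the existing extraction logic finds nothing in .twig files.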

The two main questions I have:

  1. How to parse the twig templates files and look for calls to the i18n functions?
  2. How to add that functionality to the command line? Should this package provide a standalone extract command? Would it be possible to extend cakephp's I18nExtractCommand?

fabian-mcfly commented 1 year ago

And since I don't just want to point my finger, I did some testing.

1. Using Twig directly

I tinkered a bit with Twig since there are already multiple token parsers that could give a good result. But it turns out that this might be a little bit too complicated.

$filePath = '/path/to/a/twig/file';
/** @var \Cake\TwigView\View\TwigView $view */
$view = $this->viewBuilder()->build();
$twig = $view->getTwig();

$templateWrapper = $twig->load($filePath);
$moduleNode = $twig->parse($twig->tokenize($templateWrapper->getSourceContext()));

function listFunctionCalls ($node, array &$list, $twig) {
  if (!$node) {
    return;
  }

  if ($node instanceof \Twig\Node\Expression\FunctionExpression) {
    $name = $node->getAttribute('name');
    if (in_array($name, ['__', '__d'])) {
      $parameters = [];
      $list[] = [
        'name' => $name,
        'parameters' => listFunctionParameters($node, $parameters),
      ];
    }
  }

  foreach ($node as $child) {
    listFunctionCalls($child, $list, $twig);
  }
}

function listFunctionParameters($node, &$arguments) {
  /** @var \Twig\Node\Node $child */
  foreach ($node as $child) {
    if ($child instanceof \Twig\Node\Expression\ConstantExpression) {
      $arguments[] = $child->getAttribute('value');
    }

    if ($child::class === 'Twig\Node\Node') {
      listFunctionParameters($child, $arguments);
    }
  }

  return $arguments;
}

$functions = [];
listFunctionCalls($moduleNode, $functions, $twig);

dd($functions);

(based on https://stackoverflow.com/questions/32614432/how-can-i-analyze-twig-templates-without-rendering-them)

A file with contents

{{ __('Übersetzung von "Erstellen"') }}
{{ __d('menus', 'Übersetzung von "Erstellen" in Domain "menus"') }}

will result in

[
  0 => [
    'name' => '__',
    'parameters' => [
      0 => 'Übersetzung von "Erstellen"',
    ],
  ],
  1 => [
    'name' => '__d',
    'parameters' => [
      0 => 'menus',
      1 => 'Übersetzung von "Erstellen" in Domain "menus"',
    ],
  ],
]

It will ignore parameters that use concatenation or other expressions, like {{ __d('menus', 'Übersetzung von "Erstellen" ' ~ variable ~ ' in Domain "menus"') }}. This could lead to issues and/or unexpected results.

fabian-mcfly commented 1 year ago

2. Overwriting the functions & loading the view

I also tried this approach where I use a special view class

class I18nView extends TwigView {
  protected array $functions = [
    '__',
    '__d',
  ];

  public function initialize (): void {
    parent::initialize();

    $twig = $this->getTwig();

    foreach ($this->functions as $functionName) {
      $twigFunction = new \Twig\TwigFunction($functionName, [$this, $functionName]);
      $twig->addFunction($twigFunction);
    }
  }

  public function __ (string $singular, ...$args) {
    //Do something with $singular
  }

  public function __d (string $domain, string $msg, ...$args) {
    //Do something with $domain and $msg
  }
}

Those methods could be used to remember all passed arguments so that another method, used in the extract command, could access them.

I first thought that would be a smart approach, but it has too many potential issues.

fabian-mcfly commented 1 year ago

3. Don't care

You all got your pitchforks and torches? Get the file contents and just replace {{ and {% with <?php and do the same for the closing tags. After that, just run token_get_all like before. It works! :D (I'm sorry)

For those who dare: https://onlinephp.io/c/fca0b
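
For readers who do not want to follow the link, the idea can be sketched roughly as follows. This is not the code behind the link above. The helper names twigToPseudoPhp() and extractCalls() are made up for illustration, and the replacement is deliberately naive: a }} inside a string or a nested hash literal would break it, and escape sequences inside the strings are not decoded. Calls whose arguments are not plain string literals are skipped.

```php
<?php
// Hypothetical helper: rewrite Twig delimiters as PHP tags so that
// token_get_all() sees the template's expressions as PHP code.
function twigToPseudoPhp(string $source): string
{
    return strtr($source, [
        '{{' => '<?php ',
        '{%' => '<?php ',
        '}}' => ' ?>',
        '%}' => ' ?>',
    ]);
}

// Hypothetical helper: collect calls to the given functions whose
// arguments are all plain string literals.
function extractCalls(string $source, array $functions = ['__', '__d']): array
{
    // Drop whitespace tokens up front so the scan only sees meaningful tokens.
    $tokens = array_values(array_filter(
        token_get_all(twigToPseudoPhp($source)),
        fn ($token) => !is_array($token) || $token[0] !== T_WHITESPACE
    ));

    $calls = [];
    foreach ($tokens as $i => $token) {
        $isCall = is_array($token)
            && $token[0] === T_STRING
            && in_array($token[1], $functions, true)
            && ($tokens[$i + 1] ?? null) === '(';
        if (!$isCall) {
            continue;
        }

        $parameters = [];
        for ($j = $i + 2; isset($tokens[$j]) && $tokens[$j] !== ')'; $j++) {
            $argument = $tokens[$j];
            $isLiteral = is_array($argument)
                && $argument[0] === T_CONSTANT_ENCAPSED_STRING
                && in_array($tokens[$j + 1] ?? null, [',', ')'], true);
            if ($isLiteral) {
                // Strip the surrounding quotes only.
                $parameters[] = substr($argument[1], 1, -1);
            } elseif ($argument !== ',') {
                // Variables, concatenation, nested calls: skip the whole call.
                continue 2;
            }
        }
        $calls[] = ['name' => $token[1], 'parameters' => $parameters];
    }

    return $calls;
}

$template = <<<'TWIG'
{{ __('Übersetzung von "Erstellen"') }}
{{ __d('menus', 'Übersetzung von "Erstellen" in Domain "menus"') }}
TWIG;

print_r(extractCalls($template));
```

For the two-line sample template this should produce the same two entries as the Twig-based approach in the first experiment, but it relies on Twig expressions happening to tokenize like PHP, which is the reason for the pitchforks.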

markstory commented 1 year ago

Thanks for putting together a proposal and doing the homework. It is greatly appreciated. 👏 While the tokenization and parsing solution is complex, it is the most durable long-term solution, as Twig's tokenization API is quite stable. If you put together a pull request with how far you've gotten, we can add tests and get a collection of usage scenarios with tested support. If we miss a scenario in the future, it can be fixed without regressions.

markstory commented 1 year ago

Just noticed I didn't answer all your questions.

I'm not quite sure if this belongs here or if it should be part of the cakephp repo.

I think this repository is a better fit as it has the twig dependency. I think having a separate command name is reasonable given this is a plugin.

ADmad commented 1 year ago

It will ignore parameters that use concatenation or other expression..

This limitation exists for the parsing of PHP templates too, so I don't think it's a problem. Accounting for expressions would be too complicated.

fabian-mcfly commented 1 year ago

Oh thank you! 😄 I will try to provide a PR!

I came across an issue when using a standalone extract command: it's very likely that a domain.pot-file already exists (created by the main i18n extract-command). Simply overwriting it will make people lose translatable strings from their controllers. Merging/appending isn't really an option, is it?

In my head, the best case scenario would be to have the base command offer some way to add additional parser classes, which would return items to be used in _addTranslation. This could be used by other plugins to provide extractors for other filetypes as well. Does something similar exist anywhere in cakephp already?
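
To make the idea concrete, such an extension point might look roughly like the sketch below. Everything in it is hypothetical: TranslationExtractorInterface, TwigExtractor and ExtractorRegistry do not exist in cakephp or twig-view, and the regex only stands in for real Twig parsing.

```php
<?php
// Hypothetical contract: one implementation per file type.
interface TranslationExtractorInterface
{
    public function supports(string $path): bool;

    /** @return array<int, array{domain: string, singular: string, file: string}> */
    public function extract(string $path, string $contents): array;
}

// Placeholder Twig extractor; a real one would walk Twig's node tree.
final class TwigExtractor implements TranslationExtractorInterface
{
    public function supports(string $path): bool
    {
        return pathinfo($path, PATHINFO_EXTENSION) === 'twig';
    }

    public function extract(string $path, string $contents): array
    {
        // Only matches __('...') with a single-quoted literal argument.
        preg_match_all("/\\b__\\(\\s*'([^']*)'\\s*\\)/", $contents, $matches);

        return array_map(
            fn (string $singular) => ['domain' => 'default', 'singular' => $singular, 'file' => $path],
            $matches[1]
        );
    }
}

// Hypothetical registry the base command could consult for each file.
final class ExtractorRegistry
{
    /** @var TranslationExtractorInterface[] */
    private array $extractors = [];

    public function add(TranslationExtractorInterface $extractor): void
    {
        $this->extractors[] = $extractor;
    }

    public function extract(string $path, string $contents): array
    {
        foreach ($this->extractors as $extractor) {
            if ($extractor->supports($path)) {
                return $extractor->extract($path, $contents);
            }
        }

        return [];
    }
}

$registry = new ExtractorRegistry();
$registry->add(new TwigExtractor());
$items = $registry->extract('templates/Pages/home.twig', "{{ __('Save') }}");
print_r($items);
```

The base command would then pass each returned item on to something like _addTranslation, and plugins for other file types would only need to register their own extractor.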

markstory commented 1 year ago

Simply overwriting it will make people lose translatable strings from their controllers. Merging/appending isn't really an option, is it?

Merging pot files should be possible as message strings act as identifiers.
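
As a rough sketch of what merging by message string could look like, the function below combines two sets of already-parsed entries keyed by msgid. The entry shape and the name mergeMessages() are assumptions made for this example, not CakePHP APIs. Reading and writing the actual .pot file is left out, as are plural forms and msgctxt, which a real merge would also need to take into account.

```php
<?php
/**
 * Hypothetical helper: merge extracted messages into existing ones.
 * Each entry is assumed to look like
 * ['msgid' => string, 'references' => string[]].
 */
function mergeMessages(array $existing, array $extracted): array
{
    $merged = [];
    foreach ([$existing, $extracted] as $set) {
        foreach ($set as $entry) {
            $id = $entry['msgid'];
            // The msgid is the identifier: the same string from two
            // sources becomes one entry with combined references.
            $merged[$id]['msgid'] = $id;
            $merged[$id]['references'] = array_values(array_unique(array_merge(
                $merged[$id]['references'] ?? [],
                $entry['references']
            )));
        }
    }

    return array_values($merged);
}

$fromPhp = [
    ['msgid' => 'Save', 'references' => ['src/Controller/ArticlesController.php:42']],
];
$fromTwig = [
    ['msgid' => 'Save', 'references' => ['templates/Articles/edit.twig:10']],
    ['msgid' => 'Cancel', 'references' => ['templates/Articles/edit.twig:11']],
];

print_r(mergeMessages($fromPhp, $fromTwig));
```

Strings that no longer occur in any source file stay in the merged result, since nothing in this approach can tell that they are unused.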

This could be used by other plugins to provide extractors for other filetypes as well. Does something similar exist anywhere in cakephp already?

Not yet.

fabian-mcfly commented 1 year ago

Merging pot files should be possible as message strings act as identifiers.

It could lead to old, unused strings still being part of the *.pot-files. Should I care about that?

markstory commented 1 year ago

It could lead to old, unused strings still being part of the *.pot-files. Should I care about that?

I don't think so. Users can remove old pot strings manually, or regenerate the pot files from scratch if they want to remove old content.