thunderer / Shortcode

Advanced shortcode (BBCode) parser and engine for PHP
http://kowalczyk.cc
MIT License

UTF-8 unicode problem with PHP 8.3 #112

Closed vmario89 closed 4 months ago

vmario89 commented 5 months ago

Dear Shortcode developers, since I upgraded to the latest release of PHP 8.3, my Grav instance no longer renders some pages. At first I thought the problem belonged to the markdown-collapsible plugin, which triggers it, but its developer says it comes from Shortcode.

Could you maybe have a look at https://github.com/X-Ryl669/grav-plugin-markdown-collapsible/issues/8#issuecomment-2143991948 ?

For umlauts like äüö it fails with PCRE error value 4, because preg_match does not seem to use /u Unicode support. Is there an option or a safe way to implement this?

The alternative is to write ü as an escaped character, but writing that way just to avoid these rendering problems is not convenient :/

regards, Mario
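For context on the error number: PCRE error value 4 corresponds to PHP's PREG_BAD_UTF8_ERROR constant, which preg_last_error() reports when a pattern that uses the /u modifier is run against a subject that is not valid UTF-8. The snippet below is a minimal sketch of that behaviour in isolation; it is not taken from Shortcode or Grav, and the helper name and sample strings are assumptions made for illustration.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration only: runs a /u pattern against
// $subject and returns the PCRE error code recorded by that call.
function pcreErrorFor(string $subject): int
{
    preg_match('~\w+~u', $subject);

    return preg_last_error();
}

// "Übersicht" as valid UTF-8 ("\u{00DC}" is the two-byte sequence for "Ü").
$utf8 = "\u{00DC}bersicht";

// The same word with "Ü" as the single byte 0xDC (ISO-8859-1),
// which is not a valid UTF-8 sequence.
$latin1 = "\xDCbersicht";

var_dump(pcreErrorFor($utf8) === PREG_NO_ERROR);         // valid UTF-8: no error
var_dump(pcreErrorFor($latin1) === PREG_BAD_UTF8_ERROR); // malformed UTF-8: error 4
var_dump(PREG_BAD_UTF8_ERROR);                           // the constant's integer value
```

If this is what happens here, the page content (or an intermediate string built from it) may not be valid UTF-8 at the point where the pattern runs, which would be worth checking alongside the pattern itself.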

thunderer commented 5 months ago

@vmario89 thank you for reporting the issue, I read into the linked Grav issue but I don't see you using shortcodes there? Can you share a minimal text and expected result to reproduce the issue? Thanks!

vmario89 commented 5 months ago

Hi, the shortcode is the collapsible field starting with !> and ending with !@. The content in between contains umlauts and does not render; instead it fails with the error. The error is thrown in a class/method of Shortcode, which is in user/plugins/shortcode-core/vendor/thunderer/shortcode/src/Parser/RegularParser.php

!> Übersicht anzeigen ...
| Nr | Name |
| --- | --- |
| 17  | Gerhard Werner |
!@

thunderer commented 5 months ago

@vmario89 I can't reproduce the PCRE error on PHP 8.3 with the latest Shortcode version; it would help if you could provide a minimal script for that. Please check the RegexBuilderUtility::buildNameRegex() method, though, and try replacing the regex inside with just \w+. Parsers restrict the valid shortcode name to [a-zA-Z0-9-_\\*]+, and that excludes Übersicht. If you run the script below unmodified, it should not apply any changes (the shortcode will not be detected because of the name restriction); after the replacement, it will be processed as you expect (the result is yes!). I also replied in the Grav repository asking @X-Ryl669 for a small reproduction script, as I'm still not sure how Shortcode is set up there and what syntax is used.

<?php
declare(strict_types=1);
namespace X;

use Thunder\Shortcode\HandlerContainer\HandlerContainer;
use Thunder\Shortcode\Parser\RegularParser;
use Thunder\Shortcode\Processor\Processor;
use Thunder\Shortcode\Shortcode\ProcessedShortcode;

require __DIR__.'/vendor/autoload.php';

$handlers = new HandlerContainer();
$handlers->add('Übersicht', fn(ProcessedShortcode $shortcode) => 'yes!');
$processor = new Processor(new RegularParser(), $handlers);
var_dump($processor->process('[Übersicht]'));
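
The name restriction can also be checked in isolation, without the library. The following is a minimal sketch that only compares the two name patterns quoted above; the helper function, the ^...$ anchors, and the /u modifier are assumptions made for this illustration and do not necessarily match how RegularParser builds or applies its expression.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration only; not part of the Shortcode library.
function isValidName(string $nameRegex, string $name): bool
{
    // The anchors and the /u modifier are assumptions of this sketch.
    return preg_match('~^(?:'.$nameRegex.')$~u', $name) === 1;
}

$restricted = '[a-zA-Z0-9-_\\*]+'; // name pattern quoted in the comment above
$relaxed = '\\w+';                 // the suggested replacement

$name = "\u{00DC}bersicht";        // "Übersicht" as valid UTF-8

var_dump(isValidName($restricted, $name));         // bool(false): "Ü" is outside the class
var_dump(isValidName($relaxed, $name));            // bool(true): with /u, \w matches "Ü"
var_dump(isValidName($restricted, 'collapsible')); // bool(true): plain ASCII name
```

Note that \w only matches non-ASCII letters such as Ü reliably when the pattern is compiled with the /u modifier, so the replacement is assumed to be used in a Unicode-aware pattern.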

thunderer commented 4 months ago

@vmario89 were you able to reproduce the issue in a script you can share? I'm afraid I can't help you without that, everything works fine on my end. Let me know if you have any new information, otherwise I'll close the issue in the near future.

vmario89 commented 4 months ago

Hey. I am sorry, I have not been able to find the time yet, and I don't know how to do it. I am not a PHP developer; I am a web admin and content maintainer. It's hard for me to think through all the edge cases, and at the moment I have no clue how to make a proper test script in the right place :-( I will need to close this then.