cocur / slugify

Converts a string to a slug. Includes integrations for Symfony, Silex, Laravel, Zend Framework 2, Twig, Nette and Latte.
MIT License

Problem with combining characters #275

Closed: lostfocus closed this issue 2 years ago

lostfocus commented 3 years ago

While debugging #274 I ran into an odd problem. I got a string from a source that looked like a "normal" Unicode string with two German umlauts, but it turned out to use combining characters: each umlaut was stored as a base letter followed by a combining diaeresis instead of a single precomposed character. (This StackOverflow question sent me down the right path in the end: Strange umlaut encoding on file system.)

I'm not too sure if it's within the scope of this library to handle these kinds of characters, but I thought I'd put this information here anyway, in case someone else has the problem.
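
For readers unfamiliar with the distinction, here is a minimal sketch of the two encodings of the same visible word. The sample word and variable names are illustrative assumptions, not taken from the original report.

```php
<?php
// Precomposed form (NFC): "ü" is the single code point U+00FC.
$composed = "Brosch\u{00FC}re";

// Decomposed form (NFD): "u" followed by U+0308 COMBINING DIAERESIS.
$decomposed = "Broschu\u{0308}re";

// Both render as "Broschüre", yet the byte sequences differ.
var_dump($composed === $decomposed); // bool(false)

// U+0308 is encoded in UTF-8 as the bytes CC 88.
echo bin2hex($composed), "\n";
echo bin2hex($decomposed), "\n";

// The decomposed form is one byte longer in UTF-8.
echo strlen($composed), " vs ", strlen($decomposed), " bytes\n"; // 10 vs 11 bytes
```

Any transliteration rule keyed on the precomposed "ü" will not match the decomposed sequence, which is consistent with the behaviour described above.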

ausi commented 3 years ago

If you have the intl extension installed on your system, you can use the Normalizer class to fix that issue by normalizing the string before slugifying it:

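// Normalize to NFC first so each "a" + U+0308 pair becomes a precomposed "ä".
$this->slugify->slugify(
    \Normalizer::normalize(
        "Erla\xCC\x88uterungsbroschu\xCC\x88re",
        \Normalizer::FORM_C // the default form, stated explicitly
    )
);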
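
A slightly more defensive variant might guard against a missing intl extension and against `Normalizer::normalize()` returning `false` on failure. The helper name `toNfc` is hypothetical and not part of the library; this is only a sketch of one way to wrap the call.

```php
<?php
// Hypothetical helper: compose a string to NFC when ext-intl is available.
function toNfc(string $text): string
{
    // Without the intl extension there is no Normalizer class; return the input unchanged.
    if (!class_exists(\Normalizer::class)) {
        return $text;
    }

    // normalize() returns false on failure (for example, on invalid UTF-8).
    $normalized = \Normalizer::normalize($text, \Normalizer::FORM_C);

    return $normalized === false ? $text : $normalized;
}

$input = "Erla\xCC\x88uterungsbroschu\xCC\x88re";

// With intl installed, the CC 88 byte pairs disappear from the hex dump.
echo bin2hex(toNfc($input)), "\n";
```

The result of `toNfc($input)` could then be passed to `slugify()` in place of the raw string.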

florianeckerstorfer commented 3 years ago

@lostfocus Interesting. I think it should be possible to add the German umlauts written with combining characters to the ruleset, and that should work, but I have not tried it.
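
The rule-based idea can be illustrated without the library, using plain `strtr()`. The map below is a hypothetical example, not the library's actual ruleset, and it assumes the usual German ae/oe/ue transliteration. Whether the library's own rule mechanism handles such multi-character keys was not tested in this thread.

```php
<?php
// Hypothetical rule map: base letter + U+0308 COMBINING DIAERESIS => transliteration.
$combiningRules = [
    "a\u{0308}" => 'ae',
    "o\u{0308}" => 'oe',
    "u\u{0308}" => 'ue',
    "A\u{0308}" => 'Ae',
    "O\u{0308}" => 'Oe',
    "U\u{0308}" => 'Ue',
];

// The sample string from the thread, with decomposed umlauts.
$input = "Erla\xCC\x88uterungsbroschu\xCC\x88re";

// strtr() with an array replaces the longest matching keys first,
// so each two-code-point sequence is replaced as one unit.
$replaced = strtr($input, $combiningRules);

echo $replaced, "\n"; // Erlaeuterungsbroschuere
```

Unlike the Normalizer approach, this does not require the intl extension, but it only covers the sequences listed explicitly in the map.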