php-gettext / Gettext

PHP library to collect and manipulate gettext (.po, .mo, .php, .json, etc)
MIT License
687 stars 134 forks source link

Fix UTF-8 handling for VueJs scanner. #242

Closed briedis closed 4 years ago

briedis commented 4 years ago

DOMDocument does not handle UTF8 well without using htmlentities() or prepending XM encoding tag.

Added a fix and a test.

MarekGogol commented 4 years ago

coincidence? #241 :)

briedis commented 4 years ago

@MarekGogol goddamnit. We just noticed this issue on Friday.

What do you think if this solution (prepending XML encoding tag)?

Calling htmlentities to convert characters seems a bit risky, not sure I trust PHP enough with encoding conversions (bad experience). Ran some benchmarks - xml tag approach is a tiny fraction faster too.

MarekGogol commented 4 years ago

I think your solution is good enough. VueJs template may not consist of <xml> tag, as you sad...

To be honest, I am not 100% sure which one is better. I just brought solution from link below. But your solution might be faster, so we can apply yours... https://stackoverflow.com/questions/8218230/php-domdocument-loadhtml-not-encoding-utf-8-correctly?answertab=active#tab-top

We should wait for the author of package, and we will see what next.

briedis commented 4 years ago

@MarekGogol Yeah, that is the same SO post I was looking at 😂

I'm really happy that people are using the vue scanner and even contributing. Was not expecting that! (I was the author for vue scanner code)

briedis commented 4 years ago

@oscarotero sorry to bother again, but without the fix package is unusable in production. 😬

MarekGogol commented 4 years ago

@oscarotero sorry to bother again, but without the fix package is unusable in production. 😬

So thanks @briedis ! VueJs is one of my favorite scanner :). I just didn't realize for one year that all my existing translations from VueJs templates in client projects are crap :D (nobody told me). This week I was working on my own project, and I find it out 😄

image

oscarotero commented 4 years ago

Ok. I have a package for that (https://github.com/oscarotero/html-parser/blob/master/src/Parser.php) but wouldn't like to add a dependency for this (maybe in v5, because we have a different package for each scanner). I'm merging this.