Closed asmecher closed 3 years ago
Hi @asmecher! Thank you for that bug report. It's really strange which strange issues can occur. Thanks also for your code snipped from smarty-php, that helped me a lot.
Unfortunately is very hard to test this issue, since I cannot just change my system locale...
In order to test if my fix really works, please checkout the branch issue-93
and run the unit tests on the system with Turkish locales.
Thanks, @seboettg! Yes, this is a nasty one and pops up all over the place (see also https://bugs.php.net/bug.php?id=18556). I feel sorry for Turkish developers who run into this stuff while trying to learn!
I tried to run the tests, but I'm getting a lot of test failures (on the master
branch also) of the form:
TypeError: Argument 1 passed to Seboettg\Collection\ArrayList::__construct() must be of the type array, string given, called in /home/asmecher/git/citeproc-php/src/Rendering/Name/Names.php on line 148
TypeError: Argument 1 passed to Seboettg\Collection\ArrayList::__construct() must be of the type array, object given, called in /home/asmecher/git/citeproc-php/src/CiteProc.php on line 173
Maybe this is fixed elsewhere? (PHPUnit 8.5.2, PHP 7.4.3.)
I suspect you can get the testing to run under the Turkish locale by following these steps (Linux):
/etc/locale.gen
and add/uncomment the line: tr_TR.UTF-8 UTF-8
sudo locale-gen
php -r "setlocale(LC_ALL, 'tr_TR.UTF-8'); echo ucfirst('i');"
(This should yield a lower case i
instead of a capital I
like you expect)tests/bootstrap.php
and add: setlocale(LC_ALL, 'tr_TR.UTF-8');
I suspect you could add a new test that incorporates the setlocale
call, but checks the system ucfirst
under tr_TR.UTF-8
to see whether the weird behaviour exists. If it doesn't, then probably the tr_TR
locale isn't enabled in the OS (steps 1 and 2) and the test should be skipped.
Thank you @asmecher I use MacOS and unfortunately I have no other computer, except a raspberry pi with a minimal setup and without PHP. I tried to change the locale on my system, but it doesn't work:
$ php -r "setlocale(LC_ALL, 'tr_TR.UTF-8'); echo ucfirst('i');"
I%
But I think the unit tests might work when you run composer update
before. After that you can just run composer test
.
...
Time: 22.03 seconds, Memory: 28.00 MB
OK (199 tests, 429 assertions)
Thanks, that got it. The master
branch tests fine, but adding the setlocale
to bootstrap.php
breaks it as expected. Running the tests on issue-93
still shows some errors, however:
PHPUnit test output from issue-93 with setlocale
to tr_TR.UTF-8
Ah, ok! You must run composer install
or composer dump-autoload
after switching the branch to rebuild the autoload files.
Gotcha, but the log file linked above still shows some locale-specific assumptions that Turkish breaks. See e.g. src/Util/NumberHelper.php
, where strtoupper
is used on content containing roman numerals, and thus i
supplied as input won't get converted to I
:
1) Seboettg\CiteProc\Rendering\NumberTest::testRomanInputSingleNumber
A non-numeric value encountered
/home/asmecher/git/citeproc-php/src/Rendering/Number.php:146
/home/asmecher/git/citeproc-php/src/Rendering/Number.php:95
/home/asmecher/git/citeproc-php/tests/src/Rendering/NumberTest.php:130
2) Seboettg\CiteProc\Rendering\NumberTest::testRomanInputRangeNumber
A non-numeric value encountered
/home/asmecher/git/citeproc-php/src/Rendering/Number.php:146
/home/asmecher/git/citeproc-php/src/Rendering/Number.php:95
/home/asmecher/git/citeproc-php/tests/src/Rendering/NumberTest.php:137
3) Seboettg\CiteProc\Util\NumberHelperTest::testRoman2Dec
Undefined index: i
/home/asmecher/git/citeproc-php/src/Util/NumberHelper.php:101
/home/asmecher/git/citeproc-php/tests/src/Util/NumberHelperTest.php:35
In NumberHelper
strlen
and strtoupper
was used. I replaced these functions with mb_strlen
and mb_strtoupper
. I hope this fix will work.
Very close!
There were 2 errors:
1) Seboettg\CiteProc\Rendering\NumberTest::testRomanInputSingleNumber
A non-numeric value encountered
/home/asmecher/git/citeproc-php/src/Rendering/Number.php:146
/home/asmecher/git/citeproc-php/src/Rendering/Number.php:95
/home/asmecher/git/citeproc-php/tests/src/Rendering/NumberTest.php:130
2) Seboettg\CiteProc\Rendering\NumberTest::testRomanInputRangeNumber
A non-numeric value encountered
/home/asmecher/git/citeproc-php/src/Rendering/Number.php:146
/home/asmecher/git/citeproc-php/src/Rendering/Number.php:95
/home/asmecher/git/citeproc-php/tests/src/Rendering/NumberTest.php:137
--
There was 1 failure:
1) Seboettg\CiteProc\Util\NumberHelperTest::testIsRomanNumber
Failed asserting that false is true.
/home/asmecher/git/citeproc-php/tests/src/Util/NumberHelperTest.php:41
Okay... once again, please! 👍
Down to 1!
There was 1 error:
1) Seboettg\CiteProc\Rendering\NumberTest::testRomanInputRangeNumber
A non-numeric value encountered
/home/asmecher/git/citeproc-php/src/Rendering/Number.php:146
/home/asmecher/git/citeproc-php/src/Rendering/Number.php:95
/home/asmecher/git/citeproc-php/tests/src/Rendering/NumberTest.php:137
Oh no! I'm so sorry to have to bother you with these tests.
Ok, I have now set up my raspi so that I can do these tests by myself.
I figured out, that there seems to be a problem with case insensitive regular expressions...
/^([ivxlcdm]+\.*)\s*([*\–\-&+,;])\s*([ivxlcdm]+\.?)$/i
For example this pattern does not work with Turkish locale.
I guess all issues are solved now. @asmecher can you confirm that?
Tests pass!
OK (199 tests, 429 assertions)
Ok, version 2.2.2 is released and contains these fixes. I will close this ticket. If you think additional doings are necessary, feel free to reopen this bug report.
Thanks for your excellent work as always, @seboettg!
Please follow the general troubleshooting steps first:
Bug reports:
Factory.inc.php
usesucfirst
to assemble a constraint class name for inclusion. However, it uses theucfirst
function to do so, and upper/lower case behaviour in the current locale may not match the code's expectations.For example, when using the
tr_TR
locale,ucfirst('i')
yields"i"
, leading to:(This is a common Turkish locale gotcha; see e.g. https://github.com/smarty-php/smarty/issues/155 and the proposed fix at https://github.com/smarty-php/smarty/pull/586. I have not checked elsewhere to see if other locale-sensitive functions are used in this way in this library.)
Originally reported by an OJS end user: https://forum.pkp.sfu.ca/t/articles-in-the-second-locale-language-do-not-show/62967
Used CSL stylesheet:
(Any)
Used CSL metadata
(Any)