craftcms / cms

Build bespoke content experiences with Craft.
https://craftcms.com
Other
3.28k stars 635 forks source link

[4.x]: Php Intl extension using \u202f character in pattern string, and causing some unexpected behavior. #13381

Closed myleshyson closed 1 year ago

myleshyson commented 1 year ago

What happened?

Description

This is a really weird one that took a couple days to figure out, and isn't even a bug with Craft itself per se, but something I think worth addressing.

The php-intl extension, version 72.1, uses the unicode character \u202f in the IntlDateFormatter::pattern string. This caused us a really weird issue when using the time picker field, since all of those time picker options are formatted based on that string (see CpAsset::633).

For our project, I was taking the value of the timepicker field and passing it into DateTimeHelper::toDateTime(), and it was returning false because date formatters don't know what do do with a string like "2:00\u202fAM". This problem was only happening on our stage and prod servers, and turns out they were both using php-intl:72.1, and not happening on our local and test servers which where running php-intl:71.1 and thus using non-unicode pattern strings like "2:00 AM".

I fixed our side of it by just removing all non-ascii characters in the post body when saving our custom element. However I'm wondering if other people are experiencing this issue as well and not even realizing it.

Not sure where else that IntlDateFormatter is being used in the project, but what do you think about a "fix" to Locale::_getDateTimeFormat by replacing any unicode space character with a normal space?

Again, it's a super weird issue, and because it's a php-extension change I don't even know if it's technically a bug on their end, or if they really meant to use that unicode character instead of a space. I'm posting the problem here because that character is in the jquery timepicker and datepicker options now, and thus the datetime field, and because of that if people are expecting non-unicode strings to work with from those form inputs, they're gonna run into a bad time (like I did lol).

In any case figured ya'll would wanna know about it.

Steps to reproduce

  1. Include the intl php extension, and make sure its version "72.1"
  2. Instantiate a new IntlDateFormatter instance, for example
    $dateFormatter = new IntlDateFormatter('en_US', -1, 3);
  3. Echo out the json encoded format string.
    echo json_encode($dateFormatter->getPattern());
  4. See that it's h:mm\u202fa.
  5. Instantiate a date time picker field, look at the time options in html, and if you copy one of the option values into a unicode tester online, you'll see it's got that \u202f character in it.

Craft CMS version

4.4.14

PHP version

Stage: 8.1.20, Prod: 8.2.6, Test: 8.1.20, Local: 8.1.7

Operating system and version

No response

Database type and version

No response

Image driver and version

No response

Installed plugins and versions

-

myleshyson commented 1 year ago

Alright more context. I also opened up an issue on the php src repo to see if they will let strtotime support the NBBS character. here.

The ICU library updated to 72.1, and in that updated started using Unicode 15 and CLDR 42, with the narrow non-breaking space character (I think) being introduced via CLDR 42.

This issue is causing issues in other languages as well, like node.

brandonkelly commented 1 year ago

This caused us a really weird issue when using the time picker field

As in the built-in “Time” field type? Or something custom that renders a time input, e.g. using the time()/timeField() macros from the _includes/forms template, or Craft.ui.createTimeField()/createTimeInput()?

Just tested locally with a Time cusom field, and it seems to handle the u202f character without any issues.

For our project, I was taking the value of the timepicker field and passing it into DateTimeHelper::toDateTime(), and it was returning false because date formatters don't know what do do with a string like "2:00\u202fAM".

How exactly are you passing it to toDateTime()? If you pass it as a time key in an array, it should work.

DateTimeHelper::toDateTime(['time' => $value])
myleshyson commented 1 year ago

I am using a date array, but I'm doing a little massaging by forcing YYYY-MM-DD and HH:MM formats before passing it into toDateTime, by using strtotime. This was to try and fix a bug some people were experiencing where values in the datetime field wouldn't autosave. It only happened when date and string values skipped DateTimeHelper::_parseDate:811-817 and DateTimeHelper::_parseTime:857-859. This ended up not working for the time key though, because strtotime doesn't know how to deal with \u202f.

After some more digging, I realized that editors were probably typing or pasting in time values instead of selecting (if they needed something like 2:15 PM).

You can see this problem on a base craft install currently with php running intl:72.1.

If there's a space in the time string, DateTimeHelper returns false. If there's a \u202f character, it returns a datetime object. You can test it out by dumping the following.

// returns false
$date = DateTimeHelper::toDateTime(['time' => '12:00 PM']);
// returns a datetime object
$date = DateTimeHelper::toDateTime(['time' => '12:00 PM']);

And if you were to create a base craft install, with php-intl:72.1 running, create an entry with a datetime field, and paste each of those values into the time field, and hit ESC instead of TAB or clicking to hide the jquery timepicker options, you'll see that with the space character the field doesn't save the value, and with the NNBS it saves correctly.

Looks like the timepicker does reformat to use a \u202f character when clicking or hitting ENTER or TAB to close the time options. If you hit ESC to close those options, or do CMD+S before giving a chance to the timepicker to reformat to use \u202f, the field won't save correctly.

It was happening enough that I ended up doing the nuclear option, and just removing all unicode characters from the post body.

brandonkelly commented 1 year ago

And if you were to create a base craft install, with php-intl:72.1 running, create an entry with a datetime field, and paste each of those values into the time field, and hit ESC instead of TAB or clicking to hide the jquery timepicker options, you'll see that with the space character the field doesn't save the value, and with the NNBS it saves correctly.

I’m not able to reproduce that. No matter what I do, as soon as the input is blurred, the value will be reformatted to use a u202f character instead of a space.

https://github.com/craftcms/cms/assets/47792/c849a655-d558-4959-a3b7-0708ba2abcdd

myleshyson commented 1 year ago

Yea so the video you showed is what I said earlier.

Looks like the timepicker does reformat to use a \u202f character when clicking or hitting ENTER or TAB to close the time options.

You're pasting a time and then clicking outside the field to blur it. In that case the string will get reformatted to use \u202f.

However when hitting cmd+s before the field is blurred, OR hitting Esc and then cmd+s, the field saves incorrectly. I'm able to reproduce this consistently on a base craft install.

When I type a time and cmd+s without blurring the field, the wrong time is saved. https://github.com/craftcms/cms/assets/11747187/c26be37b-f7f2-4838-823a-0856f1aeb436

When I paste a time and hit cmd+s without blurring the field, the wrong time is saved. https://github.com/craftcms/cms/assets/11747187/b5ca462c-4428-4318-97f1-6f4bcdfddf72

When I type in a time and hit escape then cmd+s, the wrong time is saved. https://github.com/craftcms/cms/assets/11747187/bc476228-1fbe-4c81-87b9-1f2562c4a538

The actual problem here though is that DateTimeHelper::toDateTime returns false when a valid formatted time has a space in it instead of a \u202f character.

If there's a space in the time string, DateTimeHelper returns false. If there's a \u202f character, it returns a datetime object. You can test it out by dumping the following.

// returns false
$date = DateTimeHelper::toDateTime(['time' => '12:00 PM']);
// returns a datetime object
$date = DateTimeHelper::toDateTime(['time' => '12:00 PM']);
brandonkelly commented 1 year ago

However when hitting cmd+s before the field is blurred, OR hitting Esc and then cmd+s, the field saves incorrectly. I'm able to reproduce this consistently on a base craft install.

Ah ok, yeah I can reproduce that. Just added some code to defend against that, for the next release.

brandonkelly commented 1 year ago

Craft 4.4.16 is out with that fix. Thanks for the help!

myleshyson commented 1 year ago

Awesome! And no problem. Thanks for being willing to work with me on it. Definitely one of the more obscure bugs I've come across. PHP added a fix for 8.2.9 for strtotime. Not sure yet if that also applies to DateTime::createFromFormat().