Closed manuelpuyol closed 1 month ago
Hey @manuelpuyol, it's exciting to see this progress! I noticed you're sharing code differently than the original MS implementation, and I wanted to suggest a further slight tweak that will allow much more code to live in PlatformIntlShared.cpp
.
This is just something to consider as you're thinking about it, but we can move forward with this PR regardless of whether you adopt it.
The main idea is to have PlatformIntlShared
as a middle layer between the PlatformIntl
and PlatformIntlICU
. So instead of having PlatformIntlICU
call into PlatformIntlShared
for common dependencies, we have PlatformIntlShared
call into PlatformIntlICU
for platform specific code. That makes it easy to share the code, since a platform only has to implement a simple well defined API.
So using Collator
as an example, it would look roughly like:
#include "hermes/Platform/Intl/PlatformIntl.h"
class CollatorShared : public Collator {
public:
CollatorShared() = default;
vm::ExecutionStatus initialize(
vm::Runtime &runtime,
const std::vector<std::u16string> &locales,
const Options &options) noexcept;
// Just an example, the signature might need to be different.
void platformInitialize();
Options resolvedOptions() noexcept;
double compare(const std::u16string &x, const std::u16string &y) noexcept;
private:
std::u16string locale_;
std::u16string usage_;
std::u16string collation_;
std::u16string caseFirst_;
std::u16string sensitivity_;
bool numeric_;
bool ignorePunctuation_;
};
vm::ExecutionStatus CollatorApple::initialize(
vm::Runtime &runtime,
const std::vector<std::u16string> &locales,
const Options &options) noexcept {
// Do platform agnostic init first.
platformInitialize();
}
class CollatorICU : public CollatorShared {
public:
CollatorICU() = default;
// Just an example, the signature might need to be different.
void platformInitialize();
double compare(const std::u16string &x, const std::u16string &y) noexcept;
private:
// ICU specific fields
};
double CollatorShared::compare(
const std::u16string &x,
const std::u16string &y) noexcept {
return static_cast<CollatorICU *>(this)->compare(x, y);
}
void CollatorShared::platformInitialize() noexcept {
return static_cast<CollatorICU *>(this)->platformInitialize();
}
👋 @neildhar
I've been thinking more about the "Shared" implementation... If the goal is to migrate everything to ICU in the future, won't the Apple and Android implementations be deprecated? At that point having a shared library may be more burdensome that helpful since it does introduce extra code complexity. So maybe it's worth getting rid of PlatformIntlShared
and putting all that code back in PlatformIntlICU
.
However, if you feel like that's a distant future, I'd be happy to work on a better code sharing strategy, though I think that'd be more appropriate in a follow-up PR if that's ok for you :)
@manuelpuyol Of course, let's worry about that later
Hey @neildhar, to be honest Ijust decided to follow hermes-windows approach but don't have the full background on that decision.
But, given that the DateTimeFormat frals with char16_t
and ICU methods expect a char
, what would you suggest as the best way to handle that conversion?
@neildhar has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Hey @manuelpuyol, I just pulled this in and tried building, but it looks like it doesn't compile. Have you been able to test this on your machine?
@neildhar that's... interesting I can compile it locally and looks like circleci can too? What errors are you getting?
@manuelpuyol has updated the pull request. You must reimport the pull request before landing.
CircleCI doesn't currently build with HERMES_ENABLE_INTL
on linux, so it isn't actually building the new implementation. When I turn it on and build on linux, here are some of the errors I see:
hermes/lib/Platform/Intl/PlatformIntlICU.cpp:275:37: error: invalid conversion from ‘const short unsigned int*’ to ‘const char16_t*’ [-fpermissive]
auto upper = toASCIIUppercase(zoneId);
hermes/lib/Platform/Intl/PlatformIntlICU.cpp:1029:7: error: invalid conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<char16_t>, char16_t>::value_type*’ {aka ‘char16_t*’} to ‘const UChar*’ {aka ‘const short unsigned int*’} [-fpermissive]
&skeleton[0],
hermes/lib/Platform/Intl/PlatformIntlICU.cpp:1041:9: error: invalid conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<char16_t>, char16_t>::value_type*’ {aka ‘char16_t*’} to ‘const UChar*’ {aka ‘const short unsigned int*’} [-fpermissive]
&skeleton[0],
hermes/lib/Platform/Intl/PlatformIntlICU.cpp:1044:9: error: invalid conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<char16_t>, char16_t>::value_type*’ {aka ‘char16_t*’} to ‘UChar*’ {aka ‘short unsigned int*’} [-fpermissive]
&bestpattern[0],
hermes/lib/Platform/Intl/PlatformIntlICU.cpp:1060:9: error: invalid conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<char16_t>, char16_t>::value_type*’ {aka ‘char16_t*’} to ‘const UChar*’ {aka ‘const short unsigned int*’} [-fpermissive]
&bestpattern[0],
hmm I wonder if that's being caused by different ICU versions, do you know which version you are using to build?
Taking a closer look, I think this might be a problem of differing definitions of char16_t. This build was with GCC, are you building with Clang?
Looking at ICU's source, they did change the return type of unext
and technically Hermes supports both versions since it looks for ICU > 52:
Taking a closer look, I think this might be a problem of differing definitions of char16_t. This build was with GCC, are you building with Clang?
I'm using gcc
Also looks like the UChar
definition changed somewhere down the road too:
@manuelpuyol has updated the pull request. You must reimport the pull request before landing.
ok, looks like ICU lets us define the UChar
type to be less platform dependent. I pushed https://github.com/facebook/hermes/pull/1362/commits/49037cebca4e8fd48ce6e4a6ab5e516898756a04 which may solve the incompatibility we are seeing
@neildhar has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
I'm gonna take a look at the apple test! As for the test262:
$ utils/testsuite/run_testsuite.py --test-intl -b build/bin/ ~/test262/test/intl402/DateTimeFormat/
-----------------------------------
| Results | PASS |
|----------------------+----------|
| Total | 176 |
| Pass | 50 |
| Fail | 0 |
| Skipped | 125 |
| Permanently Skipped | 1 |
| Pass Rate | 100.00% |
-----------------------------------
| Failures | |
|----------------------+----------|
| Compile fail | 0 |
| Compile timeout | 0 |
| Execute fail | 0 |
| Execute timeout | 0 |
To start, I ran it against our date-time-format-apple.js test, and it looks like there might be a bug in handling dateStyle/timeStyle when only one of them is specified.
mind sharing a specific example I can take a look?
@manuelpuyol has updated the pull request. You must reimport the pull request before landing.
Sure, if you run LIT_FILTER=date-time-format-apple.js ninja check-hermes
, that should run that specific test and print failures.
@neildhar https://github.com/facebook/hermes/pull/1362/commits/7c578953e745b995a401d6cddd59562190e90509 should fix the case when timeStyle
or dateStyle
is missing!
@manuelpuyol has updated the pull request. You must reimport the pull request before landing.
@neildhar I'm noticing that date-time-format-apple.js
has some expectations that differ from Node's implementations, so I'm having a hard time deciding which results to follow or if some differences are acceptable 😅
Minor differences are inevitable, we should relax the match patterns slightly where necessary to accommodate those differences, since they likely just come up from different ICU versions being used.
We run this test using FileCheck
, which allows you to place a regex in the pattern (see here).
It should just be a matter of turning a few characters into wildcards. If the differences are more substantial, that may indicate a bug in one of the implementations.
just to keep track of it, these are the differences I'm seeing in that test:
const date = new Date(Date.UTC(2020, 0, 2, 3, 45, 00, 30));
const oldDate = new Date(Date.UTC(1952, 0, 9, 8, 04, 03, 05));
// apple and node: 3
// ICU: 03
new Intl.DateTimeFormat('en-GB', {second: '2-digit'}).format(oldDate)
// apple: 午前3:45
// ICU and node: 3:45
new Intl.DateTimeFormat('ja-JP', {hour: 'numeric', minute: 'numeric'}).format(date);
// apple: 午前3:45
// ICU and node: 03:45
new Intl.DateTimeFormat('ja-JP', {hour: '2-digit', minute: '2-digit'}).format(date);
// apple and ICU: zh
// node: zh-CN
new Intl.DateTimeFormat('zh-CN').resolvedOptions().locale
// apple and ICU: undefined
// node: latn
new Intl.DateTimeFormat('en-US').resolvedOptions().numberingSystem;
// apple: SGT
// node and ICU: Error Invalid time zone specified
new Intl.DateTimeFormat('en-US', { timeZone: 'SGT'}).resolvedOptions().timeZone
@neildhar if you are comfortable with those differences, I can push a change to the test :)
The Japanese date patterns seem potentially wrong. It looks like the DateTime pattern generation here doesn't quite follow the spec rules. For example, the 23 hour cycle should be using H
and HH
rather than k
and KK
.
@manuelpuyol has updated the pull request. You must reimport the pull request before landing.
indeed! I updated it and now ICU outputs the same as node for that case
The rest seem to be genuine differences in the defaults. We don't need to solve this right away though, given that the implementation is incomplete (so we can't run tests anyway).
@neildhar has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@manuelpuyol has updated the pull request. You must reimport the pull request before landing.
@neildhar has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@manuelpuyol has updated the pull request. You must reimport the pull request before landing.
@neildhar merged this pull request in facebook/hermes@042024c49348aabcc4b7fe5aad7c28411af74419.
Related to https://github.com/facebook/hermes/issues/1350 and https://github.com/facebook/hermes/discussions/1211 Follow up of https://github.com/facebook/hermes/pull/1361
Summary
I'd like to expand non iOS/android support of
Intl
. In this case, I'm implementing bothDateTimeFormat.prototype.resolvedOptions
andDateTimeFormat.prototype.format
. The implementation is a mix from the Apple and hermes-windows' implementations.PlatformIntlShared
with helper functions that are exactly the same between Apple and ICUDateTimeFormatICU
based onDateTimeFormatApple
Test Plan
Testing against
test262/test/intl402/DateTimeFormat