facebook / hermes

A JavaScript engine optimized for running React Native.
https://hermesengine.dev/
MIT License
9.41k stars 596 forks source link

Implement DateTimeFormat.format for non android/ios environments #1362

Closed manuelpuyol closed 1 month ago

manuelpuyol commented 1 month ago

Related to https://github.com/facebook/hermes/issues/1350 and https://github.com/facebook/hermes/discussions/1211 Follow up of https://github.com/facebook/hermes/pull/1361

Summary

I'd like to expand non iOS/android support of Intl. In this case, I'm implementing both DateTimeFormat.prototype.resolvedOptions and DateTimeFormat.prototype.format. The implementation is a mix from the Apple and hermes-windows' implementations.

Test Plan

Testing against test262/test/intl402/DateTimeFormat

neildhar commented 1 month ago

Hey @manuelpuyol, it's exciting to see this progress! I noticed you're sharing code differently than the original MS implementation, and I wanted to suggest a further slight tweak that will allow much more code to live in PlatformIntlShared.cpp.

This is just something to consider as you're thinking about it, but we can move forward with this PR regardless of whether you adopt it.

The main idea is to have PlatformIntlShared as a middle layer between the PlatformIntl and PlatformIntlICU. So instead of having PlatformIntlICU call into PlatformIntlShared for common dependencies, we have PlatformIntlShared call into PlatformIntlICU for platform specific code. That makes it easy to share the code, since a platform only has to implement a simple well defined API.

So using Collator as an example, it would look roughly like:

PlatformIntlShared.h:

#include "hermes/Platform/Intl/PlatformIntl.h"

class CollatorShared : public Collator {
 public:
  CollatorShared() = default;

  vm::ExecutionStatus initialize(
      vm::Runtime &runtime,
      const std::vector<std::u16string> &locales,
      const Options &options) noexcept;

  // Just an example, the signature might need to be different.
  void platformInitialize();

  Options resolvedOptions() noexcept;

  double compare(const std::u16string &x, const std::u16string &y) noexcept;

 private:
  std::u16string locale_;
  std::u16string usage_;
  std::u16string collation_;
  std::u16string caseFirst_;
  std::u16string sensitivity_;
  bool numeric_;
  bool ignorePunctuation_;
};

PlatformIntlShared.cpp

vm::ExecutionStatus CollatorApple::initialize(
    vm::Runtime &runtime,
    const std::vector<std::u16string> &locales,
    const Options &options) noexcept {
  // Do platform agnostic init first.
  platformInitialize();
}

PlatformIntlICU.cpp

class CollatorICU : public CollatorShared {
 public:
  CollatorICU() = default;

  // Just an example, the signature might need to be different.
  void platformInitialize();

  double compare(const std::u16string &x, const std::u16string &y) noexcept;

 private:
   // ICU specific fields
};

double CollatorShared::compare(
    const std::u16string &x,
    const std::u16string &y) noexcept {
  return static_cast<CollatorICU *>(this)->compare(x, y);
}

void CollatorShared::platformInitialize() noexcept {
  return static_cast<CollatorICU *>(this)->platformInitialize();
}
manuelpuyol commented 1 month ago

👋 @neildhar

I've been thinking more about the "Shared" implementation... If the goal is to migrate everything to ICU in the future, won't the Apple and Android implementations be deprecated? At that point having a shared library may be more burdensome that helpful since it does introduce extra code complexity. So maybe it's worth getting rid of PlatformIntlShared and putting all that code back in PlatformIntlICU.

However, if you feel like that's a distant future, I'd be happy to work on a better code sharing strategy, though I think that'd be more appropriate in a follow-up PR if that's ok for you :)

neildhar commented 1 month ago

@manuelpuyol Of course, let's worry about that later

manuelpuyol commented 1 month ago

Hey @neildhar, to be honest Ijust decided to follow hermes-windows approach but don't have the full background on that decision.

But, given that the DateTimeFormat frals with char16_t and ICU methods expect a char, what would you suggest as the best way to handle that conversion?

facebook-github-bot commented 1 month ago

@neildhar has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

neildhar commented 1 month ago

Hey @manuelpuyol, I just pulled this in and tried building, but it looks like it doesn't compile. Have you been able to test this on your machine?

manuelpuyol commented 1 month ago

@neildhar that's... interesting I can compile it locally and looks like circleci can too? What errors are you getting?

facebook-github-bot commented 1 month ago

@manuelpuyol has updated the pull request. You must reimport the pull request before landing.

neildhar commented 1 month ago

CircleCI doesn't currently build with HERMES_ENABLE_INTL on linux, so it isn't actually building the new implementation. When I turn it on and build on linux, here are some of the errors I see:

hermes/lib/Platform/Intl/PlatformIntlICU.cpp:275:37: error: invalid conversion from ‘const short unsigned int*’ to ‘const char16_t*’ [-fpermissive]
       auto upper = toASCIIUppercase(zoneId);
hermes/lib/Platform/Intl/PlatformIntlICU.cpp:1029:7: error: invalid conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<char16_t>, char16_t>::value_type*’ {aka ‘char16_t*’} to ‘const UChar*’ {aka ‘const short unsigned int*’} [-fpermissive]
       &skeleton[0],
hermes/lib/Platform/Intl/PlatformIntlICU.cpp:1041:9: error: invalid conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<char16_t>, char16_t>::value_type*’ {aka ‘char16_t*’} to ‘const UChar*’ {aka ‘const short unsigned int*’} [-fpermissive]
         &skeleton[0],
hermes/lib/Platform/Intl/PlatformIntlICU.cpp:1044:9: error: invalid conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<char16_t>, char16_t>::value_type*’ {aka ‘char16_t*’} to ‘UChar*’ {aka ‘short unsigned int*’} [-fpermissive]
         &bestpattern[0],
hermes/lib/Platform/Intl/PlatformIntlICU.cpp:1060:9: error: invalid conversion from ‘__gnu_cxx::__alloc_traits<std::allocator<char16_t>, char16_t>::value_type*’ {aka ‘char16_t*’} to ‘const UChar*’ {aka ‘const short unsigned int*’} [-fpermissive]
         &bestpattern[0],
manuelpuyol commented 1 month ago

hmm I wonder if that's being caused by different ICU versions, do you know which version you are using to build?

neildhar commented 1 month ago

Taking a closer look, I think this might be a problem of differing definitions of char16_t. This build was with GCC, are you building with Clang?

manuelpuyol commented 1 month ago

Looking at ICU's source, they did change the return type of unext and technically Hermes supports both versions since it looks for ICU > 52:

version 66: https://github.com/unicode-org/icu/blob/3ca5d8ca1559fb52d08bad61e350f0285f3edb37/icu4c/source/common/unicode/strenum.h#L157

version 57: https://github.com/unicode-org/icu/blob/b7c2034642ce01ba6ffb9418f26234efb2549d08/icu4c/source/common/unicode/strenum.h#L154

manuelpuyol commented 1 month ago

Taking a closer look, I think this might be a problem of differing definitions of char16_t. This build was with GCC, are you building with Clang?

I'm using gcc

manuelpuyol commented 1 month ago

Also looks like the UChar definition changed somewhere down the road too:

https://github.com/unicode-org/icu/blob/3ca5d8ca1559fb52d08bad61e350f0285f3edb37/icu4c/source/common/unicode/umachine.h#L360-L379

https://github.com/unicode-org/icu/blob/574e7d9d55760680ea14dbfc4908429a58c5d544/icu4c/source/common/unicode/umachine.h#L269-L279

facebook-github-bot commented 1 month ago

@manuelpuyol has updated the pull request. You must reimport the pull request before landing.

manuelpuyol commented 1 month ago

ok, looks like ICU lets us define the UChar type to be less platform dependent. I pushed https://github.com/facebook/hermes/pull/1362/commits/49037cebca4e8fd48ce6e4a6ab5e516898756a04 which may solve the incompatibility we are seeing

facebook-github-bot commented 1 month ago

@neildhar has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

manuelpuyol commented 1 month ago

I'm gonna take a look at the apple test! As for the test262:

$ utils/testsuite/run_testsuite.py --test-intl -b build/bin/ ~/test262/test/intl402/DateTimeFormat/

-----------------------------------
| Results              |   PASS   |
|----------------------+----------|
| Total                |      176 |
| Pass                 |       50 |
| Fail                 |        0 |
| Skipped              |      125 |
| Permanently Skipped  |        1 |
| Pass Rate            |  100.00% |
-----------------------------------
| Failures             |          |
|----------------------+----------|
| Compile fail         |        0 |
| Compile timeout      |        0 |
| Execute fail         |        0 |
| Execute timeout      |        0 |
manuelpuyol commented 1 month ago

To start, I ran it against our date-time-format-apple.js test, and it looks like there might be a bug in handling dateStyle/timeStyle when only one of them is specified.

mind sharing a specific example I can take a look?

facebook-github-bot commented 1 month ago

@manuelpuyol has updated the pull request. You must reimport the pull request before landing.

neildhar commented 1 month ago

Sure, if you run LIT_FILTER=date-time-format-apple.js ninja check-hermes, that should run that specific test and print failures.

manuelpuyol commented 1 month ago

@neildhar https://github.com/facebook/hermes/pull/1362/commits/7c578953e745b995a401d6cddd59562190e90509 should fix the case when timeStyle or dateStyle is missing!

facebook-github-bot commented 1 month ago

@manuelpuyol has updated the pull request. You must reimport the pull request before landing.

manuelpuyol commented 1 month ago

@neildhar I'm noticing that date-time-format-apple.js has some expectations that differ from Node's implementations, so I'm having a hard time deciding which results to follow or if some differences are acceptable 😅

neildhar commented 1 month ago

Minor differences are inevitable, we should relax the match patterns slightly where necessary to accommodate those differences, since they likely just come up from different ICU versions being used.

We run this test using FileCheck, which allows you to place a regex in the pattern (see here).

It should just be a matter of turning a few characters into wildcards. If the differences are more substantial, that may indicate a bug in one of the implementations.

manuelpuyol commented 1 month ago

just to keep track of it, these are the differences I'm seeing in that test:

const date = new Date(Date.UTC(2020, 0, 2, 3, 45, 00, 30));
const oldDate = new Date(Date.UTC(1952, 0, 9, 8, 04, 03, 05));
// apple and node: 3
// ICU: 03
new Intl.DateTimeFormat('en-GB', {second: '2-digit'}).format(oldDate)

// apple: 午前3:45
// ICU and node: 3:45
new Intl.DateTimeFormat('ja-JP', {hour: 'numeric', minute: 'numeric'}).format(date);

// apple: 午前3:45
// ICU and node: 03:45
new Intl.DateTimeFormat('ja-JP', {hour: '2-digit', minute: '2-digit'}).format(date);

// apple and ICU: zh
// node: zh-CN
new Intl.DateTimeFormat('zh-CN').resolvedOptions().locale

// apple and ICU: undefined
// node: latn
new Intl.DateTimeFormat('en-US').resolvedOptions().numberingSystem;

// apple: SGT
// node and ICU: Error Invalid time zone specified
new Intl.DateTimeFormat('en-US', { timeZone: 'SGT'}).resolvedOptions().timeZone
manuelpuyol commented 1 month ago

@neildhar if you are comfortable with those differences, I can push a change to the test :)

neildhar commented 1 month ago

The Japanese date patterns seem potentially wrong. It looks like the DateTime pattern generation here doesn't quite follow the spec rules. For example, the 23 hour cycle should be using H and HH rather than k and KK.

facebook-github-bot commented 1 month ago

@manuelpuyol has updated the pull request. You must reimport the pull request before landing.

manuelpuyol commented 1 month ago

indeed! I updated it and now ICU outputs the same as node for that case

neildhar commented 1 month ago

The rest seem to be genuine differences in the defaults. We don't need to solve this right away though, given that the implementation is incomplete (so we can't run tests anyway).

facebook-github-bot commented 1 month ago

@neildhar has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 1 month ago

@manuelpuyol has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot commented 1 month ago

@neildhar has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot commented 1 month ago

@manuelpuyol has updated the pull request. You must reimport the pull request before landing.

facebook-github-bot commented 1 month ago

@neildhar merged this pull request in facebook/hermes@042024c49348aabcc4b7fe5aad7c28411af74419.