Closed nojvek closed 7 years ago
If you don't mind, I'd like to make it a personal exercise to fix this bug with a PR. So far I've got chakra to build. Its amazing how easy it was. I see the baseline tests and I'll spend some time this weekend to get this fixed and make sure I don't break anything else.
This isn't a RegExp issue. It looks like we put LEFT-TO-RIGHT MARK (U+200E) around punctuation marks, but Chrome and Firefox don't.
On Edge:
var d = new Date("2016-03-17T21:43:10.645Z").toLocaleString('de');
d === "17.03.2016 14:43:10";
d[1] === "1";
d[2] === "7";
d.charCodeAt(3) === 8206;
d[4] === ".";
d.charCodeAt(5) === 8206;
d[6] === "0";
On Chrome:
var d = new Date("2016-03-17T21:43:10.645Z").toLocaleString('de');
d === "17.3.2016 14:43:10";
d[0] === "1";
d[1] === "7";
d[2] === ".";
d[3] === "3";
I don't know why we have the "LEFT-TO-RIGHT MARK (U+200E)" around punctuation marks though.
Yeah that's what I just noticed.
d = new Date().toLocaleString();
WScript.Echo(d);
$ ch dateLocale.js
?3?/?17?/?2016? ?2?:?56?:?45? ?PM
I'll change the title of the bug.
Could this be an issue in the windows GetDateFormatW ?
How do I start ch.exe on a file and break it so I can attach a debugger?
I don't see a debugging page.
// For Windows Vista and above GetDateFormatEx is preferred
WINBASEAPI
int
WINAPI
GetDateFormatW(
_In_ LCID Locale,
_In_ DWORD dwFlags,
_In_opt_ CONST SYSTEMTIME * lpDate,
_In_opt_ LPCWSTR lpFormat,
_Out_writes_opt_(cchDate) LPWSTR lpDateStr,
_In_ int cchDate
);
If you're using Visual Studio, you can just set the command line parameters for ch by right clicking on the ch project, and setting the appropriate field under the Debugging Configuration Property. Then, set your breakpoints in VS, hit F5 and have fun.
If you want to use another debugger, you can install the Windows Debugging Tools by following the instructions at https://msdn.microsoft.com/en-us/library/windows/hardware/ff551063(v=vs.85).aspx. You can then run windbg.exe ch.exe
From what I gather, it seems, some of the logic is in Intl.js. This is probably byte code generated and I don't seem to figure out where the implementation of formatDateTime is.
I got debugging to work. I added a 10 sec sleep so I could attach VS.
WScript.Echo(d.toLocaleString());
The breakpoint doesn't seem to hit GetDateLocaleString as I previously hypothesized.
DateImplementation::GetDateLocaleString(Js::YMD *pymd, TZD *ptzd, DateTimeFlag noDateTime,ScriptContext* scriptContext)
I'm just beginning to realize how complicated chakra can be.
you might want to break at JavascriptDate::EntryToLocaleDateString and continue from there.
According to the ECMAScript spec, the string returned by Date.prototype.toLocaleString
needs to be in a human-readable form, but the details are left to the implementation. We insert the BiDi control characters (e.g., LEFT-TO-RIGHT MARK (U+200E)) in order to preserve the order of characters when the string is shown in a UI element.
There is an ongoing discussion on how to align implementations and standardize the use of BiDi control characters in the spec. We've decided to postpone the work on this issue until an agreement is reached as part of the standardization discussion.
@nojvek, if I understand correctly, you're trying to get the localized time string. Would Date.prototype.toLocaleTimeString
help you achieve that?
@bterlson Any word on the TC39 Consensus aspect of this issue?
@dilijev No word, no progress, and it seems unlikely to make progress from my conversations with people.
https://developer.microsoft.com/en-us/microsoft-edge/platform/issues/11907540/ is a duplicate of this bug.
I've created a test example to address possibly related issues:
const dtOptions = {
timeZone: "America/New_York",
timeZoneName: "short",
hour12: false,
};
function dx() {
try {
console.log(`Version ${JSON.stringify(process.versions)}`);
const dt = new Date();
//const dsa = dt.toLocaleString('en-US', dtOptions);
const dsa = dt.toLocaleString('en-US', {hour12:false});
console.log(dsa.split('').map(d => d.charCodeAt(0).toString(16)).join(','));
const ds = dsa.replace(/\u200e/g,""); // work-around
console.log(ds.split('').map(d => d.charCodeAt(0).toString(16)).join(','));
console.log(`Date ${ds}`);
const dm = ds.match(/^(\d+)\/(\d+)\/(\d+),? ([\d:]+)/);
const dd = {y: dm[3], m: dm[1], d: dm[2], hms: dm[4]}
return dd;
}
catch(e) {
console.log(`Error ${e}`);
debugger;
}
}
console.log(dx());
toLocaleString puts a comma after the date in the Nodejs.org version but the Chakra version does not.
I see that there's a comma between the date and the time. Also allowed per spec, but the difference is noted.
> eshost -is test.js
## Source
const dtOptionsChakraBug = {
timeZone: "America/New_York",
timeZoneName: "short",
hour12: false,
};
const dtOptionsSimple = {
hour12: false,
};
function dx(dtOptions) {
try {
// print(`Version ${JSON.stringify(process.versions)}`);
const dt = new Date();
const dsa = dt.toLocaleString('en-US', dtOptions);
print(dsa.split('').map(d => d.charCodeAt(0).toString(16)).join(','));
const ds = dsa.replace(/\u200e/g,""); // work-around
print(ds.split('').map(d => d.charCodeAt(0).toString(16)).join(','));
print(`Date ${ds}`);
const dm = ds.match(/^(\d+)\/(\d+)\/(\d+),? ([\d:]+)/);
const dd = {y: dm[3], m: dm[1], d: dm[2], hms: dm[4]}
return dd;
}
catch(e) {
print(`Error ${e}`);
return null;
}
}
print(JSON.stringify(dx(dtOptionsSimple)));
// print('\ndtOptionsChakraBug');
// print(JSON.stringify(dx(dtOptionsChakraBug)));
#### d8, sm
36,2f,32,2f,32,30,31,37,2c,20,31,37,3a,32,38,3a,32,37
36,2f,32,2f,32,30,31,37,2c,20,31,37,3a,32,38,3a,32,37
Date 6/2/2017, 17:28:27
{"y":"2017","m":"6","d":"2","hms":"17:28:27"}
#### ch-1.2.3, ch-1.3.2, ch-1.5.0, ch-1.4.4, ch-dev-rel, ch-master, ch-dev-dbg
200e,36,200e,2f,200e,32,200e,2f,200e,32,30,31,37,200e,20,200e,31,37,200e,3a,200e,32,38,200e,3a,200e,32,37
36,2f,32,2f,32,30,31,37,20,31,37,3a,32,38,3a,32,37
Date 6/2/2017 17:28:27
{"y":"2017","m":"6","d":"2","hms":"17:28:27"}
#### node
36,2f,32,2f,32,30,31,37,2c,20,31,37,3a,32,38,3a,32,37
36,2f,32,2f,32,30,31,37,2c,20,31,37,3a,32,38,3a,32,37
Date 6/2/2017, 17:28:27
{"y":"2017","m":"6","d":"2","hms":"17:28:27"}
#### node-ch
200e,36,200e,2f,200e,32,200e,2f,200e,32,30,31,37,200e,20,200e,31,37,200e,3a,200e,32,38,200e,3a,200e,32,37
36,2f,32,2f,32,30,31,37,20,31,37,3a,32,38,3a,32,37
Date 6/2/2017 17:28:27
{"y":"2017","m":"6","d":"2","hms":"17:28:27"}
if I use dtOptions with timeZone nodejs accepts it and ignore want is doesn't like but Chakra gets an error. For my purpose {hour12:false} is sufficient so I didn't try to fully debug it.
Already tracked in #3096
Tracking as part of #3644
I'm not sure if this should be fixed FWTW. Intl operations are by design supposed to be opaque and no expectations should be made as to the output.
Bidi marks are perfectly valid parts of the result string and any operations that are attempting to retrieve data from the formatted string (like the regexp in comment 0) are working against the logic of Intl operations.
My recommendation would be to close this issue as invalid.
Sure, but it's horribly inconsistent between browsers. Regex on dates is something that developers do all the time. Weird hidden characters have definitely driven me away from Edge.
On Wed, Dec 20, 2017 at 1:46 PM, Zibi Braniecki notifications@github.com wrote:
I'm not sure if this should be fixed FWTW. Intl operations are by design supposed to be opaque and no expectations should be made as to the output.
Bidi marks are perfectly valid parts of the result string and any operations that are attempting to retrieve data from the formatted string (like the regexp in comment 0) are working against the logic of Intl operations.
My recommendation would be to close this issue as invalid.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/Microsoft/ChakraCore/issues/599#issuecomment-353191462, or mute the thread https://github.com/notifications/unsubscribe-auth/AA-JVP1TLEimqutVrwnC58T5i1-2qGECks5tCYAegaJpZM4HzYLg .
They're only weird if you are unfamiliar with the issues of bidirectionality. Bidi chars are needed and we will be adding them. "Developers" also tend to do other bad practices like sync io. I don't believe that ignorance in the subject someone is working on is a good reason to sacrifice the quality of the API.
At this point I'm with keeping the bidi because parsing the output has enough other problems and focusing on TC39 date improvements to get pat today's kludges.