messageformat / parser

A PEG.js parser for ICU MessageFormat strings
https://messageformat.github.io/
MIT License
7 stars 5 forks source link

argStyleText is not supported #1

Closed nkovacs closed 6 years ago

nkovacs commented 7 years ago

The second string is valid according to ICU, but can't be parsed by messageformat-parser:

var parse = require('messageformat-parser').parse;

console.log(parse("{0,date,long}"));
console.log(parse("{0,date,y-M-d HH:mm:ss zzzz}"));

I know messageformat.js doesn't support the second one, but I'm using only the parser for globalize.js, which would be able to support the second one.

nkovacs commented 7 years ago

This is also valid in ICU:

{var,date,y-M-d,HH:mm:ss,zzzz}

In this case, argStyle should be y-M-d,HH:mm:ss,zzzz (there can only be one argStyle), but messageformat-parser instead parses it as three separate arguments.

eemeli commented 6 years ago

Re-opening this, because tbh we should drop the current default .split(',').trim() processing, and assume that the current strictFunctionParams is always true. That would make us more spec-compliant, let us get rid of an option, and make issues like messageformat/messageformat.js#185 much easier to solve. It should be up to the called function to handle the string input it receives.

One open question here is whether the output should be trimmed of white space. The spec is unclear on this. I think we should trim, as that would match what we're doing everywhere else as well.

SlexAxton commented 6 years ago

Agree on trim. On Sat, Mar 17, 2018 at 8:42 AM Eemeli Aro notifications@github.com wrote:

Re-opening this, because tbh we should drop the current default .split(',').trim() processing, and assume that the current strictFunctionParams is always true. That would make us more spec-compliant, let us get rid of an option, and make issues like messageformat/messageformat.js#185 https://github.com/messageformat/messageformat.js/issues/185 much easier to solve. It should be up to the called function to handle the string input it receives.

One open question here is whether the output should be trimmed of white space. The spec is unclear on this. I think we should trim, as that would match what we're doing everywhere else as well.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/messageformat/parser/issues/1#issuecomment-373921034, or mute the thread https://github.com/notifications/unsubscribe-auth/AAF5KoeGX7fLsK86AzWZoWof0rX5cz_Cks5tfRLbgaJpZM4NypE9 .

eemeli commented 6 years ago

I left the messageformat-parser output untrimmed for now, but added a trimmer in the messageformat compiler. This way if someone else wants to use the parser, they get to make up their own mind on this, but our formatters don't need to all trim their inputs separately.

mhuebert commented 6 years ago

Overall this change made my life easier, thanks!

One question; since the string is trimmed, is there a way to escape leading or trailing space characters to have them included? I don't see any mention of this in the formatting docs.

SlexAxton commented 6 years ago

Personally if leading/trailing space is significant, I add it outside of the translated message since a translator will have no idea anyways. On Thu, Aug 16, 2018 at 3:28 PM Matt Huebert notifications@github.com wrote:

Overall this change made my life easier, thanks!

One question; since the string is trimmed, is there a way to escape leading or trailing space characters to have them included? I don't see any mention of this in the formatting docs.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/messageformat/parser/issues/1#issuecomment-413703518, or mute the thread https://github.com/notifications/unsubscribe-auth/AAF5Km9gKZ5NszM_Ds9TgjEnjtmFTLiRks5uRfHxgaJpZM4NypE9 .