formatjs / formatjs-old

The monorepo home to all of the FormatJS related libraries.
https://formatjs.io/

FR: Ability to retain whitespace in AST #75

Closed (gadicc closed this 5 years ago)

gadicc commented 9 years ago

Hi there, I'm the author of meteor-messageformat, which uses intl-messageformat in its next iteration. Some of our users would like the ability to retain whitespace in the final translated string, which might then, for example, be parsed later as Markdown.

Would this be possible? I assume this isn't a feature most users will want, so I'm guessing the biggest job would be allowing intl-messageformat to accept options that could be passed down to the parser. What are your thoughts?

caridy commented 9 years ago

I don't think ICU cares about the spaces in the message; maybe @SlexAxton or @ericf will know better. Our current parser (written by @ericf) might be doing some trimming. PR #8 is supposed to revamp the parser, but we haven't had a chance to get to that.

SlexAxton commented 9 years ago

ICU cares about the spaces in the output-text portion of the full message, but not in the control statements. (I don't have proof/spec-quotation, but that's been my observation)

ericf commented 9 years ago

> ICU cares about the spaces in the output-text portion of the full message, but not in the control statements. (I don't have proof/spec-quotation, but that's been my observation)

I remember reading this somewhere as well, which is why it's ignored. There's no whitespace data in the current AST format, but maybe you could write a pretty printer that follows some default formatting rules, like this for plurals:

You have {numMessages, plural,
  one {# message}
  other {# messages}
} in your inbox.

ericf commented 9 years ago

Here's a basic AST --> ICU Message string printer I happened to write yesterday if you'd like it as a starting point:

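export default function printICUMessage(ast) {
    return ast.elements.reduce((message, el) => {
        const {format, id, type, value} = el;

        // Literal text is emitted as-is, including its whitespace.
        if (type === 'messageTextElement') {
            return message + value;
        }

        // A bare argument such as {name} has no format.
        if (!format) {
            return message + `{${id}}`;
        }

        // 'pluralFormat' -> 'plural', 'numberFormat' -> 'number', etc.
        const formatType = format.type.replace(/Format$/, '');

        switch (formatType) {
            case 'number':
            case 'date':
            case 'time': {
                const style = format.style ? `, ${format.style}` : '';
                return message + `{${id}, ${formatType}${style}}`;
            }

            case 'plural':
            case 'selectOrdinal':
            case 'select': {
                // ICU spells the ordinal keyword in lowercase.
                const keyword = formatType === 'selectOrdinal' ? 'selectordinal' : formatType;
                // The offset follows the keyword's comma and takes no comma of
                // its own: {n, plural, offset:1 one {...} other {...}}
                const offset = format.offset ? ` offset:${format.offset}` : '';
                const options = format.options.reduce((str, option) => {
                    const optionValue = printICUMessage(option.value);
                    return str + ` ${option.selector} {${optionValue}}`;
                }, '');
                return message + `{${id}, ${keyword},${offset}${options}}`;
            }

            default:
                // Fail loudly instead of returning undefined into the reducer.
                throw new Error(`Unsupported format type: ${format.type}`);
        }
    }, '');
}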
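
For the pretty-printing idea mentioned earlier in the thread, a variant of the printer above could put each plural or select option on its own indented line. The following is only a rough sketch: the name prettyPrintICUMessage and the depth parameter are hypothetical, and the hand-built AST assumes the same element shape the printer above reads (messageTextElement, plus elements with id and an optional format carrying options). It does not handle selectordinal.

```javascript
// Sketch only: prettyPrintICUMessage and `depth` are hypothetical names, and
// the AST shape is assumed from the printer above.
function prettyPrintICUMessage(ast, depth = 0) {
    const indent = '  '.repeat(depth + 1); // indentation for option lines
    const closing = '  '.repeat(depth);    // indentation for the closing brace

    return ast.elements.reduce((message, el) => {
        const {format, id, type, value} = el;

        // Output text keeps its whitespace untouched.
        if (type === 'messageTextElement') {
            return message + value;
        }
        if (!format) {
            return message + `{${id}}`;
        }

        const formatType = format.type.replace(/Format$/, '');

        // Formats without options (number, date, time) stay on one line.
        if (!Array.isArray(format.options)) {
            const style = format.style ? `, ${format.style}` : '';
            return message + `{${id}, ${formatType}${style}}`;
        }

        // plural / select: one option per line, nested options indented further.
        const offset = format.offset ? ` offset:${format.offset}` : '';
        const options = format.options
            .map((option) => {
                const body = prettyPrintICUMessage(option.value, depth + 1);
                return `${indent}${option.selector} {${body}}`;
            })
            .join('\n');
        return message + `{${id}, ${formatType},${offset}\n${options}\n${closing}}`;
    }, '');
}

// Hand-built AST for the inbox message shown earlier (shape is an assumption).
const text = (value) => ({type: 'messageTextElement', value});
const pattern = (...elements) => ({type: 'messageFormatPattern', elements});

const ast = pattern(
    text('You have '),
    {
        type: 'argumentElement',
        id: 'numMessages',
        format: {
            type: 'pluralFormat',
            offset: 0,
            options: [
                {selector: 'one', value: pattern(text('# message'))},
                {selector: 'other', value: pattern(text('# messages'))},
            ],
        },
    },
    text(' in your inbox.')
);

console.log(prettyPrintICUMessage(ast));
```

The added line breaks and indentation all fall inside the control portion of the plural argument, so under the observation above that ICU ignores whitespace there, the printed message should format the same as a single-line version.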