Closed arm1n closed 2 years ago
Hey Armin, could you provide some code examples for the two cases that you mention? Thanks!
Of course. I'm using the HTML parser to grab messages from views via a dedicated <FormattedMessage> component:
<FormattedMessage
message="
{{count}}
multiline
plural
message.
"
plural="
{{count}}
multiline
plural
messages.
"
count={2}
/>;
Even if I could use HtmlExtractors.elementContent for the singular message and get whitespace, trim, and indentation control, I'd still be missing it for the plural message, which always has to be an attribute. Because this indentation isn't rendered, devs at my company tend to write long messages with line breaks like these, but I need to make sure they end up in our translation tool without indentation and line breaks.
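To make the desired result concrete, here is a minimal sketch of the kind of normalization meant above, written as a standalone TypeScript function. The names normalizeAttributeValue and NormalizeOptions are hypothetical and not part of gettext-extractor; the three options are only assumed to behave like trim, indentation, and line-break control.

```typescript
// Hypothetical helper, NOT part of gettext-extractor: normalizes a
// multi-line attribute value before it would be used as a msgid.
interface NormalizeOptions {
  trimWhiteSpace: boolean; // strip whitespace around the final result
  preserveIndentation: boolean; // keep the whitespace around each line
  replaceNewLines: false | string; // join lines with this string, or keep "\n"
}

function normalizeAttributeValue(value: string, options: NormalizeOptions): string {
  let lines = value.split(/\r?\n/);
  if (!options.preserveIndentation) {
    // Remove leading and trailing whitespace of every single line.
    lines = lines.map((line) => line.trim());
  }
  const separator = options.replaceNewLines === false ? "\n" : options.replaceNewLines;
  const joined = lines.join(separator);
  return options.trimWhiteSpace ? joined.trim() : joined;
}

// The plural attribute from the example above, with assumed indentation.
const plural = "\n    {{count}}\n    multiline\n    plural\n    messages.\n  ";
const normalized = normalizeAttributeValue(plural, {
  trimWhiteSpace: true,
  preserveIndentation: false,
  replaceNewLines: " ",
});
console.log(normalized); // {{count}} multiline plural messages.
```

With these settings the translation tool would see a single-line msgid_plural, regardless of how the attribute is indented in the view.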
The second issue is also somewhat annoying, as some devs write something like:
<FormattedMessage message={'Message'} />
<FormattedMessage message={'Message: with whitespace'} />
Such markup, which is valid JSX, ends up in the PO files as msgid "{'Message'}" and is even broken if it contains whitespace: msgid "{'Message".
I know that it's an HTML parser, so a sanitization callback as an extractor option would be helpful for doing some RegExp work on the value, maybe something like { processValue: (value: string) => value } as a content option. What do you think?
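As an illustration of what such a callback could do, here is a minimal sketch in TypeScript. The ProcessValue type follows the processValue shape suggested above and is an assumption, not an existing gettext-extractor option; unwrapJsxStringLiteral is likewise a hypothetical name.

```typescript
// Hypothetical callback type, following the suggested `processValue` shape.
type ProcessValue = (value: string) => string;

// Unwraps a JSX expression container that holds exactly one plain string
// literal without escape sequences, e.g. {'Message'} or {"Message"}.
// Any other value is returned unchanged.
const unwrapJsxStringLiteral: ProcessValue = (value) => {
  const match = /^\{\s*(['"])((?:(?!\1)[^\\])*)\1\s*\}$/.exec(value.trim());
  return match ? match[2] : value;
};

console.log(unwrapJsxStringLiteral("{'Message'}")); // Message
console.log(unwrapJsxStringLiteral("{count} items")); // {count} items
```

A callback like this can only help when the parser hands over the complete attribute value. A value that has already been cut off at the first whitespace, as in the second msgid above, cannot be repaired this way.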
@lukasgeiter ping :) I don't want to stress you, I just want to know how to proceed. If you're willing to accept changes, I'd try to work on a PR; if not, I'll go with a custom extractor. Thanks for your response!
Thanks for the details and apologies for my late response.
I agree that it would make sense to have content options for attributes as well. Feel free to implement this and open a PR 🙂
The second issue I would rather address by properly supporting JSX. Using the HTML parser for JSX is always going to cause issues like this and if you want to work around them I suggest you do so using a custom extractor.
Thanks for the quick reply. Okay, then I'll try to incorporate the content options into HtmlExtractors.elementAttribute and also HtmlExtractors.elementContent (plural) and draft a PR.
I totally agree that it would be best to have a dedicated JSX extractor to avoid such problems. So if I understand you correctly, you plan to add one in the future? I can circumvent the problem above with an ESLint rule that disallows expressions containing only a string literal, so I don't have to go the custom extractor route, especially when you plan to have one in the future :)
I would like to implement a JSX extractor at some point (along with many other improvements). That said, I'm currently pretty busy with other things in my life so I wouldn't hold my breath.
Released with v3.6.0
Hi Lukas,
first of all, thanks for your great piece of software, it works like a charm. I'd like to suggest one enhancement that would avoid the need for custom extractors if it were built into the current implementation, especially because all the required tools are already there, as they're used in HtmlExtractors.elementContent.
Would it be possible to perform content normalization on extracted attribute values as well? Even if I work around the missing support in HtmlExtractors.elementAttribute by using HtmlExtractors.elementContent, I'm still missing the content options when dealing with the textPlural attribute. For this reason it would be great to have these options there as well.
Furthermore, some possibility of content sanitization would be useful as well. I had the case where JSX expressions in attributes end up as {'Message'} in the POT string. Offering some kind of callback in the extractor options would provide the flexibility to act on such cases.
What's your take on that? Thanks in advance!