Open kavimuru opened 1 year ago
Confirmed that this happens on Android native app.
Confirmed it does not happen on Mac/Chrome. But related, when I try the same reproduction steps on iOS native, iOS Chrome and Mac Safari - I cannot italicize emojis at all. Is that expected?
Job added to Upwork: https://www.upwork.com/jobs/~017d08e0c09284d31f
Current assignee @anmurali is eligible for the External assigner, not assigning anyone new.
Triggered auto assignment to Contributor-plus team member for initial proposal review - @aimane-chnaif (External
)
Triggered auto assignment to @flodnv (External
), see https://stackoverflow.com/c/expensify/questions/7972 for more details.
Quick fix:
diff --git a/src/styles/styles.js b/src/styles/styles.js
index b826ed336f..84286591e9 100644
--- a/src/styles/styles.js
+++ b/src/styles/styles.js
@@ -63,6 +63,7 @@ const webViewStyles = {
em: {
fontFamily: fontFamily.EXP_NEUE,
fontStyle: 'italic',
+ width: variables.fontSizeNormalHeight,
},
del: {
I don't think we should be able to italicize emojis, asked here: https://expensify.slack.com/archives/C01GTK53T8Q/p1675266083150699
Agree with @flodnv.
The Emoji is italicized because it is contained in the <em></em>
tag (same as when we make a text italic). We can observe the behavior on both Chrome web and Android.
In order to remove the ability to italicize emojis, I'll wrap any emoji that are inside the <em></em>
with a expensify-emoji
custom HTML tag, and give it a fontStyle: 'normal'
style. This is the same way that Github does it.
(The emoji regex below is taken from the CONST.REGEX.EMOJIS
in Expensify app)
In https://github.com/Expensify/expensify-common
diff --git a/lib/ExpensiMark.js b/lib/ExpensiMark.js
index 7f94ffe..057e87c 100644
--- a/lib/ExpensiMark.js
+++ b/lib/ExpensiMark.js
@@ -134,6 +134,11 @@ export default class ExpensiMark {
regex: /(?!_blank")\b_((?!\s).*?\S)_\b(?![^<]*(<\/pre>|<\/code>|<\/a>|_blank))/g,
replacement: '<em>$1</em>',
},
+ {
+ name: 'emoji',
+ regex: /(<em>.*?)([\p{Extended_Pictographic}\u200d\u{1f1e6}-\u{1f1ff}\u{1f3fb}-\u{1f3ff}\u{e0020}-\u{e007f}\u20E3\uFE0F]|[#*0-9]\uFE0F?\u20E3)(.*?<\/em>)/gu,
+ replacement: '$1<expensify-emoji>$2</expensify-emoji>$3',
+ },
{
// Use \B in this case because \b doesn't match * or ~.
// \B will match everything that \b doesn't, so it works
In https://github.com/Expensify/App
diff --git a/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js b/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js
index 456cd1ecf3..0bf75ea9d2 100755
--- a/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js
+++ b/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js
@@ -39,6 +39,10 @@ const customHTMLElementModels = {
tagName: 'email-comment',
mixedUAStyles: {whiteSpace: 'normal'},
}),
+ 'expensify-emoji': defaultHTMLElementModels.span.extend({
+ tagName: 'expensify-emoji',
+ mixedUAStyles: {fontStyle: 'normal'},
+ }),
};
const defaultViewProps = {style: [styles.alignItemsStart, styles.userSelectText]};
There're small possible variations based on our preference:
expensify-emoji
tag but just use a span
tag with a style. I prefer using a custom tag to make it clearThere're some further minor fixes required:
em
, no other text, the emoji will not be italicized but it will also not have the onlyEmojisText
style, we can fix that with another regex.Expensify-emoji
tagI'll add those if we want to go with this direction.
Pls note we have to fix the same in the BE as well, in order to test the above fix we need to enable offline mode.
working well after the fix:
@tienifr Your approach looks solid however the regex is not correct as it will only match one emoji, try sending hi 😁 there 😁
, it will wrap the first emoji but not the second one. Also it will only work with em
tags and I think it would be better to wrap the emojis all the time (this will also cover the bold case).
Here are my suggestions:
{
name: 'emoji',
regex: /[\p{Extended_Pictographic}\u200d\u{1f1e6}-\u{1f1ff}\u{1f3fb}-\u{1f3ff}\u{e0020}-\u{e007f}\u20E3\uFE0F]|[#*0-9]\uFE0F?\u20E3/gu,
replacement: match => `<expensify-emoji>${match}</expensify-emoji>`,
},
2. Move the regex `emoji` to be before `italic` and after `autolink` (Not necessary, just a personal opinion as I like to preserve the order of italic, bold, strikethrough regex rules)
3. In `mixedUAStyles: {fontStyle: 'normal'}` add `fontWeight: 'normal'`
Thanks @s77rt for your suggestion!
Adding to the original proposal, Here's the my updated diff for the regex so that it matches all rather than just one.
diff --git a/lib/ExpensiMark.js b/lib/ExpensiMark.js
index c535048..62b183f 100644
--- a/lib/ExpensiMark.js
+++ b/lib/ExpensiMark.js
@@ -242,6 +242,11 @@ export default class ExpensiMark {
regex: /<(em|i)(?:"[^"]*"|'[^']*'|[^'">])*>(.*?)<\/\1>(?![^<]*(<\/pre>|<\/code>))/gi,
replacement: '_$2_',
},
+ {
+ name: 'emoji',
+ regex: /([\p{Extended_Pictographic}\u200d\u{1f1e6}-\u{1f1ff}\u{1f3fb}-\u{1f3ff}\u{e0020}-\u{e007f}\u20E3\uFE0F]|[#*0-9]\uFE0F?\u20E3)/gu,
+ replacement: '<expensify-emoji>$1</expensify-emoji>',
+ },
{
name: 'bold',
regex: /<(b|strong)(?:"[^"]*"|'[^']*'|[^'">])*>([\s\S]*?)<\/\1>(?![^<]*(<\/pre>|<\/code>))/gi,
@s77rt Regarding the fontWeight: 'normal'
, I don't feel strongly about since it doesn't seem that the bold fontWeight affects how the emoji looks.
@tienifr
fontWeight: 'normal'
ensures we are rendering the emoji correctly@s77rt I tried to zoom the emojis in but can't recognize any difference between an emoji that has fontWeight: 'bold'
and a normal one. But we surely can add that if we want to be extra safe in that bold case (Github also does it).
Thanks for the input!
diff --git a/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js b/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js
index 456cd1ecf3..0bf75ea9d2 100755
--- a/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js
+++ b/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js
@@ -39,6 +39,10 @@ const customHTMLElementModels = {
tagName: 'email-comment',
mixedUAStyles: {whiteSpace: 'normal'},
}),
+ 'expensify-emoji': defaultHTMLElementModels.span.extend({
+ tagName: 'expensify-emoji',
+ mixedUAStyles: {fontStyle: 'normal', fontWeight: 'normal'},
+ }),
};
const defaultViewProps = {style: [styles.alignItemsStart, styles.userSelectText]};
@tienifr While seaching I found this image.
The middle emoji is in bold but it's hard to notice also it may depend on the font. Adding fontWeight: 'normal'
is more a safety step.
@flodnv, @anmurali, @aimane-chnaif Eep! 4 days overdue now. Issues have feelings too...
@flodnv, @anmurali, @aimane-chnaif Huh... This is 4 days overdue. Who can take care of this?
reviewing proposals shortly
I was trying to find a solution for this and stumbled upon this issue. I would like to propose an alternative solution that would be useful later.
I agree with the above proposal to enclose all emoji with a custom tag to apply a custom style later on.
But the first thing to do is to change the emoji regex.
I previously worked on this issue by providing a new regex and it works well for that case. However, when we use that regex to match composite emojis, for example 👨👨👦, it will produce this array of string:
['👨', '', '👨', '', '👦']
The empty string is zero-width joiner. Again, this works well for the case I mentioned from the previous paragraph (we only care what the string contains). But for this case, we need to replace the emoji and enclose it with a tag. If we use the current emoji regex, it will enclose each part of the emoji with the tag, instead of the composited emoji.
Expected:
<emoji>👨👨👦</emoji>
Actual:
<emoji>👨</emoji><emoji></emoji><emoji>👨</emoji> and so on...
So, we need a better regex to capture the composite emoji, below is the new regex.
diff --git a/src/CONST.js b/src/CONST.js
index ae3ffca7f..8399de047 100755
--- a/src/CONST.js
+++ b/src/CONST.js
@@ -790,7 +790,7 @@ const CONST = {
WEBSITE: /^((https?|ftp):\/\/)(([a-z\d]([a-z\d-]*[a-z\d])*)\.)+[a-z]{2,}(:\d+)?(\/[-a-z\d%_.~+]*)*(\?[;&a-z\d%_.~+=-]*)?(#[-a-z\d_]*)?$/i,
// eslint-disable-next-line max-len, no-misleading-character-class
- EMOJIS: /[\p{Extended_Pictographic}\u200d\u{1f1e6}-\u{1f1ff}\u{1f3fb}-\u{1f3ff}\u{e0020}-\u{e007f}\u20E3\uFE0F]|[#*0-9]\uFE0F?\u20E3/gu,
+ EMOJIS: /[\p{Extended_Pictographic}](\u200D[\p{Extended_Pictographic}]|[\u{1F3FB}-\u{1F3FF}]|[\u{E0020}-\u{E007F}]|\uFE0F|\u20E3)*|[\u{1F1E6}-\u{1F1FF}]{2}|[#*0-9]\uFE0F?\u20E3/gu,
TAX_ID: /^\d{9}$/,
NON_NUMERIC: /\D/g,
EMOJI_NAME: /:[\w+-]+:/g,
I have tested the new regex against all emojis from assets/emojis
file and make sure all EmojiTest.js
test cases pass.
Here is how I test it:
diff --git a/tests/unit/EmojiTest.js b/tests/unit/EmojiTest.js
index 25626872c..b68bfbca0 100644
--- a/tests/unit/EmojiTest.js
+++ b/tests/unit/EmojiTest.js
@@ -1,5 +1,6 @@
import _ from 'underscore';
import Emoji from '../../assets/emojis';
+import CONST from '../../src/CONST';
import * as EmojiUtils from '../../src/libs/EmojiUtils';
describe('EmojiTest', () => {
@@ -11,12 +12,16 @@ describe('EmojiTest', () => {
}
// When we match every Emoji Code
- const isEmojiMatched = EmojiUtils.containsOnlyEmojis(emoji.code);
+ const isEmojiMatched = EmojiUtils.containsOnlyEmojis(emoji.code)
+ && emoji.code.match(CONST.REGEX.EMOJIS).length === 1;
let skinToneMatched = true;
if (emoji.types) {
// and every skin tone variant of the Emoji code
- skinToneMatched = _.every(emoji.types, emojiWithSkinTone => EmojiUtils.containsOnlyEmojis(emojiWithSkinTone));
+ skinToneMatched = _.every(emoji.types, emojiWithSkinTone => {
+ return EmojiUtils.containsOnlyEmojis(emojiWithSkinTone)
+ && emoji.code.match(CONST.REGEX.EMOJIS).length === 1;
+ });
}
return skinToneMatched && isEmojiMatched;
});
Here is the code part to enclose the emoji with
diff --git a/src/pages/home/report/ReportActionItemFragment.js b/src/pages/home/report/ReportActionItemFragment.js
index c4f593517..4f34733cc 100644
--- a/src/pages/home/report/ReportActionItemFragment.js
+++ b/src/pages/home/report/ReportActionItemFragment.js
@@ -108,7 +108,8 @@ const ReportActionItemFragment = (props) => {
// Only render HTML if we have html in the fragment
if (!differByLineBreaksOnly) {
const editedTag = props.fragment.isEdited ? '<edited></edited>' : '';
- const htmlContent = html + editedTag;
+ const htmlContent = EmojiUtils.encloseEmojisWithTag(html, 'emoji') + editedTag;
+
return (
<RenderHTML
html={props.source === 'email'
diff --git a/src/libs/EmojiUtils.js b/src/libs/EmojiUtils.js
index d6ed8875e..dce4ad93c 100644
--- a/src/libs/EmojiUtils.js
+++ b/src/libs/EmojiUtils.js
@@ -245,6 +245,10 @@ function suggestEmojis(text, limit = 5) {
return [];
}
+function encloseEmojisWithTag(text, tag) {
+ return Str.replaceAll(text, CONST.REGEX.EMOJIS, g => `<${tag}>${g}</${tag}>`);
+}
+
export {
getDynamicHeaderIndices,
mergeEmojisWithFrequentlyUsedEmojis,
@@ -253,4 +257,5 @@ export {
replaceEmojis,
suggestEmojis,
trimEmojiUnicode,
+ encloseEmojisWithTag,
};
Now, apply the style to the custom tag
diff --git a/src/styles/styles.js b/src/styles/styles.js
index 2b2a3498d..8e7d163af 100644
--- a/src/styles/styles.js
+++ b/src/styles/styles.js
@@ -1065,6 +1065,11 @@ const styles = {
textDecorationLine: 'none',
},
+ emojisText: {
+ fontStyle: 'normal',
+ fontWeight: 'normal',
+ },
+
onlyEmojisText: {
fontSize: variables.fontSizeOnlyEmojis,
lineHeight: variables.fontSizeOnlyEmojisHeight,
diff --git a/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js b/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js
index 456cd1ecf..fcfe6b809 100755
--- a/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js
+++ b/src/components/HTMLEngineProvider/BaseHTMLEngineProvider.js
@@ -39,6 +39,10 @@ const customHTMLElementModels = {
tagName: 'email-comment',
mixedUAStyles: {whiteSpace: 'normal'},
}),
+ 'emoji': defaultHTMLElementModels.span.extend({
+ tagName: 'emoji',
+ mixedUAStyles: styles.emojisText,
+ }),
};
const defaultViewProps = {style: [styles.alignItemsStart, styles.userSelectText]};
The last problem is when we type emojis inside code (`) tag, the text does not show, for example
😄. The text will be converted to
. We have a custom rendered for code tag and it's only taking a text inside the code tag, but the emoji is enclosed with
emoji` tag. The solution is to get the text inside of it.
The getEmojiTextInsideCode will iterate over the children inside the code tag and get the data (text) of it. For example, if we have this text a😄
, the children will be <TText data="a" />
and <TText data="😄" />
.
diff --git a/src/components/HTMLEngineProvider/htmlEngineUtils.js b/src/components/HTMLEngineProvider/htmlEngineUtils.js
index 71a5e9267..fb588ff08 100644
--- a/src/components/HTMLEngineProvider/htmlEngineUtils.js
+++ b/src/components/HTMLEngineProvider/htmlEngineUtils.js
@@ -44,8 +44,13 @@ function isInsideComment(tnode) {
return false;
}
+function getEmojiTextInsideCode(tnode) {
+ return tnode.children.reduce((prev, curr) => prev + curr.data, '');
+}
+
export {
computeEmbeddedMaxWidth,
isInsideComment,
isCommentTag,
+ getEmojiTextInsideCode,
};
diff --git a/src/components/InlineCodeBlock/index.js b/src/components/InlineCodeBlock/index.js
index af33b4468..222b7d877 100644
--- a/src/components/InlineCodeBlock/index.js
+++ b/src/components/InlineCodeBlock/index.js
@@ -1,6 +1,7 @@
import React from 'react';
import inlineCodeBlockPropTypes from './inlineCodeBlockPropTypes';
import Text from '../Text';
+import * as HTMLEngineUtils from '../HTMLEngineProvider/htmlEngineUtils';
const InlineCodeBlock = (props) => {
const TDefaultRenderer = props.TDefaultRenderer;
@@ -12,7 +13,9 @@ const InlineCodeBlock = (props) => {
<Text
style={{...props.boxModelStyle, ...props.textStyle}}
>
- {props.defaultRendererProps.tnode.data}
+ {props.defaultRendererProps.tnode.data
+ ? props.defaultRendererProps.tnode.data
+ : HTMLEngineUtils.getEmojiTextInsideCode(props.defaultRendererProps.tnode)}
</Text>
</TDefaultRenderer>
);
diff --git a/src/components/InlineCodeBlock/index.native.js b/src/components/InlineCodeBlock/index.native.js
index 8431ce61e..04adad238 100644
--- a/src/components/InlineCodeBlock/index.native.js
+++ b/src/components/InlineCodeBlock/index.native.js
@@ -2,6 +2,7 @@ import React from 'react';
import styles from '../../styles/styles';
import WrappedText from './WrappedText';
import inlineCodeBlockPropTypes from './inlineCodeBlockPropTypes';
+import * as HTMLEngineUtils from '../HTMLEngineProvider/htmlEngineUtils';
const InlineCodeBlock = (props) => {
const TDefaultRenderer = props.TDefaultRenderer;
@@ -17,7 +18,9 @@ const InlineCodeBlock = (props) => {
styles.codeWordStyle,
]}
>
- {props.defaultRendererProps.tnode.data}
+ {props.defaultRendererProps.tnode.data
+ ? props.defaultRendererProps.tnode.data
+ : HTMLEngineUtils.getEmojiTextInsideCode(props.defaultRendererProps.tnode)}
</WrappedText>
</TDefaultRenderer>
);
Finally, I thought this improvement should be held waiting for this issue.
thanks @bernhardoj for the additional insights 💪
I agree there’ll be some fixes here and there to make sure it works 100% in all cases, some of which were mentioned earlier and some you mentioned above.
Once the direction is confirmed we can dive deeper and resolve the remaining (potential) issues.
@bernhardoj in the case of the 👨👨👦 emoji, it's indeed split into sub-emojis but it still renders correctly on the screen (since the sub-emojis are in the consecutive span tags and still render the correct composite emoji).
Just curious if you find any case of an emoji that has user-visible problem if it's not matched by your updated regex?
Surprisingly it worked on web, but that's not the case for native platform.
opened internal discussion here for ideal solution
@flodnv @anmurali @aimane-chnaif this issue was created 2 weeks ago. Are we close to approving a proposal? If not, what's blocking us from getting this issue assigned? Don't hesitate to create a thread in #expensify-open-source to align faster in real time. Thanks!
@flodnv @anmurali @aimane-chnaif this issue was created 2 weeks ago. Are we close to approving a proposal? If not, what's blocking us from getting this issue assigned? Don't hesitate to create a thread in #expensify-open-source to align faster in real time. Thanks!
Upwork job price has been updated to $2000
@flodnv, @anmurali, @aimane-chnaif Whoops! This issue is 2 days overdue. Let's get this updated quick!
We're still discussing with @aimane-chnaif and @rushatgabhane
No update over the weekend.
@flodnv @anmurali @aimane-chnaif this issue is now 3 weeks old. There is one more week left before this issue breaks WAQ and will need to go internal. What needs to happen to get a PR in review this week? Please create a thread in #expensify-open-source to discuss. Thanks!
@aimane-chnaif bump to resolve the slack conversation https://expensify.slack.com/archives/C02NK2DQWUX/p1676299667312469
still discussing in slack
We can add Help Wanted
label back since no acceptable proposal yet. We're still looking for best solution.
cc: @flodnv @anmurali
We're still looking for best solution.
I don't have access to internal discussion. If you are still waiting for alternative solutions, could you please explain the point of discussion briefly? It seems that @bernhardoj 's soultion fits - it solves this problem and provides solution for another issue. Or it's just doesn't work on native?
We want add a markdown rule that removes emojis from em
tag
Markdown: _hello🙂world!_
Current html: <em>hello🙂world!</em>
Target html: <em>hello</em>🙂<em>world!</em>
This should be clear solution and won't require any new tags or backend updates. We're going to award this job to someone who adds this rule in markdown->html level.
Emoji has italic style if put in _
_
in markdown.
The emoji inside <em>
tag in html has italic style.
As mentioned by @aimane-chnaif above, we can update the italic rule to exclude the emoji from the em
tag. We can do that by adding a post
function in the current italic rule:
post: (text) => {
return text.replace(/(([\p{Extended_Pictographic}](\u200D[\p{Extended_Pictographic}]|[\u{1F3FB}-\u{1F3FF}]|[\u{E0020}-\u{E007F}]|\uFE0F|\u20E3)*|[\u{1F1E6}-\u{1F1FF}]{2}|[#*0-9]\uFE0F?\u20E3)+)/gu, '</em>$1<em>')
}
This will replace all emoji group inside the <em>
, </em>
by '</em>$1<em>'
, which will achieve the desired behavior.
We can further refine this by removing redundant <em></em>
that doesn't have any content inside. This can happen if an emoji happens to be right next to the leading and trailing _
.
There'll be further changes needed for other conversion rules (for example html to markdown), but as mentioned above we only need the rule in markdown->html level
first.
Another way is instead of the regex above to handle the emojis, we can have a regex to replace all non-emoji character group
with <em>$1</em>
(the <em>
or </em>
will be omitted if the group is next to the start or end of the tag)
We can further refine this by removing redundant that doesn't have any content inside.
@tienifr I need this solution as well.
For now, your regex doesn't work for me at all. Please try with this:
💯💜🥳
_💯💜🥳_
_💯_ _💜_ _🥳_
Expected result should show like this:
After converting to:
💯💜🥳
💯💜🥳
💯 💜 🥳
@tienifr I need this solution as well.
@aimane-chnaif sure.
This rule in https://github.com/Expensify/expensify-common/blob/f2fb81231b2a6dbf141fe8b721d779b67c8a9c37/lib/ExpensiMark.js#L134 will work for all the cases you mentioned:
{
name: 'italic',
regex: /(?!_blank")\b_((?!\s)[\s\S]*?\S)_\b(?![^<]*(<\/pre>|<\/code>|<\/a>|_blank))/g,
replacement: (match, g1) => {
const replacedText = `<em>${g1}</em>`;
const textWithEmojiExcluded = replacedText.replace(/(([\p{Extended_Pictographic}](\u200D[\p{Extended_Pictographic}]|[\u{1F3FB}-\u{1F3FF}]|[\u{E0020}-\u{E007F}]|\uFE0F|\u20E3)*|[\u{1F1E6}-\u{1F1FF}]{2}|[#*0-9]\uFE0F?\u20E3)+)/gu, '</em>$1<em>');
const textWithEmptyTagRemoved = textWithEmojiExcluded.replace(/<em><\/em>/g, '');
return textWithEmptyTagRemoved;
}
},
(pls note the change in replacement
to handle the replacement steps)
If we do this, can we convert it back to the markdown 100% the same because now we have multiple _
?
For example, we convert from this markdown _a🥳b🥳c_
to the html <em>a</em>🥳<em>b</em>🥳<em>c</em>
. When we convert it back to the html, we will get _a_🥳_b_🥳_c_
.
@bernhardoj thanks for raising this concern.
After discussion, we agreed markdown should always be consistent between new message and editing message.
So it's preferred to fix this in render level. But regarding conversion logic, still suggest to escape emoji from <em>
tag than wrapping with new tag.
Another way is instead of the regex above to handle the emojis, we can have a regex to replace all non-emoji character group with $1 (the or will be omitted if the group is next to the start or end of the tag)
@aimane-chnaif here's another rule for this alternative, at first glance it might look cleaner but it can be more buggy since we're negating an emoji regex and we need to further cover all cases of zero-width string & similar invisible characters.
{
name: 'italic',
regex: /(?!_blank")\b_((?!\s)[\s\S]*?\S)_\b(?![^<]*(<\/pre>|<\/code>|<\/a>|_blank))/g,
replacement: (match, g1) => {
return g1.replace(/(((?!([\p{Extended_Pictographic}](\u200D[\p{Extended_Pictographic}]|[\u{1F3FB}-\u{1F3FF}]|[\u{E0020}-\u{E007F}]|\uFE0F|\u20E3)*|[\u{1F1E6}-\u{1F1FF}]{2}|[#*0-9]\uFE0F?\u20E3)+)[^\u200D])+)/gu, '<em>$1</em>');
}
}
I think we should go with the rule above, but just putting this here as an option to consider.
@tienifr let's say user send 2 new messages _hello_✋_world_
and _hello✋world_
Both are converted to same html: <em>hello</em>✋<em>world</em>
When edit message, we should convert html back to markdown.
How will you know original markdown?
@aimane-chnaif in this case we will need to make the htmlToMarkdown conversion consistent with 1 style of putting the emoji. I think it should be converted to _hello✋world_
in both cases. Both renders the same way for the user, so the redundant wrapping around the emoji is safe to be removed.
@tienifr if users input _hello_✋_world_
, they still wanna see _hello_✋_world_
on edit
@aimane-chnaif it might not be possible since we're not preserving the order of the markdown in the html. Can you discuss in the Slack thread about this issue to see if we change our mind on which way we want to go? (if we use a custom emoji tag like Github does then we'll not have this problem)
it might not be possible since we're not preserving the order of the markdown in the html
To workaround this I can imagine we can put some special properties in the em
tag to indicate this, however it's like a workaround and can be buggy.
@tienifr we already discussed. that's why we suggest to fix this in render level
@rushatgabhane can you please jump in this discussion since you suggested escape approach?
I think all of discussions were based on avoiding backend fix.
We don't want to apply bold and italic style to emoji.
Emoji inside _
and *
markdown will receive bold and italic text style.
/[\p{Extended_Pictographic}](\u200D[\p{Extended_Pictographic}]|[\u{1F3FB}-\u{1F3FF}]|[\u{E0020}-\u{E007F}]|\uFE0F|\u20E3)*|[\u{1F1E6}-\u{1F1FF}]{2}|[#*0-9]\uFE0F?\u20E3/gu
~ The emoji regex is already updated</em|b|s>emoji<em|b|s>
and remove empty tag.
https://github.com/Expensify/App/blob/a4d01a87b242eef06af37db965bd135745e342e5/src/pages/home/report/comment/TextCommentFragment.tsx#L60Details:
We can create a new util function, let's call it unwrapEmojis
.
First, we need a new regex constructed from the emoji tag to match all emojis inside bold, italic, or strikethrough tag.
const emojiTagRegex = /<emoji( islarge)?>.*<\/emoji>/g;
const textWithEmojiRegex = RegExp(`<(b|strong|em|i|s|del)>.*(${emojiTagRegex.source})+.*<\\/\\1>`, emojiTagRegex.flags)
Next, we match the whole text with the new regex and then replace the emojis with the tag mentioned above (step 2).
Last, we clean the empty tag.
It will look like this:
function unwrapEmojis(text) {
const emojiTagRegex = /<emoji( islarge)?>.*<\/emoji>/g;
const textWithEmojiRegex = RegExp(`<(b|strong|em|i|s|del)>.*(${emojiTagRegex.source})+.*<\\/\\1>`, emojiTagRegex.flags)
return text.replace(textWithEmojiRegex, (match, g1) => {
return match.replace(emojiTagRegex, (match) => `</${g1}>${match}<${g1}>`).replace(new RegExp(`<${g1}></${g1}>`, 'g'), '')
});
}
and we can use it like this:
<RenderCommentHTML
source={source}
html={unwrapEmojis(htmlWithTag)}
/>
We will create a test for unwrapEmojis
too, example
expect(unwrapEmojis("<b>test 😄 test</b>")).toBeEqual("<b>test </b>😄<b> test</b>")
@flodnv @anmurali @aimane-chnaif this issue is now 4 weeks old and preventing us from maintaining WAQ, can you:
Thanks!
If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!
Action Performed:
Expected Result:
Users should not be able to format emojis, lets keep them with normal style (no bold, italics, strikethrough etc)
Actual Result:
Users can make emoji italic which leads to the emojis to cut off a few parts of the emoji (for some emojis)
Workaround:
unknown
Platforms:
Which of our officially supported platforms is this issue occurring on?
Version Number: 1.2.62-1 Reproducible in staging?: y Reproducible in production?: y If this was caught during regression testing, add the test name, ID and link from TestRail: Email or phone of affected tester (no customers): Logs: https://stackoverflow.com/c/expensify/questions/4856 Notes/Photos/Videos:
Expensify/Expensify Issue URL: Issue reported by: @priya-zha Slack conversation: https://expensify.slack.com/archives/C049HHMV9SM/p1675094271396099
View all open jobs on GitHub
Upwork Automation - Do Not Edit
Issue Owner
Current Issue Owner: @mallenexpensify / @mallenexpensify