RocketChat / Rocket.Chat

The communications platform that puts data protection first.
https://rocket.chat/

Broken Formatting: Bold to Bold+Italic, loses underscores if text has specific symbols #30595

Open casalsgh opened 1 year ago

casalsgh commented 1 year ago

Issue Description: When a customer sends “ОЛ_1.2_2_00_ЭС” in bold or bold+italic, the underscores disappear.

Steps to Reproduce: Apply bold or bold+italic to text that has an underscore somewhere in the middle, as in the example in the description. More details can be found in the discussion on the Open server.

Expected Behavior: Underscores should still be present when the text is in bold formatting.

Actual Behavior: The underscores disappear.

akshayw1 commented 1 year ago

Pls assign

casalsgh commented 1 year ago

@akshayw1 sorry, do you mean assign this issue to you? If you can check it out, that would be awesome.

akshayw1 commented 1 year ago

I mean, could I work on resolving this issue?

casalsgh commented 1 year ago

> I mean, could I work on resolving this issue?

for sure, yes! please give it a try =)

Kunalvrm555 commented 1 year ago

Hello @casalsgh,

Upon examining the underscore formatting issue in RC, I offer the following observations:

In Rocket.Chat's Markdown syntax, _text_ or __text__ produces italicized text, while *text* or **text** produces bold text. Given these rules, a sequence such as ОЛ_1.2_ can be misinterpreted by the Markdown parser when it is formatted in bold or italic. Specifically, ОЛ_1.2_ is rendered as ОЛ1.2 with "1.2" in italics. The underscores, which were meant as literal characters, are treated as italic delimiters and therefore disappear from the output.

image

For comparison, I tested the same Markdown syntax on GitHub and saw different behavior: ОЛ_1.2_ wrapped in double asterisks (**) is rendered as bold ОЛ_1.2_, with the underscores preserved.
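
The difference described above is consistent with the CommonMark-style rule that an underscore sitting between two letters or digits (an "intraword" underscore) does not delimit emphasis. The following is a minimal sketch of that rule only, not the code of GitHub or of Rocket.Chat; `isWordChar` and `renderItalics` are hypothetical names, and the real CommonMark flanking rules are more detailed than this.

```typescript
// Unicode-aware "word character" test: a letter or digit in any script.
// Built with the RegExp constructor so it does not depend on the compile target.
const WORD_CHAR = new RegExp("[\\p{L}\\p{N}]", "u");

function isWordChar(ch: string | undefined): boolean {
  return ch !== undefined && WORD_CHAR.test(ch);
}

// Hypothetical toy renderer: wraps _text_ in <em> only when the opening
// underscore is not preceded by a word character and the closing underscore
// is not followed by one. Intraword underscores are kept as literal text.
function renderItalics(input: string): string {
  const chars = Array.from(input); // iterate by code point
  let out = "";
  let i = 0;
  while (i < chars.length) {
    if (chars[i] === "_" && !isWordChar(chars[i - 1])) {
      // Search for a closing underscore with at least one character in between.
      let close = -1;
      for (let j = i + 2; j < chars.length; j++) {
        if (chars[j] === "_" && !isWordChar(chars[j + 1])) {
          close = j;
          break;
        }
      }
      if (close !== -1) {
        out += "<em>" + chars.slice(i + 1, close).join("") + "</em>";
        i = close + 1;
        continue;
      }
    }
    out += chars[i];
    i++;
  }
  return out;
}
```

Under this rule, every underscore in ОЛ_1.2_2_00_ЭС is preceded by a letter or digit, so none of them can open emphasis and the string passes through unchanged, while a standalone `_hello_` is still italicized.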

hb0789 commented 1 year ago

Hey @casalsgh

I tried reproducing the issue, and found similar results. My guess is that the problem lies in the text formatting engine. Cyrillic characters are probably being interpreted as special characters, hence causing unwanted behavior.

image

As you can see, the English character formatting works just fine, it's only the Cyrillic characters that confuse the text formatter.

Furthermore, I found that ANY special characters cause a similar issue. In this case, I used Greek letters:

image

image

I'll try and work on a fix.
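
To make the guess above concrete: one way a formatter could end up treating Cyrillic or Greek letters as "special" is by using an ASCII-only test for word characters. This is only an illustration of that hypothesis, not the actual Rocket.Chat code, and the thread does not confirm it as the cause; `isIntraword` and both patterns are hypothetical.

```typescript
// ASCII-only notion of a word character (the hypothesized pitfall).
const ASCII_WORD = /[A-Za-z0-9]/;
// Unicode-aware alternative: letters and digits from any script.
const UNICODE_WORD = new RegExp("[\\p{L}\\p{N}]", "u");

// Returns true when the character at `index` (counted in code points)
// has a word character directly before and directly after it.
function isIntraword(text: string, index: number, wordChar: RegExp): boolean {
  const chars = Array.from(text);
  const before = chars[index - 1];
  const after = chars[index + 1];
  return (
    before !== undefined &&
    after !== undefined &&
    wordChar.test(before) &&
    wordChar.test(after)
  );
}

// The underscore in "ОЛ_1" is at code point index 2.
const asciiView = isIntraword("ОЛ_1", 2, ASCII_WORD);     // Л is not an ASCII word character
const unicodeView = isIntraword("ОЛ_1", 2, UNICODE_WORD); // Л is a Unicode letter
const latinView = isIntraword("AB_1", 2, ASCII_WORD);     // Latin text is unaffected
```

With the ASCII-only pattern, the underscore after Л looks like it starts a new word and so could be taken as an opening delimiter; with the Unicode-aware pattern it is recognized as intraword, the same as in the Latin example.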

akshayw1 commented 1 year ago

@casalsgh Just to update you: I have almost fixed it. Only final testing remains, and I will open a PR soon.

casalsgh commented 12 months ago

@akshayw1 looking forward to the fix; good luck! =)

mushroomgenie commented 9 months ago

@casalsgh I was able to reproduce this issue. I noticed that the message.md property is being split unevenly, so an underscore is treated as the beginning of italic text. ОЛ_1.2_2_00_ЭС is rendered as shown in the image below: the underscores around 1.2 make it italic, and the entire text is bold. The issue also occurs when there are no special symbols.

Screenshot 2024-01-05 at 7 26 23 PM

casalsgh commented 9 months ago

@mushroomgenie do you think you could give a fix PR a try?

mushroomgenie commented 9 months ago

@casalsgh I believe the fix needs to be made in the @rocket.chat/message-parser package. The message.md property is set by the BeforeSaveMarkdownParser class, which calls the parser method from the message-parser package. I am looking into the fix in the fuselage repository.
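
Since message.md holds the parsed token tree, one way to check a fix is to flatten the tree back to plain text and compare its underscores with the original message. The sketch below assumes a simplified node shape (`PLAIN_TEXT`, `BOLD`, `ITALIC`); the real types in @rocket.chat/message-parser may differ, and `sampleTree` is hand-written to mimic the rendering described in this thread, not captured parser output. `plainText` and `lostUnderscores` are hypothetical helpers.

```typescript
// Assumed, simplified node shape; the real package's types may differ.
type InlineNode =
  | { type: "PLAIN_TEXT"; value: string }
  | { type: "BOLD" | "ITALIC"; value: InlineNode[] };

// Concatenates the text of all leaf nodes, ignoring formatting.
function plainText(nodes: InlineNode[]): string {
  return nodes
    .map((n) => (n.type === "PLAIN_TEXT" ? n.value : plainText(n.value)))
    .join("");
}

function countUnderscores(s: string): number {
  return Array.from(s).filter((ch) => ch === "_").length;
}

// Number of underscores present in the original message but missing
// from the flattened tree.
function lostUnderscores(original: string, nodes: InlineNode[]): number {
  return countUnderscores(original) - countUnderscores(plainText(nodes));
}

// Hand-written illustration: whole text bold, "1.2" italic, two underscores consumed.
const sampleTree: InlineNode[] = [
  {
    type: "BOLD",
    value: [
      { type: "PLAIN_TEXT", value: "ОЛ" },
      { type: "ITALIC", value: [{ type: "PLAIN_TEXT", value: "1.2" }] },
      { type: "PLAIN_TEXT", value: "2_00_ЭС" },
    ],
  },
];

const lost = lostUnderscores("ОЛ_1.2_2_00_ЭС", sampleTree);
```

A check of this kind could serve as a regression test in the parser package: after a fix, the count of lost underscores for this input should be zero.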

hugocostadev commented 9 months ago

> @casalsgh I believe the fix needs to be made in the @rocket.chat/message-parser package. The message.md property is set by the BeforeSaveMarkdownParser class, which calls the parser method from the message-parser package. I am looking into the fix in the fuselage repository.

Indeed, the fix needs to be made in the message-parser package in the fuselage repo.