Open EclipsedSolari opened 1 year ago
I'm willing to take a shot at this issue, but I'd like to discuss the correct way to approach it first.
The current entities in MarkdownSanitizer are represented by simple tokens or by regex. Parsing URLs with regex is an unpleasant problem, and I would probably have to roll my own regex since I'm unsure of licensing (and correctness) implications if I grab one from somewhere else on the internet.
Java does have a URL parser, but its implementation seems like a bad fit for this use case, since I'd have to construct a new URL and potentially catch an exception (MalformedURLException) on every possible text sequence.
I could explore a hybrid approach: use a regex to detect possible leading URLs in a sequence (e.g. in the text sequence https://www.google.com?q=~~test~~ some other text here, each space indicates the end of a potential URL) and feed those potential URLs to java.net.URL for validation.
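A minimal sketch of that hybrid idea, assuming candidates start with http:// or https:// and end at the first whitespace character. The class name UrlCandidateFinder and the method findUrls are hypothetical and not part of JDA; the regex is only a cheap pre-filter, and java.net.URL is fairly lenient, so this illustrates the shape of the approach rather than a complete validator.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JDA.
final class UrlCandidateFinder {
    // Cheap pre-filter (an assumption): scheme, "://", then everything up to the next whitespace.
    private static final Pattern CANDIDATE = Pattern.compile("https?://\\S+");

    /** Returns the candidate substrings of text that java.net.URL accepts. */
    @SuppressWarnings("deprecation") // the URL(String) constructor is deprecated in recent JDKs
    static List<String> findUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = CANDIDATE.matcher(text);
        while (matcher.find()) {
            String candidate = matcher.group();
            try {
                new URL(candidate); // throws MalformedURLException if the candidate cannot be parsed
                urls.add(candidate);
            } catch (MalformedURLException e) {
                // Not a URL after all: leave it to the normal Markdown handling.
            }
        }
        return urls;
    }
}
```

With this sketch, the exception is only paid for on substrings the regex already flagged, rather than on every possible text sequence. Calling findUrls on the example sequence above should yield the single entry https://www.google.com?q=~~test~~.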
I'm open to any other ideas.
I am currently giving this issue a shot. As previously suggested, this is somewhat complicated, but as long as the programme detects a URL, replaces it with a placeholder, and restores it after the sanitizing algorithm has run, it should work correctly.
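The placeholder idea could look roughly like the sketch below. Everything here is an assumption for illustration: the class UrlPreservingSanitizer is hypothetical, the URL regex is deliberately simple, the sanitizer is passed in as a plain function instead of calling JDA directly, and the placeholder uses the private-use character U+E000 on the assumption that neither the input nor the sanitizer ever produces it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper, not part of JDA.
final class UrlPreservingSanitizer {
    // Simplified URL detection (an assumption): scheme plus everything up to whitespace.
    private static final Pattern URL = Pattern.compile("https?://\\S+");
    // Placeholder delimiter: private-use code point U+E000, assumed absent from the input.
    private static final char MARK = '\uE000';
    private static final Pattern PLACEHOLDER = Pattern.compile(MARK + "(\\d+)" + MARK);

    static String sanitize(String text, UnaryOperator<String> sanitizer) {
        // 1. Swap each URL for an indexed placeholder.
        List<String> urls = new ArrayList<>();
        StringBuilder masked = new StringBuilder();
        Matcher urlMatcher = URL.matcher(text);
        int last = 0;
        while (urlMatcher.find()) {
            masked.append(text, last, urlMatcher.start());
            masked.append(MARK).append(urls.size()).append(MARK);
            urls.add(urlMatcher.group());
            last = urlMatcher.end();
        }
        masked.append(text, last, text.length());

        // 2. Run the Markdown sanitizer on the masked text.
        String sanitized = sanitizer.apply(masked.toString());

        // 3. Put the original URLs back, untouched.
        Matcher placeholderMatcher = PLACEHOLDER.matcher(sanitized);
        StringBuilder restored = new StringBuilder();
        while (placeholderMatcher.find()) {
            String url = urls.get(Integer.parseInt(placeholderMatcher.group(1)));
            placeholderMatcher.appendReplacement(restored, Matcher.quoteReplacement(url));
        }
        placeholderMatcher.appendTail(restored);
        return restored.toString();
    }
}
```

As a usage example with a stand-in escaper that only escapes underscores, sanitize("see https://www.google.com/search?q=__test__ and __bold__", s -> s.replace("_", "\\_")) leaves the URL as-is and escapes only the trailing __bold__.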
I would be able to make a pull request tomorrow.
For the final assertion in that test method, I think you may have made a mistake: the expected result should be the URL with only one layer of "~" wrapped around the word "test" in the URL, not two.
Expected Behaviour
Discord treats URLs (almost) as literals, even URLs with Markdown characters in them.
Consider the following Google search:
https://www.google.com/search?q=__test__
Although the string __test__ is Markdown for underlined text, because it's part of a URL, Discord leaves it as-is. JDA's MarkdownSanitizer does not make the distinction between URLs and normal text, so JDA escapes this URL:
https://www.google.com/search?q=\_\_test\_\_
When trying to send this URL as a Discord message, Discord replaces the backslashes with forward slashes but doesn't apply Markdown rules, including escaping, so the end result is the broken URL:
https://www.google.com/search?q=/_/_test/_/_
I expected JDA's Markdown handler to leave URLs as-is.