dotnet / wpf

WPF is a .NET Core UI framework for building Windows desktop applications.
MIT License
7.02k stars 1.16k forks source link

Loading, saving and loading RTF leads to exception #7601

Open batzen opened 1 year ago

batzen commented 1 year ago

Description

Loading some RTF in a RichTextBox, saving it and loading it again causes an exception to be thrown.

This happens because the RTF contains a hyperlink which contains an umlaut ("ö" in this example). That umlaut is correctly escaped in the source RTF with "\'f6". Loading the source RTF works fine. Saving the RTF with RichTextBox causes the correctly escaped text to become 'f6. Loading this again causes an exception while trying to construct the NavigationUri, as an unescaped "'" is not valid in an URI.

After some investigation the issue seems to be that: XamlToRtfWriter.WriteInlineChild calls BamlResourceContentUtil.UnescapeString which then simply removes the "\" before the "'", thus producing an invalid URI.

One way to solve this would be to add

else if (documentNode.NavigateUri[i] == "'")
{
    _rtfBuilder.Append("\\'");
}

Reproduction Steps

See attached repro project. UmlautIssue.zip

Expected behavior

No exception should occur and RTF should be rendered.

Actual behavior

Exception is thrown during load (after save from RichTextBox).

Regression?

No response

Known Workarounds

No response

Impact

No response

Configuration

Tested with .NET 4.8 and .NET 6.0.

Other information

No response

elyoh commented 1 year ago

We hit this very issue a number of years ago and agree that the BamlResourceContentUtil.UnescapeString method is the cause.

My understanding is that the escaped value of ö should be \'f6 in RTF. The identified method strips the leading \ returning 'f6, triggering the exception when subsequently loading the document.

The same method also breaks Unicode escapes e.g. the Unicode escape for ö is \u246 in RTF and this method strips leading \ returning u246. Whilst this doesn't cause the exception, it changes the Uri.

Our work around was to ensure that the NavigateUri of all Hyperlink elements in the FlowDocument are replaced with a new instance of the Uri whose OriginalString property is 'percent encoded' before saving to RTF. This way, the offending method cannot mutate the value of the. Not ideal, but no other alternative is available unless this is fixed at source.

The underlying issue is that at the point BamlResourceContentUtil.UnescapeString is called, the value of NavigateUri has already been escaped for RTF. The method then tries to remove XAML escape sequences without accounting for that. The correct procedure would be to remove any XAML escape sequences, before escaping for RTF.

elyoh commented 1 year ago

I was initially confused by this line in original issue text:

then simply removes the "" before the "'"

This should read:

then simply removes the "\" before the "'"

Apparently "\" displays as "" and needs escaping.

batzen commented 1 year ago

Thx for pointing out my formatting/encoding mistake. Encoding is hard 😉