sergey-tihon / Clippit

Fresh PowerTools for OpenXml
https://sergey-tihon.github.io/Clippit/
MIT License
50 stars, 19 forks

Handle W.lastRenderedPageBreak in UnicodeMapper #58

luizfbicalho closed this issue 2 years ago

luizfbicalho commented 2 years ago

I ran this document assembly with the old OpenXmlPowerTools and it worked fine (except for the > in the other PR), but when I run the same assembly in Clippit I get an error on save:

Assert.Fail(): '', hexadecimal value 0x01, is an invalid character.

I created a test to show this error and get some insight on why this is happening.

result.docx data.txt input.docx

sergey-tihon commented 2 years ago

I believe the issue is in DA-XmlError.docx. There are some bad characters inside.

[image]

luizfbicalho commented 2 years ago

> I believe the issue is in DA-XmlError.docx. There are some bad characters inside.

I understand that something went wrong either in the document or in the XML, but the real issue is that this doesn't happen in the old OpenXmlPowerTools.

If you look at the <# #> part in Word, there is nothing between the ö and the ü, but in the debugger an extra character shows up between them.

I think this is related to page breaks or line breaks inside the XML of the content. Shouldn't it be handled at line 512, so that it is removed from the regex?
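One way to test the page-break hypothesis outside the library is to strip the `w:lastRenderedPageBreak` elements from the document XML before the text is processed and see whether the stray character disappears. The sketch below is in Python rather than C#, and the function name `strip_last_rendered_page_breaks` is hypothetical. It is not Clippit's code and makes no claim about how UnicodeMapper actually treats these elements.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace (the "w:" prefix in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
LAST_RENDERED_PAGE_BREAK = f"{{{W_NS}}}lastRenderedPageBreak"


def strip_last_rendered_page_breaks(root):
    """Remove every w:lastRenderedPageBreak element below root.

    Hypothetical helper for experimentation. Returns the number of
    elements that were removed.
    """
    removed = 0
    # Snapshot the tree first so removals do not disturb the iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == LAST_RENDERED_PAGE_BREAK:
                parent.remove(child)
                removed += 1
    return removed
```

The element is only a rendering hint written by Word, so removing it should not change the text of the run.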

luizfbicalho commented 2 years ago

I added an XML cleaning method.
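The cleaning method itself is not shown in this thread. As a minimal sketch of the general idea, assuming "cleaning" means dropping characters that XML 1.0 does not allow (such as the 0x01 from the error message), something like the following would work. It is written in Python rather than C#, and the names `is_valid_xml_char` and `clean_xml_text` are hypothetical, not part of Clippit.

```python
def is_valid_xml_char(ch):
    """Return True if ch is allowed by the XML 1.0 Char production."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def clean_xml_text(text):
    """Drop every character that cannot appear in an XML 1.0 document."""
    return "".join(ch for ch in text if is_valid_xml_char(ch))


# Example: the control character 0x01 between the two letters is removed.
cleaned = clean_xml_text("\u00f6\x01\u00fc")
```

Filtering like this hides the symptom on save. It does not explain where the control character comes from, so it is best treated as a workaround.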

sergey-tihon commented 2 years ago

@luizfbicalho please check that the latest changes here work for you.

luizfbicalho commented 2 years ago

> @luizfbicalho please check that the latest changes here work for you.

How do I get this back to my repo?

sergey-tihon commented 2 years ago

Just pull, it is already in your fork

luizfbicalho commented 2 years ago

> @luizfbicalho please check that the latest changes here work for you.

Looks like it's all working. I'll try to update my libraries later, thanks.