Closed: abLoftware closed this issue 4 years ago
Thanks for the bug report. Could you provide the two samples as standalone RTF files which I can open with Microsoft Word or Wordpad? Thanks!
Hi Jon, Thanks for taking a look at it! I’ve attached 2 RTF files, one for JIS and one for UTF8.
Note that if you save the UTF8 file after opening it in WordPad, WordPad will strip out “\fcharset128\cpg65001” and change the hex to Unicode.
From: Jon Iles notifications@github.com Sent: Friday, March 20, 2020 9:20 AM To: joniles/rtfparserkit rtfparserkit@noreply.github.com Cc: Andre Boutin ABoutin@loftware.com; Author author@noreply.github.com Subject: Re: [joniles/rtfparserkit] cpg command not superceding fcharset command (#22)
— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/joniles/rtfparserkit/issues/22#issuecomment-601696672, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AIEWFCMEYXI42O4XJDGWSGDRINUQXANCNFSM4LOTGMAA.
Hi Andre, unfortunately I can't see any attachments - could you link them directly to the GitHub issue?
Thanks!
Jon
Hi Jon, I have attached them to the issue in GitHub.
Thanks! Andre
I've had a chance to take a quick look. Before I make any changes to the code, I wanted to validate what Microsoft products made of the sample RTF files you provided.
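This is what Wordpad makes of the JIS file: [image]https://user-images.githubusercontent.com/4912864/77653652-eb680980-6f67-11ea-83ac-408fe5925cfd.png
Here's what Wordpad makes of the UTF8 file: [image]https://user-images.githubusercontent.com/4912864/77653728-0470ba80-6f68-11ea-97e7-9be15e1fb6cd.png
Here's what Word makes of the JIS file: [image]https://user-images.githubusercontent.com/4912864/77653950-48fc5600-6f68-11ea-822e-cea424349129.png
And here's what Word makes of the UTF8 file: [image]https://user-images.githubusercontent.com/4912864/77654047-6b8e6f00-6f68-11ea-82b3-591933e8bb48.png
Based on these results I'm inclined to think that the UTF8 version of the file isn't correct as it stands. If we can get to the point with the UTF8 file where it renders consistently when opened in a Microsoft product and uses the cpg command, I can make a stab at getting the parser to work with it appropriately.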
I am able to get consistent results with both Word and Wordpad, being very careful not to make any changes within either, since they will rewrite how the file is stored.
Note that for each of the images below I had to add spaces so I could move the caret out of the way and keep it out of the image, so I had to be sure not to save the file when closing it, which would have changed the RTF from how it was originally written. In fact, even though I SWEAR I did not save the UTF-8 file, it still seems to have been rewritten on me, and I ended up with something similar to your Word result with UTF-8.
Here is each file in Wordpad/Word, being super careful that the file is not modified:
Wordpad JIS
Word JIS
Wordpad UTF-8
Word UTF-8
Thanks for the reply. Unfortunately emailing responses back to this issue drops any embedded images or files. Can you add the images via the GitHub UI?
Should be updated now
Hi! That was interesting: I got different results from Wordpad in Windows 8.1 and Windows 10. With the Windows 10 version, I could see both files rendering the same. Anyway, I've applied a fix and released a new version - hopefully that'll work for you!
Awesome! Thanks for looking into it!
Andre
Japanese_UTF8.rtf.txt Japanese_JIS.rtf.txt
It looks like fcharset is not being overridden when cpg is also provided. From the RTF Specification version 1.9.1, pg 20: "If the \cpgN does appear, it supersedes the code page corresponding to the \fcharsetN."
Test cases attached: TestBug2.java.txt
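For readers following along, the precedence rule quoted from the spec can be sketched roughly as below. This is only an illustration, not the attached TestBug2 test and not rtfparserkit's actual code: the class name `CodePageSketch`, the method `effectiveCodePage`, the partial charset table, and the 1252 fallback are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical helper for illustration only; not part of rtfparserkit's API.
class CodePageSketch {
    // Partial \fcharsetN -> Windows code page table (a few well-known entries only).
    private static final Map<Integer, Integer> FCHARSET_TO_CODE_PAGE = Map.of(
        0, 1252,   // ANSI
        128, 932,  // Shift JIS
        129, 949,  // Hangul
        134, 936   // GB2312
    );

    // Returns the code page to use for a font table entry.
    // Either argument may be null when the control word is absent.
    static int effectiveCodePage(Integer fcharset, Integer cpg) {
        if (cpg != null) {
            return cpg; // \cpgN supersedes the code page implied by \fcharsetN
        }
        if (fcharset != null && FCHARSET_TO_CODE_PAGE.containsKey(fcharset)) {
            return FCHARSET_TO_CODE_PAGE.get(fcharset);
        }
        return 1252; // assumed fallback for this sketch
    }

    public static void main(String[] args) {
        // Font entry like the UTF8 sample: \fcharset128\cpg65001
        System.out.println(effectiveCodePage(128, 65001));
        // Font entry with \fcharset128 only
        System.out.println(effectiveCodePage(128, null));

        // The hex escapes \'e6\'97\'a5 are the UTF-8 bytes of U+65E5
        byte[] escaped = {(byte) 0xE6, (byte) 0x97, (byte) 0xA5};
        System.out.println(new String(escaped, StandardCharsets.UTF_8));
    }
}
```

With `\fcharset128\cpg65001`, the sketch picks 65001 (UTF-8) rather than 932, which is why the same hex escapes would generally decode to different characters if the parser kept using the fcharset code page.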