DistributedProofreaders / guiguts

Perl/Tk text editor designed for editing and formatting public domain material for inclusion at Project Gutenberg
GNU General Public License v2.0

Bug or need a way to use the error messages shown below #1266

Closed: charliehoward4dp closed this issue 1 year ago

charliehoward4dp commented 1 year ago

Ran "Add GWL" in Spell Query and got the following list of messages:

11:02:05: Guiguts 1.6.1-test2: If you have any problems, please include any error messages that appear here with your bug report.
16:18:10: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 21.
16:18:11: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 22.
16:18:11: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 39.
16:18:11: utf8 "\xE8" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 43.
16:18:11: utf8 "\xE8" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 60.
16:18:11: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 63.
16:18:11: utf8 "\xE6" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 124.
16:18:11: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 125.
16:18:11: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 126.
16:18:11: utf8 "\xFC" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 129.
16:18:11: utf8 "\xFC" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 144.
16:18:11: utf8 "\xEA" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 168.
16:18:11: utf8 "\xEE" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 170.
16:18:12: utf8 "\xF6" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 171.
16:18:12: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 175.
16:18:12: utf8 "\xE0" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 208.
16:18:12: utf8 "\xE8" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 214.
16:18:12: utf8 "\xE8" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 217.
16:18:12: utf8 "\xEE" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 227.
16:18:12: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 241.
16:18:12: utf8 "\xE6" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 256.
16:18:12: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 270.
16:18:12: utf8 "\xFB" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 284.
16:18:12: utf8 "\xFC" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 285.
16:18:12: utf8 "\xFC" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 286.
16:18:12: utf8 "\xEF" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 288.
16:18:13: utf8 "\xFC" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 321.
16:18:13: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 331.
16:18:13: utf8 "\xE4" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 353.
16:18:13: utf8 "\xF4" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 378.
16:18:13: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 379.
16:18:13: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 418.
16:18:13: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 423.
16:18:13: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 424.
16:18:13: utf8 "\xF3" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 425.
16:18:13: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 450.
16:18:13: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 477.
16:18:13: utf8 "\xFC" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 478.
16:18:13: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 478.
16:18:14: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 483.
16:18:14: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 483.
16:18:14: utf8 "\xE4" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 489.
16:18:14: utf8 "\xE6" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 500.
16:18:14: utf8 "\xC9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 501.
16:18:14: utf8 "\xE9" does not map to Unicode at D:/Documents/DistrProof/PP/GG 1.6.1-test2/lib/Guiguts/SpellCheck.pm line 500, <$fh> line 502.

The listed characters, e.g., \xE9, seem to be valid ones; \xE9 is an "é". I don't know what, if anything, to do with these messages. Files are available if needed.

charliehoward4dp commented 1 year ago

There are several Greek phrases in this document, but none of the characters in that list of messages is Greek. After the document went through the HTML AutoGenerator, the Nu Validator flagged some of the Greek as unnormalized, so I normalized both the .txt and .html versions. That satisfied the Validator, but rerunning Spell Query against the normalized text and adding the GWL still produced the same list of "does not map" characters. I thought the GWL itself might be unnormalized, but that doesn't seem to be the case.
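For readers unfamiliar with normalization, here is a minimal Python sketch (not Guiguts code) of what "unnormalized" usually means for a validator. The sample characters are illustrative assumptions, not taken from the book. It shows that a precomposed "é" (U+00E9) is already in NFC form, while a Greek letter with oxia is not. That would be consistent with normalization satisfying the Validator without changing the "does not map" messages, though the thread does not confirm the cause.

```python
import unicodedata

def is_nfc(text: str) -> bool:
    """True if the text is unchanged by NFC normalization."""
    return unicodedata.normalize("NFC", text) == text

# Illustrative sample characters (assumptions, not from the actual document).
precomposed_e = "\u00e9"   # é as one code point, Latin-1 Supplement block
decomposed_e = "e\u0301"   # e followed by COMBINING ACUTE ACCENT
greek_oxia = "\u1f71"      # GREEK SMALL LETTER ALPHA WITH OXIA

print(is_nfc(precomposed_e))  # already normalized
print(is_nfc(decomposed_e))   # not normalized: NFC composes it to U+00E9
print(is_nfc(greek_oxia))     # not normalized: NFC maps it to U+03AC (alpha with tonos)
```

Normalization works on code points, so it cannot change how a separate word-list file is encoded on disk.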

srjfoo commented 1 year ago

Those appear to be common precomposed accented characters in the Unicode Latin-1 Supplement block.
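As a rough illustration of how a valid character can still trigger a "does not map to Unicode" message: the single byte 0xE9 is "é" in Latin-1, but on its own it is not a complete UTF-8 sequence, so a reader that expects UTF-8 rejects it. The sketch below is plain Python, not the code in SpellCheck.pm. The sample word and the helper name `try_utf8` are hypothetical, and whether the word list is actually stored as Latin-1 is an assumption this thread does not confirm.

```python
# Hypothetical word-list entry saved as Latin-1: "café" with é as the single byte 0xE9.
raw = b"caf\xe9"

def try_utf8(data: bytes):
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None

# 0xE9 is a UTF-8 lead byte for a three-byte sequence, but nothing follows it here.
as_utf8 = try_utf8(raw)

# Decoding the same bytes as Latin-1 succeeds and yields the expected word.
as_latin1 = raw.decode("latin-1")

# Re-encoding as UTF-8 gives the two-byte form a UTF-8 reader accepts.
utf8_bytes = as_latin1.encode("utf-8")

print(as_utf8, as_latin1, utf8_bytes)
```

If that assumption holds, either reading the file with the encoding it was written in or re-saving it as UTF-8 would avoid the messages.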

charliehoward4dp commented 1 year ago

What, if anything, should I as a user do about it? I've normalized the text, and those accented characters do not use combining diacriticals.

srjfoo commented 1 year ago

I suspect we need Nigel to take a look. To be honest, I didn't think to test adding accented characters when I did the initial testing on that change, so it's possible that the problem was always there.

Can you look back at 1.6.0 to see if it works? If you don't still have it, I can check if you provide me with test files.

charliehoward4dp commented 1 year ago

It works properly in 1.6.0. (I have everything going back at least to 1.0.24, and even earlier versions as uninstalled downloads.)

Walt isn't the only pack-rat here.

srjfoo commented 1 year ago

🤣 -- I don't think I have that far back. At any rate, I think @windymilla will be able to come up with a solution. We probably just didn't test that new feature as completely as we should have. 😞

charliehoward4dp commented 1 year ago

Test file: download the zipped text from "Pragmatism and Idealism" at DP, open the main text file ("projectID5793bfb9926ab.txt") in GG, start Spell Query, and add from the GWL.

srjfoo commented 1 year ago

Thanks -- I got the same error log using the latest master.