BLKSerene / Wordless

An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
GNU General Public License v3.0

Dependency Parser: `sentence_display` is escaped to HTML #28

Closed tanloong closed 11 months ago

tanloong commented 1 year ago

Describe the bug Some characters, such as ', are displayed as HTML entities in the Sentence column instead of as the characters themselves.

To Reproduce Steps to reproduce the behavior:

  1. `echo -e "What's that? The sign & means \"and\"." > sample.txt`
  2. Go to the Dependency Parser tab
  3. Open sample.txt and click Generate table
  4. See error

Expected behavior Sentences are shown as they actually are.

Screenshots [screenshot: 2023-08-29_11-52]

Environment information

Additional context I dug a bit and found that with this line commented out, the problem disappears.

[screenshot: 2023-08-29_11-51]

(The Concordancer and Concordancer Parallel tabs are missing from the screenshots because I disabled them after failing to install some of their dependencies. The screenshots are still from Wordless version 3.3.0, run from source code rather than the compiled release.)

BLKSerene commented 1 year ago

Thanks for reporting the issue.

Text in the Left, Node, and Right columns of Concordancer and in the Parallel Unit column of Parallel Concordancer needs to be escaped as HTML entities, since it is inserted into HTML and rendered as rich text (with a different highlighting color for different words). Otherwise, special characters in the raw text would be treated as part of the HTML syntax.

In Dependency Parser, however, there is no need to escape the text, since it is rendered as raw text. I'll fix it in the next release (3.4.0).
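To illustrate the distinction, here is a minimal sketch using only Python's standard `html` module. The helper names `to_rich_text` and `to_plain_text` are hypothetical and are not taken from the Wordless code base; they only show why escaping is needed when text is embedded in HTML markup and why it corrupts the display when the cell is shown as raw text.

```python
import html

# The sentence from the reproduction steps above.
sample = 'What\'s that? The sign & means "and".'


def to_rich_text(text, color="#F00"):
    # Hypothetical helper: text embedded in HTML markup must be escaped
    # first, otherwise '&', '<', '>' and quotes are parsed as HTML syntax.
    return f'<span style="color: {color};">{html.escape(text)}</span>'


def to_plain_text(text):
    # Hypothetical helper: a cell rendered as raw text needs no escaping,
    # so the text is passed through unchanged.
    return text


escaped = html.escape(sample)
print(escaped)                 # What&#x27;s that? The sign &amp; means &quot;and&quot;.
print(to_rich_text(sample))    # escaped text wrapped in a <span>
print(to_plain_text(sample))   # What's that? The sign & means "and".
```

If the escaped string is shown in a plain-text cell, the entities appear literally, which matches the behavior reported in this issue.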

BLKSerene commented 11 months ago

Fixed in 3.4.0.