Open johnwdubois opened 5 years ago
Words now hide themselves properly; however, there is still the issue of alignment. This will be a fairly complex alignment issue to tackle because of the number of annoying edge cases that exist (e.g. adding punctuation to a Rez-chain and then pressing W, or building a chain where each word is in front of punctuation and then pressing W). Rezonator currently aligns chains and analyzes stretches only at specific moments, to reduce Race-to-Infinities, so we may need to include W presses as a moment to refresh alignment. While these alignment bugs persist with ghost words, there are no known crashes involved.
With the current SB Corpus data, the best way to address this is to check for tokens whose "Kind" is NOT equal to "Word" or "EndNote" (see #558).
Terry's detailed comment above shows that ghost words create issues for alignment, which still need to be addressed.
The problem
When the user chooses to display only "word" rather than "text" (with "w"), there are some tokens that simply disappear (because they contain no alphabetic characters). However, a little white area remains behind, and they continue to occupy a cell in the visible display, displacing the true words.
These can be considered "ghost words": tokens in the Word table that are not real words (e.g. they are non-alphabetic) but that still occupy space on the screen when they shouldn't.
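The difference between hiding a ghost word and removing its cell can be sketched as follows. This is only an illustration in Python, not Rezonator's actual code: the function names `is_ghost` and `layout_cells`, the sample tokens, and the "no alphabetic characters" test are all assumptions made for the example.

```python
def is_ghost(token: str) -> bool:
    """Assumed heuristic: a token with no alphabetic character is a ghost word."""
    return not any(ch.isalpha() for ch in token)


def layout_cells(tokens: list[str], word_mode: bool) -> list[str]:
    """Return the tokens that should each receive one display cell.

    In text mode every token gets a cell. In word mode, ghost words are
    dropped entirely, so they leave no blank cell behind.
    """
    if not word_mode:
        return list(tokens)
    return [t for t in tokens if not is_ghost(t)]


# Hypothetical transcript line: a pause, two words, a comma, another word.
line = ["..", "so", "yeah", ",", "right"]
print(layout_cells(line, word_mode=False))  # every token keeps its cell
print(layout_cells(line, word_mode=True))   # only true words keep a cell
```

The point of the sketch is that ghost words are excluded before cells are assigned, rather than being drawn as empty cells that displace the true words.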
To reproduce
Screenshot
In this screenshot, the ghost words (invisible, but they occupy a cell) are highlighted in green (SBC002, lines 1-20):
What is needed
How to implement
- [...] `isWord` to the `Word` grid (or `vizWord` grid). This would allow Rezonator to know whether a Token is a true Word (`isWord` = 1, the default) or not (`isWord` = 0).
- [...] `isWord`. The "words" (tokens) that have `isWord` = 0 will be things like pauses etc.
- [...] `isWord`, described below, that uses the Kind value from the new SBC import, but this will have to wait.)
- [...] `isWord` = 0) as if they were Dead. That is: [...]
Future development: How to identify Ghost words using the "Kind" value
- [...] `Kind` field (column). Real words (as opposed to pauses, breathing, etc.) should have `Kind` = "word".
- [...] `Kind` = "word" is relevant because not all items in the Word table are true words.
- [...] `Kind` field in the Word table should come from importing data in the new corpus format.
- [...] `Kind` field in the Word table, as a temporary measure, do the following:
  - [...] `Kind` field (in the Word table)
  - [...] `Kind` = "word"
  - [...] `Kind` = "other"
- [...] `Kind` is NOT equal to "word". This is why it is better to take the values for `Kind` from the `Kind` value already specified in the imported corpus data.
Alternatives you have considered
Additional context
See also #185.
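For discussion, here is a rough Python sketch of the temporary measure mentioned under "Future development": filling in a `Kind` value when the imported data does not supply one, then deriving `isWord` from it. The `Token` class, the function names, and the rule "any alphabetic character means word" are assumptions for illustration; the issue does not spell out the exact criterion, and Rezonator's own grids are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    kind: str = ""      # stands in for the proposed Kind field
    is_word: int = 1    # stands in for the proposed isWord flag (1 is the default)


def assign_kind_temporary(token: Token) -> None:
    """Assumed stop-gap rule: any alphabetic character means Kind = "word"."""
    has_alpha = any(ch.isalpha() for ch in token.text)
    token.kind = "word" if has_alpha else "other"


def update_is_word(token: Token) -> None:
    """Derive isWord from Kind: 1 for real words, 0 for everything else."""
    token.is_word = 1 if token.kind == "word" else 0


# Hypothetical tokens: a pause, a word, a breathing marker, a comma.
tokens = [Token(".."), Token("so"), Token("(H)"), Token(",")]
for tok in tokens:
    assign_kind_temporary(tok)
    update_is_word(tok)

for tok in tokens:
    print(tok.text, tok.kind, tok.is_word)
```

Note that `(H)`, used here as an illustrative breathing marker, is labeled "word" by this rule because it contains a letter. That is the sort of misclassification that makes it preferable to take `Kind` from the imported corpus data when it is available.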