readsoftware / ReadIssues

This is an issue repository for READ, intended for issues and feature change requests that arise during testing and development.

Edition VE - Switch Token - Split Syllable #109

Open IanMcCrabb opened 7 years ago

IanMcCrabb commented 7 years ago

Where a Token is constituted by a split syllable, switching that Token for an alternative causes data corruption.

Example: cvc ccvccv, attempting to switch the ccvccv token for an alternative.

stevewh commented 7 years ago

This is a difficult one. With cvc ccvccv, attempting to switch the ccvccv token for an alternative would also include cccvccv as a switch, since we identify switches based on the segment pattern.

Solution is very difficult and needs more discussion.

The solution of trying to match previous token also might work, but has a display issue.

stevewh commented 7 years ago

The solution decided so far is to select switch candidates by matching the entity at a given level, while the replacement applies to the entire token pair. We still need to discuss compound components with a split syllable.

IanMcCrabb commented 7 years ago

Current manual workaround for switch is as follows:

stevewh commented 7 years ago

Cannot do anything on this until we specify it. Ian needs to set up a spec session.

IanMcCrabb commented 7 years ago

We do need to set up a time to discuss this. The manual workaround was not successful.

The issue, as I understand it, is that where a syllable is split across tokens and we attempt to switch only one of those tokens, there is a potential mismatch between the sequence of graphemes included in the tokens and the grapheme constitution of the parallel set of syllables based on the same segment pattern.

An initial approach might be to not permit a switch in these circumstances (and come up with a manual workaround), then consider an enhancement at a later stage.

stevewh commented 6 years ago

Solution is to remove scl from switch table calculations and to add S at the location of the split (as a prefix or suffix, depending on whether the split is at the beginning of the token or at the end).

IanMcCrabb commented 6 years ago

Will need clarification. I get that you are proposing a solution but really can't understand what it is.

stevewh commented 6 years ago

Solution is to remove scl from switch table calculations and to add S at the location of the split (as a prefix or suffix, depending on whether the split is at the beginning of the token or at the end).

IanMcCrabb commented 6 years ago

If I understand what you are proposing, then this entails a change to the selection model: when the object level is set to Word and a syllable is split across tokens, contract the selection to the set of syllables before/after the split.

cvcvcvcc vcvcv cvcvcvcc vcvcv cvcvcvcc vcvcv

The issue with this approach is that the user can no longer view the properties of the token cvcvcvcc or operate on its annotations. I am also concerned about the risk associated with changing the selection model at this stage. The alternative of maintaining the selection model but only offering switch alternatives that map to the syllables to the left/right of the split syllable seems counterintuitive.

Another concern is that, upon selection of the token to switch to, we would need to put a token break at the syllable that is broken across tokens. We would then end up with a partial-syllable token requiring an additional gesture to merge it with the following token.

cvcvcvcc vcvcv ccvccvccv cc vcvcv

It might require less coding and carry less risk to leave the selection model unchanged and instead, upon display of properties, test whether the selection encompasses a syllable split across tokens and, if so, not present the user with any switch options. This would entail the user first modifying the tokenization and then using the existing switch method.
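The check described above could be sketched roughly as follows. This is only an illustration, not READ code: the representation of tokens and syllables as lists of grapheme ids, the function names `split_syllables` and `switch_options`, and the syllabification of the cvc ccvccv example are all assumptions.

```python
from typing import List, Sequence


def split_syllables(token: Sequence[int],
                    syllables: Sequence[Sequence[int]]) -> List[int]:
    """Return indexes of syllables only partly contained in the token.

    token     -- grapheme ids that make up the selected token (hypothetical model)
    syllables -- grapheme ids of each syllable in the line (hypothetical model)
    """
    token_ids = set(token)
    partial = []
    for index, syllable in enumerate(syllables):
        syllable_ids = set(syllable)
        overlaps = bool(syllable_ids & token_ids)
        fully_inside = syllable_ids <= token_ids
        if overlaps and not fully_inside:
            partial.append(index)
    return partial


def switch_options(token: Sequence[int],
                   syllables: Sequence[Sequence[int]],
                   alternatives: Sequence[str]) -> List[str]:
    """Offer no switch options when the token shares a syllable with a neighbour."""
    if split_syllables(token, syllables):
        return []
    return list(alternatives)


# "cvc ccvccv" with graphemes numbered 1..9; the syllabification
# cv | c+ccv | ccv is assumed for illustration only.
syllables = [[1, 2], [3, 4, 5, 6], [7, 8, 9]]
second_token = [4, 5, 6, 7, 8, 9]
options = switch_options(second_token, syllables, ["alt1", "alt2"])
```

With this data the second token shares the middle syllable with the first token, so `options` is empty and the user would have to retokenize before switching.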

IanMcCrabb commented 6 years ago

As discussed, test db is the one already provided on Workbench. All access details are in email 13/9.

If you look at the first 2 tokens on both Lines 13 and 15, you'll see different readings between Bergaigne and Barth. There is a different reading again on 13 for Goodall and a different reading on 15 for Soutif.

IanMcCrabb commented 6 years ago

Steve, I am concerned about the solution you outlined yesterday. Consider the following scenario: if we attempt to switch gavat for gavot we should be OK. If, however, we attempt to switch gavat for gavet, we end up with the t grapheme from the ta syllable and the o grapheme from the to syllable.

E1 gavat oha E2 gavot oha E3 gavet aha
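The mismatch in this scenario can be made concrete with a small sketch. The model here (each grapheme tagged with the syllable it belongs to) and the names `graphemes` and `split_is_consistent` are assumptions for illustration, not READ's data structures.

```python
# Hypothetical model: each grapheme is a (character, syllable) pair.
def graphemes(*parts):
    """Build a token from (text, syllable) parts, one grapheme per character."""
    return [(ch, syllable) for text, syllable in parts for ch in text]


gavat = graphemes(("ga", "ga"), ("va", "va"), ("t", "to"))  # E1, first token
gavot = graphemes(("ga", "ga"), ("vo", "vo"), ("t", "to"))  # E2, first token
gavet = graphemes(("ga", "ga"), ("ve", "ve"), ("t", "ta"))  # E3, first token
oha = graphemes(("o", "to"), ("ha", "ha"))                  # E1/E2, second token


def split_is_consistent(left, right):
    """True when the graphemes on either side of the token break share a syllable."""
    return left[-1][1] == right[0][1]


ok_switch = split_is_consistent(gavot, oha)   # t and o both belong to "to"
bad_switch = split_is_consistent(gavet, oha)  # t belongs to "ta", o to "to"
```

Switching in gavot keeps the split syllable intact, while switching in gavet leaves a t and an o that belong to different syllables.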

IanMcCrabb commented 6 years ago

The solution we have arrived at is to use the split syllable ID in switch hashing to preclude from the switch options any token that contains a different split syllable. In the example below, switching from gavat would only offer gavot as an option, not either of the gavet options.

E1 gavat oha E2 gavot oha E3 gavet aha E4 gavet raha
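A minimal sketch of this filtering, assuming each token records the id of the syllable that is split at its end. The `Token` class, the `split_scl_id` field, and the id values 101 to 103 are hypothetical; only the rule (same split syllable id, otherwise excluded) comes from the discussion.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Token:
    edition: str
    text: str
    split_scl_id: Optional[int]  # hypothetical id of the syllable split at the token end


def switch_candidates(selected: Token, others: List[Token]) -> List[Token]:
    """Keep only tokens whose split syllable has the same id as the selected token's."""
    return [t for t in others if t.split_scl_id == selected.split_scl_id]


# Assumed ids: E1 and E2 share the syllable "to" (101); E3 uses "ta" (102);
# E4 uses "tra" (103).
e1 = Token("E1", "gavat", 101)
e2 = Token("E2", "gavot", 101)
e3 = Token("E3", "gavet", 102)
e4 = Token("E4", "gavet", 103)

offered = [t.text for t in switch_candidates(e1, [e2, e3, e4])]
```

Under these assumed ids, only gavot is offered when switching from gavat.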

stevewh commented 6 years ago

The solution of using sclID + S + character position for split syllables ensures that only tokens that use the same syllable will match for switching. Fixed in Build 2.

stevewh commented 5 years ago

This works in the current V1RC and needs to be tested.

Note that matching only occurs where the split syllable is the same (i.e. the same sclID). A matching replacement token must have the same start and stop encoding (normally just the start and stop segment ids), and the split syllable is encoded using the sclID and the split offset, thus requiring the same syllable split at the same location.
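One way to picture the encoding is a key built from the start and stop boundaries, where a boundary inside a split syllable is written as sclID + S + offset. The sketch below is an assumption about the shape of such a key; the string formats, function names, and id values are illustrative and not taken from the READ source.

```python
from collections import defaultdict


def plain_boundary(seg_id: int) -> str:
    """Boundary that falls on a segment edge: encode the segment id."""
    return f"seg{seg_id}"


def split_boundary(scl_id: int, offset: int) -> str:
    """Boundary inside a syllable: encode the sclID, 'S', and the split offset."""
    return f"scl{scl_id}S{offset}"


def switch_key(start: str, stop: str) -> str:
    """Tokens are interchangeable only when both boundary codes are equal."""
    return f"{start}:{stop}"


def group_by_key(tokens):
    """Group (name, key) pairs so each group holds mutually switchable tokens."""
    groups = defaultdict(list)
    for name, key in tokens:
        groups[key].append(name)
    return dict(groups)


# Assumed ids: all tokens start at segment 10; the final t is the first
# character of syllable 101 ("to") in E1/E2 and of syllable 102 ("ta") in E3.
key_to = switch_key(plain_boundary(10), split_boundary(101, 1))
key_ta = switch_key(plain_boundary(10), split_boundary(102, 1))
groups = group_by_key([("E1 gavat", key_to),
                       ("E2 gavot", key_to),
                       ("E3 gavet", key_ta)])
```

Because the offset is part of the key, two tokens that end in the same syllable but split it at different positions would also fall into different groups.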

stevewh commented 5 years ago

Switch works for tokens that begin or end in the same split syllable (same by id, not just spelling).

Replacements where the split syllable id is different should be ignored (needs more testing).

Current code has a refresh issue when you adjust the first syllable of a word with a split syllable at the end.

stevewh commented 5 years ago

Fixed refresh.