PolicyStat / ckeditor-spell-check-plugin-js-dev-edge

User interface enhancements and beta features for http://ckeditor-spellcheck.nanospell.com . This repo is for nanospell developers and trusted partners to modify and fork plugin.js
Other
0 stars 0 forks source link

Partial words being marked as typos #29

Open winhamwr opened 8 years ago

winhamwr commented 8 years ago

I did some editing today and noticed that on at least two occasions, spellcheck marked a word as a typo because it chopped my word. I'm working on the exact reproduction steps, but it was while I was just typing up new content and using my keyboard to navigate around and edit it.

Example 1

<p><a data-cke-saved-href="https://en.wikipedia.org/wiki/Technical_debt" href="https://en.wikipedia.org/wiki/Technical_debt">Technical Debt</a>&nbsp;is a metaphor used to describe areas where we've consciously&nbsp;chosen to trade off the short term versus long term by building something <span class="nanospell-typo"></span><span class="nanospell-typo"><span class="nanospell-typo">qui</span>ckly</span> instead of in the best way. Just like using a credit card to pay for <span class="nanospell-typo"></span><span class="nanospell-typo"><span class="nanospell-typo">thi</span>ngs</span> in our personal life, it can be a smart strategy as long as we keep it under control.</p>
<p>The flex engineer helps to <span class="nanospell-typo">kee</span>p this debt under control by taking projects to improve the development process, upgrade libraries, refactor code, and optimize performance.</p>

"keep" was split into "kee" and "p", but "p" wasn't marked as a typo.

winhamwr commented 8 years ago

I was able to reproduce this by just typing at my normal speed. No navigation necessary. In the paragraph I just typed, it happened with 3 different words: "squashing", "will", and "improvements"

The typo mark appeared while I was still in the middle of typing each word, so it feels like some sort of timing issue. Maybe if a spellcheck request/response goes fast enough that I didn't have enough time to add additional letters to the partial word, this happens (since adding letters would have changed the word, confusing the code that tries to mark it as a typo)? With the typo for "will", I definitely paused between typing the "wil" and the "l" because I was considering an alternative wording.

<p>Twice per month, the Flex Engineer will dedicate an entire day to <span class="nanospell-typo">squ</span>ashing usability issues. Our User Experience Design Lead, Jana, <span class="nanospell-typo">wil</span>l determine the priority and the focus will be on high-impact, low-effort <span class="nanospell-typo"><span class="nanospell-typo"></span><span class="nanospell-typo">improv</span>ements</span>.&nbsp;</p>

"improvements" started out by just marking "improv" as a typo (which is where I was at the time). I finished the word, typed some other stuff that I deleted, and then backspaced into the word to add a period. That seemed to trigger the remaining "ements" section being marked as a typo.

winhamwr commented 8 years ago

This was all using Chrome. I can try to reproduce by using another browser to editor next time, if that will help. Just let me know which browser to try.

caffodian commented 8 years ago

Hmm. I don't think we need other browsers, though I will need to look into this bug. Spans are supposed to be removed on the current word, but maybe there is an order of events that is wrong here.

caffodian commented 8 years ago

Interesting.

Seems like the span is not being removed on typing if you never leave the word. Probably because the cursor is outside of the span.

If you press enter on a block, it seems to fix this.

caffodian commented 8 years ago

Argh.

It does not "fix" it actually, it marks it correctly but you can't click on the correct thing.

caffodian commented 8 years ago

Possibly a side effect or an unfixed instance of #25 which also nested spellcheck spans.

Actually, the partial marking is the step before you end up into a similar state to what #25 fixed.

caffodian commented 8 years ago

Plan for a repro test:

part 1

part 2

not sure yet, need to figure out why the span is not being ignored. maybe make a safety check where we clear any spans within the current word range, first.

caffodian commented 8 years ago

For case 1, we don't have a good way to pause when typing has begun, but before spellcheck goes. hm.

caffodian commented 8 years ago

I suspect the clearing not working is because it is:

<span class="nanospell-typo">qui</span>^

and not

<span class="nanospell-typo">qui^</span>
caffodian commented 8 years ago

Very strange thing. Now I cannot reliably reproduce this.

caffodian commented 8 years ago

Some of this weirdness has something to do with the cache.

If I get "sho" into the cache as a typo, it becomes impossible to type "shouldn't without the first 3 getting marked as typo, super quickly.

winhamwr commented 8 years ago

The cache just makes it worse, because once you get one bugged typo, it then guarantees that you'll see it for repeated words. Fix the original problem and it seems like the cache will fix itself.

caffodian commented 8 years ago

So here is why it happens:

We changed things in #16 so that the walker just ignores spans, assumption being that if something is already a typo, we don't care.

What this does not account for, is where you add things onto the span afterwards.

Possible fixes

caffodian commented 8 years ago

The test shows this:

caffodian commented 8 years ago

Clearing the spans properly might be the correct thing to do, but it's unclear where and whether all possibly cases can be easily handled.

For example:

<p>qui^</p>

then paste

ly </p><p>some other paragraph</p>

The cursor will end up in the second paragraph and it won't be obvious that you have to go back and fix some previous paragraph tag. There doesn't seem to be a common sense way to do this from the generic change event so we'd probably need to handle all sorts of stuff I haven't thought of yet.

caffodian commented 8 years ago

Quicker fix seems to be : revert the changes to not re-check already marked spans + make it so already marked spans cannot be remarked

caffodian commented 8 years ago

Multi-span thing that causes problems - we shrink the range to only have the text, but we need to clean up the leftover span afterwards.

caffodian commented 8 years ago

Other ugly case

Potentially you could have multiple typos being joined. for ex:

<span>foo</span> ^<span>bar</span>, hit backspace.

caffodian commented 8 years ago

That one is probably a whole other mess though.

caffodian commented 8 years ago

What we should do:

remove spellcheck spans within the word range, but not do anything if the span already wraps the whole word.

Before shrinking the range to mark text, first check if the range covers an existing typo span. (this is tricky, need to confirm if the ranges include existing spans, or end inside them)

If so, do nothing, else clear any ranges within + mark the range.

caffodian commented 8 years ago

notes on Trying the "ignore the current word" approach:

caffodian commented 8 years ago

after having stepped back, it's quite annoying that you seemingly can't convert CKEditor ranges to w3c spec ranges. Because at least according to MDN, https://developer.mozilla.org/en-US/docs/Web/API/Range/compareBoundaryPoints would have been very useful

caffodian commented 8 years ago

Tried moving some of the existing typo skipping back into the wordwalker.

This then led to another degenerate case around range.optimize():

<li><span>typozzz</span></li> will optimize to contain the entire list item, which then means you end up having to find all spans within it and remove them anyway. So this kinda sucks. Back to having the word walker return everything and then fix/ignore existing typos later on.

caffodian commented 8 years ago

various bad approaches later (and a lot of weird edge cases), attempt to filter the fragment before re-inserting.

caffodian commented 8 years ago

That failed, because the fragment doesn't actually contain CKEditor nodes, but we can just as well remove spelllcheck spans from the spellcheck span. Why didn't I think of that earlier.

winhamwr commented 8 years ago

w3c spec ranges

Abandon all hope, ye who enter here. Every browser has a slightly different implementation of this spec. It really sucks.

caffodian commented 8 years ago

Abandon all hope, ye who enter here. Every browser has a slightly different implementation of this spec. It really sucks.

Yea, I spent about 10 mins reading it and noped out.