KeyboardKit / KeyboardKitPro

KeyboardKit Pro helps you create custom keyboards for iOS and iPadOS.
https://keyboardkit.com
Other
110 stars 8 forks source link

fullDocumentContext bug #2

Closed maxgrev closed 1 year ago

maxgrev commented 1 year ago

The fullDocumentContext() function always copies the text at least twice.

Here’s a simple example of sending the fullDocumentContext result to the pasteboard and then manually pasting it elsewhere:

Code: Button(action : { Task { let proxy = keyboardContext.textDocumentProxy let result = try? await proxy.fullDocumentContext() await MainActor.run { UIPasteboard.general.string = result?.fullDocumentContext } } }, label: { Image(systemName: "wand.and.stars").frame(width: 32, height: 32, alignment: .center) })

Video: https://user-images.githubusercontent.com/39917898/210093317-94777503-a397-4112-b112-7277a697f79f.mp4

danielsaidi commented 1 year ago

Thanks a lot @maxgrev - I will look into this asap!

danielsaidi commented 1 year ago

Hi @maxgrev

I'm having problems reproducing this issue. See below.

I have a test app in the main repo, that reads the full document context and displays is in the keyboard extension. The result is like this, where you can see the keyboard type change from uppercased to lowercased as the cursor reads the text:

https://user-images.githubusercontent.com/429927/210218996-72009726-c957-4988-975a-10fa70ca2aa9.mov

This also works with a really long text, as you can see in this screen recording:

https://user-images.githubusercontent.com/429927/210219238-1a8f9e19-68cf-4ba1-9320-5a5f1ed14834.mov

I also added the pasteboard integration, which also works:

https://user-images.githubusercontent.com/429927/210219811-6659b5ae-e77a-4b82-bde4-fdc3e35a4d2b.mov

I was first thinking, that since my test app uses a non-binary build of KeyboardKit Pro, perhaps something happens to this feature when using it from a binary build, but I've updated the test app to use the latest version of KeyboardKit Pro and it still works.

I have pushed the test project changes to master. You find it under the "testing" folder. Could you try it from your machine?

maxgrev commented 1 year ago

@danielsaidi, I've tested it on my machine, and the results are somewhat random:

  1. Your test app. The first time it doubled the amount of text; the second time, everything was as expected; the third time, it doubled it again; and the other two times, it worked as it should.

https://user-images.githubusercontent.com/39917898/210255029-159f1cd2-2cff-4396-bb14-9a7e6a1844c0.mp4

  1. Another app. Same test keyboard. I didn't count, but it looks like it 10x-ed the result. The other times I didn't record, but they were also random – sometimes it worked as it should, the other times it doubled/tripled it.

https://user-images.githubusercontent.com/39917898/210255326-54444436-59d4-41a9-aa97-a5d2d75813e5.mp4

danielsaidi commented 1 year ago

@maxgrev I've only managed to reproduce the bug twice after many, many attempts. See video below of how I stress test it:

https://user-images.githubusercontent.com/429927/210322026-0c7fb333-376c-4e98-882e-36e224191bec.mp4

The only thing that is consistent here, is that directly after typing, the read seems to drop the last character the first time it reads the context. I also found that if I double or multi-tap the button, multiple read operations are started, that all are moving the cursor. This causes the read operation to never end:

https://user-images.githubusercontent.com/429927/210322757-e7a77d38-a887-404c-8513-53c457e3709f.mov

So, I'll try to add a single read lock on the operation and see if that can fix the problem.

Our of curiosity, how often would you say that the bug happens for you? As I wrote, I have tried reproducing it for a looooong time and only seen it twice.

danielsaidi commented 1 year ago

I've added a lock so that only one read operation can be in progress at once. This fixes the bug in the last video above, but interestingly enough I still got this after two taps:

image

So the issue is something else. And strangely enough it first reads everything but the last character, then everything, then everything but the last character once more.

I will keep on investigating.

danielsaidi commented 1 year ago

@maxgrev One very strange thing that I've found, is that the original document context is invalid!

For instance, if I start the test app with the following text:

image

then type a T into the center of the text:

image

then press the read button, the T is missing from both the proxy's raw before result, as well as the full before result:

image

This makes no sense, since this is the raw result that the system providers. If I however type a T then move the text cursor back one step then forward one step, the T does end up in the result.

maxgrev commented 1 year ago

@danielsaidi, the bug frequency is very random. I just stress-tested it again, and it doubled the text the very first time, but all subsequent fifteen times were bug-free.

Another curious thing is that I've replaced my previous example code with almost exactly the same one you used in your new test app:

var body: some View {
        VStack(spacing: 0) {
            Button(action : {
                readText()
            }, label: {
                Image(systemName: "wand.and.stars")
            })
            SystemKeyboard()
        }
    }

    func readText() {
        Task {
            let result = try await context
                .textDocumentProxy
                .fullDocumentContext()
                .fullDocumentContext
            updateText(result)
        }
    }

    @MainActor
    func updateText(_ copiedText: String) {
        UIPasteboard.general.string = copiedText
    }
}

And it behaved exactly the same way as before, with the bug occurring every single time:

https://user-images.githubusercontent.com/39917898/210428279-66ef4deb-5589-471c-9fe3-fede159085fd.mp4

So I don't know what, but you must've done something with this new test project to significantly minimize this bug frequency. Now how do I reproduce it? :)

danielsaidi commented 1 year ago

@maxgrev I can confirm that adding an initial cursor movement solves the problem of the latest typed character not showing up in the proxy's documentContextBeforeInput.

But, and this is interesting, here it sometimes leads to the read operation missing the double paragraph and instead adding the first character one more time 🤔

https://user-images.githubusercontent.com/429927/210508509-8dbe910b-e109-4258-9b55-d8674e2d8717.mov

Actually, with this in place, the new bug is very reproducible. I just have to tap on the read button a bunch of times and it happens randomly:

https://user-images.githubusercontent.com/429927/210509555-00ca6471-017e-4ce4-857b-b9e3c101b091.mov

Removing the initial cursor movement fixes this bug, so I'll go without it and try to solve the original problem in another way.

danielsaidi commented 1 year ago

Ok @maxgrev I can reproduce a bunch of strange behaviors now, after removing the initial cursor movement.

First, see what happens when I place the cursor on the empty line, then insert a character:

https://user-images.githubusercontent.com/429927/210510512-4906f585-6f36-496c-9adc-43bbd0f3b07e.mov

Inserting a character then tapping the button doubles the ti:s intry. Tapping the button once again fixes it. Then, removing the character and tapping the button removes the space betweentry and`...and just like before, tapping the button once more fixes it.

However, take a look at this! If I move the text cursor to the end of the first paragraph, I can now reproduce your duplicated text bug:

https://user-images.githubusercontent.com/429927/210511134-29b8d6b8-5b99-40ca-bb4b-790fbc8c8c9a.mov

If I re-add the initial cursor movement, this problem is drastically reduced, although we sometimes still have strange behavior when the cursor is placed on an empty line, after removing a character:

https://user-images.githubusercontent.com/429927/210512104-ab4092a7-d481-4a6e-825e-8d0bb94f4b5f.mov

So, there's something strange with the text that we get from the text document proxy. I guess what happens is that it sometimes returns an old version of the text, then updates it to the correct text as the cursor moves through the text as part of the reading operation. Changing the text then causes the read operation to repeat certain parts.

We should not need the initial cursor movement. I mean, the autocomplete works great without anything like that, so I'm not sure why the proxy returns incorrect results here.

However, with the cursor movement in place, this seems to work a LOT better. Perhaps I should release this as part of 6.8 and adjust it further if you still find it to be unreliable?

danielsaidi commented 1 year ago

I have added more info to the result and adjusted the test app, so that it copies all this information to the pasteboard. This will make it easier to debug and improve this feature even more in the future.

For instance, take a look at this. When the operation succeeds:

image

The result looks like this:

---
original context before:
try and read the whole text in this text document.

---
original context after:
-
---
original context:
try and read the whole text in this text document.

---
full context before:
Use this app's keyboard to try and read the whole text in this text document.

---
full context after:

The standard Lorem Ipsum passage, used since the 1500s
---
full context: Use this app's keyboard to try and read the whole text in this text document.

The standard Lorem Ipsum passage, used since the 1500s
---

However, when the operation fails:

image

The result STILL looks like this:

---
original context before:
try and read the whole text in this text document.

---
original context after:
-
---
original context:
try and read the whole text in this text document.

---
full context before:
UUse this app's keyboard to try and read the whole text in this text document.
---
full context after:

The standard Lorem Ipsum passage, used since the 1500s
---
full context: UUse this app's keyboard to try and read the whole text in this text document.
The standard Lorem Ipsum passage, used since the 1500s
---

The two texts are identical! This led me to consider that perhaps the carefully set configuration delay is too small, which would lead to the proxy not having time to update properly when being read in an async manner.

And sure enough, I increased the delay slightly and the problem went away:

https://user-images.githubusercontent.com/429927/210516596-4999b385-f0c8-4438-a3fe-657e6692980e.mov

Hopefully, this means that the feature will work a lot better from now on 🤞

danielsaidi commented 1 year ago

@maxgrev The new version is available in 6.8 - please give it a try and let me know how it works.

maxgrev commented 1 year ago

@danielsaidi, whoa, that's a lot of testing and iterations, impressive work!

I've updated the test app, stress-tested it, and the results were much better than before. There was a dropped last character now and then, and some extra characters if the text was too long or had symbols in it, but it never 10x-ed the text and doubled it only a couple of times.

Ironically enough, this is what happened when I decided to record one of the tests for you:

https://user-images.githubusercontent.com/39917898/210600416-a9123ced-72a4-4948-acac-884e688657e8.mp4

But again, that was rather an exception.

I'm yet to reproduce the results in my main app—it still multiplies the text input 3-10 times—but it probably has something to do with my functions – there are a few more steps than just copying it to the clipboard.

P.S. I don't know if it's worth opening a new ticket, but notice how the English locale has another language (I want to say Armenian?) in place of "return."

danielsaidi commented 1 year ago

Thanks for letting me know about the English return key! I've pushed a 6.8.1 that fixes it 👍

Regarding these problems, could you try using a configuration with a longer delay just to see if it solves it?

maxgrev commented 1 year ago

@danielsaidi, Great! Updated to 6.8.1.

After hours of testing (with and without longer delays – didn't make much of a difference tbh), here are the results:

Some general observations:

Not sure what to make out of all this, though, but you might find this useful, idk.

danielsaidi commented 1 year ago

Hi again,

Thanks for testing and sorry to hear that the result isn't there just yet.

Although the test app works great on simulator as well as my physical iPhone and iPad, I dug out an old iPad and could reproduce many of the problems you describe. For instance, the old iPad would produce a bunch of errors in the text, ignore emojis and fail to move the text cursor back to the start position:

IMG_D11D727C5CFA-1

I however found that increasing the configuration's sleepSeconds parameter improved the result, at least in the test app.

If I increase the value from the original 0.008 to 0.015, the old iPad reads the text ok, including the emojis, but doesn't move the text cursor back to the original position:

IMG_0064

If I however increase the value to 0.018 the old iPad reads the text ok, including the emojis, and also moves the text cursor back to the original position:

IMG_0063

The thread sleep is needed since the proxy takes some time to access the new text after moving the cursor. It seems like the delay need to be a lot longer on older devices compared to new ones.

What happens if you increase the delay to, say, 0.02:

let result = try await context
    .textDocumentProxy
    .fullDocumentContext(config: .init(sleepSeconds: 0.02))

This is only tested in the test app, without autocomplete or KeyboardKit views. I'll try to get an old iPhone 6s up and running to test more, but would love to hear if increasing the delay works for you.

If so, perhaps it would be nice to have a new set of configuration presets, so that you can provide .fast, .reliable etc. instead of numeric delays?

maxgrev commented 1 year ago

Okay, this is crazy, but (sleepSeconds: 0.02) completely fixes everything on older iPhones; even the cursor goes back to its original place most of the time. Thank you! 🙏
I didn't even know fullDocumentContext had config parameters; I don't think it's in the docs, and I guess I wasn't paying attention to autocomplete since I copied and pasted it most of the time. I will try configuring it to increase the delay based on the device. Any other tips when working with fullDocumentContext that you can share?

danielsaidi commented 1 year ago

@maxgrev SO happy to hear this! If you can stick with a manually set 0.2, I will add additional convenience configs to make it easier to use. If you find an even better value, please let me know and I'll adjust the code.

The doc engine (DocC) currently omits extensions, which is why I'm trying to rewrite most extensions as custom protocols. But in the case of UITextDocumentProxy, it's already a protocol so it can't be extended further. I'm thinking about breaking it up into a separate service that uses the extensions under the hood. I will have it in place in KK 6.9 👍

maxgrev commented 1 year ago

@danielsaidi tested different increments and found that 0.03 works the best across all devices, at least in my case. I guess manual adjustment is the best because it probably varies on a case-by-case basis, and 0.03 might not be the best choice for an app targeting only iOS 16+, for example.

Can't wait for 6.9, then!

Not exactly the same, but still a UITextDocumentProxy-related question: have you found any workarounds for a textDocumentProxy.selectedText truncation? According to Apple's docs, it's supposed to return "currently selected text in the document." But in reality, the selectedText property doesn't match what the user has selected. For example, if the selectedText is longer than three sentences, it'd return only the first and the last sentences.

Also, should I close this issue since it was successfully resolved? (my first GitHub ticket xD)

danielsaidi commented 1 year ago

@maxgrev Thanks for letting me know! I guess it would make sense then, to have some preconfigured values, but at the same time having preconfigured like "fast", "reliable" etc. may be a bit confusing. I will give it some thought, but at least consider upping the default value to make customizations not as necessary.

Regarding the selected text truncation, I don't see a way around it since I don't think the proxy gives you enough information to resolve it in another way. But I'm not sure.

If you feel like this issue is resolved, then feel free to close it :) Any upcoming Pro issues are best tracked in the main repo, since that is where I keep the milestones, annotate with versions and labels etc.

Thank you for your patience and immense help with this issue! 🙏

maxgrev commented 1 year ago

Exactly, "fast/reliable/etc." is a bit confusing. I'd add the config parameters to the docs and let the devs figure out the values they need for their apps. But yes, increasing the initial value might be a good idea!

It's too bad about the proxy limitations, but I get the privacy concerns that are likely behind it.

Thanks a ton for your help! 🙏 I'll stay in the main repo from now on!

danielsaidi commented 1 year ago

Hey @maxgrev

I have bumped the delay to 0.03 in KeyboardKit Pro, and adjusted the configuration documentation to emphasize the need for tweaking these values on a case by case basis.

I have also added a FullDocumentContextReader to provide more documentation regarding these features, although the proxy extensions still exist.

maxgrev commented 1 year ago

Hey @danielsaidi! That's great, love the documentation adjustments! So the delay is 0.03 by default now, and I should get rid of my manual configs, correct?

danielsaidi commented 1 year ago

Yeah, it's 0.03 now so hopefully it will work well from now on 👍

I decided to go with the slowest value that worked well for you, even if it is a bit slow ;) The only thing that bugs me now is that the operation needs to do some things at the beginning and end of the proxy, that significantly increases the time the total read operation takes with this new delay. Let's ping each other if we feel like we want to try optimize this some time from now.

maxgrev commented 1 year ago

@danielsaidi, I've updated to 6.9.1, and I'm not sure if you changed something else, but the bug is back on 7+ with iOS 15.7. Works as usual on newer simulators and 11 Pro. I've tried manually changing it again, and only 0.05 worked for 7+ this time, making it painfully slow on 14 Pro. The overall time without manual adjustments seems to be the same as before, but I might be off here.

Another related question: is there any way to control where the cursor stops after the proxy stops running? For example, if the text is a bit longer and you start at the beginning, the cursor almost always ends up at least a few characters off (or sometimes even at the opposite end of the text).

danielsaidi commented 1 year ago

@maxgrev The only thing that’s changed between 6.9.0 and 6.9.1 is how it keeps track of if any simultaneous reading operation is ongoing. I didn’t touch the implementation at all. Does downgrading to 6.9.0 make it work again?

maxgrev commented 1 year ago

@danielsaidi, I actually updated from 6.8.1 (missed 6.9.0 - was busy with the launch), and downgrading to 6.8.1 makes it work again, yes.

danielsaidi commented 1 year ago

The 6.9 diff is pretty huge, but it shouldn’t have affected these parts, since it was more about preparing for 7.0. So it’s a bit strange.

I’ll check and see if I can find anything. Until then, it’s great if you can stay on 6.8.1.

danielsaidi commented 1 year ago

Hey @maxgrev! I downgraded the test app to 6.8.1 and explicitly inject 0.03 and I still get errors based on where in the document I place the cursor on an older iPad. Really strange.

However, upping the delay to 0.05 on that device does the trick. I will try with the latest version.

danielsaidi commented 1 year ago

0.05 seems to work well with the latest version. Can you check?

maxgrev commented 1 year ago

@danielsaidi, okay, so I've been testing all values from 0.01 to 0.1, and nothing seemed to work properly, but then I noticed strange behavior – it worked fine even with a 0.03 delay when the cursor's start position was at any punctuation mark, but the bug appeared with any delay value if it started at any word. So the issue is in autocomplete somehow. It works fine when I turn it off, even without any manual adjustments. Not sure what to do with this. (this is all for 7 Plus, the newer models work fine even with autocomplete and without any adjustments.)

danielsaidi commented 1 year ago

@maxgrev Great find!

As you say, older, slower devices behave strange when the keyboard view redraws, which is what happens when you add an autocomplete context as environment or observed object, then move the cursor. The cursor movement will trigger an instant autocomplete, which will update the autocomplete context with new suggestions, which in turn will trigger a redraw.

For instance, here is an old iPad running the test app without an autocomplete context:

https://user-images.githubusercontent.com/429927/212460766-a02cb22c-c51f-4aa9-a5b4-23a96c47000c.mp4

Tapping the read button multiple times still read the text fine.

Here is the same iPad after adding an @EnvironmentObject private var autocompleteContext: AutocompleteContext property to the keyboard view:

https://user-images.githubusercontent.com/429927/212460820-472b5ee2-c4c5-456b-a084-c34d748ed215.mp4

As you can see, the view flashes a lot more, since the autocomplete context constantly updates. The app then sometimes (although for me always on the second run) completely fails to read the text correctly.

However, if I run the same app with the same environment object added on an iPhone, it looks like this:

https://user-images.githubusercontent.com/429927/212460903-e2aecfbb-cd29-4d72-b1fd-c256c7be1af2.mp4

The view still flashes a lot more, but the result is always the correct.

I'm not sure why redrawing the view makes this operation fail on older devices, and it's actually even stranger to me that newer devices don't have the same problem.

What you can do for now, with the current version, is to override the view controller's performAutocomplete:

override func performAutocomplete() {
    if textDocumentProxy.isFullDocumentContextReadOperationInProgress { return }
    super.performAutocomplete()
}

I will think of a way to add this to the library. Because of the internal setup and the separation between Pro and non-pro, it's not as straightforward, so I have to give it some thought.

Also, after testing some more, my iPad requires a 0.4 sleep duration to avoid producing incorrect results. You should probably set that as a custom duration now, while I adjust the code and prepare a new version.

danielsaidi commented 1 year ago

@maxgrev I have pushed a 6.9.2 version that hopefully works better for you. It tweaks the delay to 0.04 and adds ways for the library to disable autocomplete while the full document context read operation is in progress, which means that changing cursor position will not trigger autocomplete. It works great for me on my old iPad Air.

Sorry for all this inconvenience! Please try it out and let me know.

maxgrev commented 1 year ago

@danielsaidi, it works perfectly now; thank you so much! I have no idea why I never thought about disabling autocomplete while the proxy runs; such an obvious move in hindsight 😅

I'd really appreciate any pointers for this one:

Another related question: is there any way to control where the cursor should stop after the proxy stops running? For example, if the text is a bit longer and you start at the beginning, the cursor almost always ends up at least a few characters off (or sometimes even at the opposite end of the text).

danielsaidi commented 1 year ago

@maxgrev Great!

Regarding the cursor position, it should end up where it started. At least, that's the idea and it works for me all the time on the old iPad now. Hmmmm...I'll check.

maxgrev commented 1 year ago

@danielsaidi, I guess I keep bringing up the edge cases—sorry about that—but here's an example:

https://user-images.githubusercontent.com/39917898/212552288-6d04881f-1d4f-4287-8768-f2815849528d.mp4

It ends up where it started when it's just a wall of text, but if you start at an empty line followed by symbols, it ends up off by a few characters, no matter the device.

danielsaidi commented 1 year ago

Hey, I missed the notification and didn't see this until now, by pure chance.

I've taken a look and can see strange cursor restorations on the old iPad, but not on the new iPhone 14. Do you experience this on newer devices, or only on the iPhone 7? Does increasing the sleep delay help?

maxgrev commented 1 year ago

@danielsaidi, on all devices, including the simulators, and increasing the sleepInterval doesn't make any difference – the result is always just like in the video I shared earlier.

danielsaidi commented 1 year ago

@maxgrev I couldn't at first understand why my old iPad Air behaved SO crappy compared to my testing after fixing the autocomplete stuff, but it turned out that the test project was still running 9.6.1 🙃

I can however still see some small, random bugs on the old iPad Air, where my new iPad Pro behaves well. However, it seems to always break if I start the text with a newline and place the cursor at the beginning. Removing the new line seems to make it work again.

I will take a look at this!

danielsaidi commented 1 year ago

@maxgrev I have added more stuff to the test app, and it's clear that something goes wrong when the text document starts with new lines, but also when it ends with them.

I have pushed the test app changes if you're interested, but I will look into this and try to have a fix out soon.

danielsaidi commented 1 year ago

@maxgrev There's a new 6.9.4 release that improves the full document context behavior when there are leading and/or trailing newlines in the document. Send me an e-mail if you want to sync a bit about the changes 👍

maxgrev commented 1 year ago

hey @danielsaidi, I stopped getting email notifications from GitHub for some reason :/

I was playing around with the test app, and the only thing I could do was to hardcode the cursor adjustment after the proxy runs, but that wasn't a good way of fixing this, of course.

6.9.4 works beautifully; you did it again! Thank you 🙌 I'll shoot you an email shortly.