LadybirdBrowser / ladybird

Truly independent web browser
https://ladybird.org
BSD 2-Clause "Simplified" License

Inputting Cyrillic characters into text fields fails and sometimes crashes the browser #1815

Open shlyakpavel opened 2 days ago

shlyakpavel commented 2 days ago

To reproduce (on macOS; I did not test other operating systems):

  1. With latest ladybird, open http://google.com

  2. Type при. You will see that the cursor is drawn at the wrong position (it should be at the end of the line).

  3. Now press cmd+a (or similar) to select the text (all 3 characters you typed)

  4. Press Backspace or Delete. Voilà, the browser crashes!

VERIFICATION FAILED: !_temporary_result.is_error() at /Users/pavel/Develop/ladybird/Userland/Libraries/LibWeb/Page/EditEventHandler.cpp:50
0   liblagom-ak.0.0.0.dylib             0x0000000100acb344 ak_verification_failed + 216
1   liblagom-web.0.0.0.dylib            0x0000000102a0fbfc Web::EditEventHandler::handle_delete(JS::NonnullGCPtr<Web::DOM::Document>, Web::DOM::Range&) + 1152
2   liblagom-web.0.0.0.dylib            0x0000000102a135d8 Web::EventHandler::handle_keydown(Web::UIEvents::KeyCode, unsigned int, unsigned int) + 652
3   WebContent                          0x000000010035a040 WebContent::ConnectionFromClient::process_next_input_event() + 180
4   liblagom-web.0.0.0.dylib            0x0000000102a56094 AK::Function<void ()>::CallableWrapper<Web::Platform::TimerSerenity::TimerSerenity()::$_0>::call() + 88
5   liblagom-core.0.0.0.dylib           0x00000001006f1388 AK::Function<void ()>::operator()() const + 76
6   liblagom-core.0.0.0.dylib           0x00000001006f6450 Core::EventReceiver::dispatch_event(Core::Event&, Core::EventReceiver*) + 112
7   liblagom-core.0.0.0.dylib           0x000000010070606c Core::ThreadEventQueue::process() + 452
8   liblagom-core.0.0.0.dylib           0x00000001006f02ac Core::EventLoopImplementationUnix::exec() + 44
9   liblagom-core.0.0.0.dylib           0x00000001006eeafc Core::EventLoop::exec() + 72
10  WebContent                          0x0000000100356330 serenity_main(Main::Arguments) + 3808
11  WebContent                          0x00000001003e9700 main + 216
12  dyld                                0x0000000181300274 start + 2840

P.S. If you cannot type Cyrillic characters, feel free to copy при from this issue. It fails the same way.

shlyakpavel commented 2 days ago

This HTML has the same behavior as google.com:

<!DOCTYPE html>
<html>
<body>
    <form>
        <input type="text" id="inputField" name="inputField" required>
    </form>
</body>
</html>

I suppose any input field is affected.

P.S. Also, any sequence of 3 Cyrillic characters fails the same way; при is just one example for those who are not familiar with the script.

rmg-x commented 2 days ago

The underlying error message is: String::from_utf8: Input was not valid UTF-8

I can also reproduce this, and I think it has something to do with bytes vs. characters. Calling end_offset() returns 3, but the text is 6 bytes long. We then call substring_view() with the given offsets, which produces an invalid string.
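
To illustrate the suspected mismatch outside the browser, here is a minimal standalone sketch in plain C++. It does not use Ladybird's AK types, and `is_valid_utf8`, `byte_offset_for_code_point`, and `run_example` are hypothetical helpers written only for this illustration. It assumes the offset 3 is a character count that ends up being applied to the 6-byte UTF-8 encoding of при.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// "при" encoded as UTF-8: three code points, two bytes each (6 bytes total).
const std::string kSample = "\xD0\xBF\xD1\x80\xD0\xB8";

// Simplified structural check (hypothetical helper): verifies lead and
// continuation bytes only, not overlong forms or surrogate code points.
bool is_valid_utf8(std::string_view bytes)
{
    std::size_t i = 0;
    while (i < bytes.size()) {
        unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t length = 0;
        if (lead < 0x80)
            length = 1;
        else if ((lead & 0xE0) == 0xC0)
            length = 2;
        else if ((lead & 0xF0) == 0xE0)
            length = 3;
        else if ((lead & 0xF8) == 0xF0)
            length = 4;
        else
            return false; // stray continuation byte or invalid lead byte

        if (i + length > bytes.size())
            return false; // sequence is cut off before its last byte

        for (std::size_t k = 1; k < length; ++k) {
            if ((static_cast<unsigned char>(bytes[i + k]) & 0xC0) != 0x80)
                return false;
        }
        i += length;
    }
    return true;
}

// Converts an offset counted in code points into a byte offset
// (hypothetical helper; assumes `text` is already valid UTF-8).
std::size_t byte_offset_for_code_point(std::string_view text, std::size_t code_point_offset)
{
    std::size_t byte_index = 0;
    std::size_t seen = 0;
    while (byte_index < text.size() && seen < code_point_offset) {
        ++byte_index; // step over the lead byte
        while (byte_index < text.size()
            && (static_cast<unsigned char>(text[byte_index]) & 0xC0) == 0x80)
            ++byte_index; // step over continuation bytes
        ++seen;
    }
    return byte_index;
}

void run_example()
{
    std::string_view text = kSample;

    // Treating the character count 3 as a byte count splits the second letter.
    assert(!is_valid_utf8(text.substr(0, 3)));

    // Translating the offset to bytes first keeps the substring valid.
    std::size_t end = byte_offset_for_code_point(text, 3);
    assert(is_valid_utf8(text.substr(0, end)));
}
```

Calling `run_example()` exercises both cases. If the diagnosis above is right, a fix would need to either convert offsets to bytes before slicing or slice on code point boundaries; which of the two fits EditEventHandler.cpp is not something this sketch can tell.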