SimulaVR / Simula

Linux VR Desktop
MIT License

Certain Java apps failing under Simula? #125

Open georgewsinger opened 3 years ago

georgewsinger commented 3 years ago

We have received some reports in our Discord that certain Java apps are failing to launch in Simula:

I have only one important message: do not forget about Java apps. I am a stock market trader, and trading platforms are the most painful experience under Simula. That undermines the whole idea behind VR for me, as I am a clear use case for it (I need many charts and tables around me). So if anyone is interested in the success of the project among business users, do look at how non-graphics apps run under SimulaVR. By non-graphics I mean not Gimp or 3DMax, but chart-plotting software and trading platforms where users really do benefit from unlimited 3D space. I would really like to throw out all 5 of my screens and just use SimulaVR, but for now I can't. So for now I just use it for day-to-day activities, but I am still trading under plain old XFCE.

With that said, eclipse (a Java GUI app) seems to launch OK in Simula:

image

If anyone is aware of which Java apps are failing, let me know and I can start the bug replication process.

hellkaim commented 3 years ago

Hi. And thank you for paying attention to that.

I have two applications:

  1. Think Or Swim by TD Ameritrade
  2. Trader Workstation by Interactive Brokers (IB)

Both apps work as follows: they start a pre-launcher, i.e. code that initializes the updater and then the login and crypto engines. The user has to authenticate with their credentials (in the case of IB, this requires SMS or a secure device for additional tokens). So we have at least 3 windows at this point. After the user has successfully authenticated, a new process starts loading the configs and all the math/visualisations the user has. The main window or a set of windows (depending on how the user configured the app) is then displayed.

Each window has additional windows inside (treat them as browser tabs) that can be detached from the main window.

Both apps crash during startup, even before the login screen. I once got past the login screen, but that was it.

Technical data:

5.4.0-70-generic #78-Ubuntu SMP
openjdk version "1.8.0_252"
OpenJDK Runtime Environment (Zulu 8.46.0.19-CA-linux64) (build 1.8.0_252-b14)
OpenJDK 64-Bit Server VM (Zulu 8.46.0.19-CA-linux64) (build 25.252-b14, mixed mode)
XFCE 4.14 (1.43.2-1 amd64)

P.S. If you can tell me where to find the SimulaVR version, I will add it here. For TOS, credentials are required. I can ask their support to issue demo credentials if needed. TWS has a demo login that requires a brief registration, so it can be started without a support request.

georgewsinger commented 3 years ago

@hellkaim We're going to get this working.

Since Trader Workstation doesn't require demo credentials, I'll work with it for now. So I have installed ib-tws from here via nix:

cd ~/Downloads

# Download some Trader Workstation assets for nix
curl https://download2.interactivebrokers.com/download/unixmacosx_latest.jar --output ibtws_9542.jar
nix-prefetch-url file://$PWD/ibtws_9542.jar
curl https://enos.itcollege.ee/~jpoial/allalaadimised/jdk8/jdk-8u281-linux-x64.tar.gz --output jdk-8u281-linux-x64.tar.gz
nix-store --add-fixed sha256 jdk-8u281-linux-x64.tar.gz

# Install ib-tws
nix-env -iA nixpkgs.ib-tws

When I launch it outside of Simula via

$ ib-tws

I get

ERROR: "" is not a valid name of a profile.

Inspecting the source code for the nix expression, it looks like I can adjust the environment variables IB_USER_PROFILE, IB_USER_PROFILE_TITLE, and IB_PROFILE_DIR to make different things happen. Do you know what values I might need to set these to get this program to launch outside of Simula? Once I get ib-tws working outside of Simula, I can begin the debugging process inside of Simula. :mag:

hellkaim commented 3 years ago

I have the following launch command for TWS: "/home/%USER/Jts/tws" -J-DjtsConfigDir="/home/%USER/Jts" %U

I do not have any IB-related profile settings; it just runs. But I use the tar.gz file from them, not the Nix variant (again, I am running Ubuntu 20.04).

Moreover, grepping for them in my home dir shows no results. I think that both should point to the directory where TWS is installed, i.e. /home/%USER/Jts in my case. From what I can see in the mentioned nix expression (see line 60), it seems that TWS needs a local user dir (treat it as a .config dir where Linux apps store their configurations), which should contain some ini files and other user-specific stuff. So yes, creating a directory under your user home folder and setting IB_PROFILE_DIR to it should solve the issue.

georgewsinger commented 3 years ago

@hellkaim

By launching the nix version of tws with

ib-tws ~/nixpkgs/pkgs/applications/ib

I was able to see the launcher spawn in Simula:

image

Is this the expected behavior? Have you ever seen this window spawn in Simula before with the Ubuntu 20.04 version?

hellkaim commented 3 years ago

This seems to be right. My app seems to be different from yours (see below). Again, I use the tar.gz variant, untarred locally to the /home/%USER/opt folder.

Startup screen Login screen Advanced options

So what you see is what it should look like. Last time I checked there was a Java error (some exception I can't recall now). The issue is that I do not have access to SimulaVR right now and can't test it.

georgewsinger commented 3 years ago

Steps to replicate. @hellkaim I was able to replicate your crash as follows:

  1. Register a username/pw on IB's website.

  2. Download ./tws-applicant-linux-x64.sh from IB's website (provided only after you register for a free trial) and run via

    chmod u+x tws-applicant-linux-x64.sh
    ./tws-applicant-linux-x64.sh

  3. Launch the tws application from within Simula via:

    "/home/$USER/Jts/tws" -J-DjtsConfigDir="/home/$USER/Jts" $USER

  4. Receive the following crash message:

/nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib/libc.so.6(+0x380f0) [0x7f6b5f3850f0] (??:0)
./submodules/godot/bin/godot.x11.tools.64(xwm_atoms_contains+0x4d) [0x16e8d26] (/home/george/SimulaMainFresh3/submodules/godot/modules/gdwlroots/wlr_xwayland.cpp:37)
WlrXWaylandSurface::handle_map(wl_listener*, void*) (/home/george/SimulaMainFresh3/submodules/godot/modules/gdwlroots/wlr_xwayland.cpp:192)
/nix/store/h2viqkgf1nmb7l916gbpra67xzv0ra2k-wlroots/lib/libwlroots.so.0(+0x6c47c) [0x7f6b5f97647c] (??:0)
/nix/store/h2viqkgf1nmb7l916gbpra67xzv0ra2k-wlroots/lib/libwlroots.so.0(+0x2ba84) [0x7f6b5f935a84] (??:0)
/nix/store/h2viqkgf1nmb7l916gbpra67xzv0ra2k-wlroots/lib/libwlroots.so.0(+0x6688a) [0x7f6b5f97088a] (??:0)
/nix/store/h2viqkgf1nmb7l916gbpra67xzv0ra2k-wlroots/lib/libwlroots.so.0(+0x66b88) [0x7f6b5f970b88] (??:0)
/nix/store/s0mblhs5vmjza9dmipn74rwqflxy1fw7-libffi-3.3/lib/libffi.so.7(+0x7abd) [0x7f6b5eebcabd] (??:0)
/nix/store/s0mblhs5vmjza9dmipn74rwqflxy1fw7-libffi-3.3/lib/libffi.so.7(+0x679c) [0x7f6b5eebb79c] (??:0)
/nix/store/102qzi66ynry8cqzwg6y9rpjh9kfl9ip-wayland-1.18.0/lib/libwayland-server.so.0(+0xd370) [0x7f6b5f8fe370] (??:0)
/nix/store/102qzi66ynry8cqzwg6y9rpjh9kfl9ip-wayland-1.18.0/lib/libwayland-server.so.0(+0x97f2) [0x7f6b5f8fa7f2] (??:0)
/nix/store/102qzi66ynry8cqzwg6y9rpjh9kfl9ip-wayland-1.18.0/lib/libwayland-server.so.0(wl_event_loop_dispatch+0xc2) [0x7f6b5f8fc402] (??:0)
WaylandDisplay::_notification(int) (/home/george/SimulaMainFresh3/submodules/godot/modules/gdwlroots/wayland_display.cpp:48)
WaylandDisplay::_notificationv(int, bool) (/home/george/SimulaMainFresh3/submodules/godot/modules/gdwlroots/wayland_display.h:7 (discriminator 14))
Object::notification(int, bool) (/home/george/SimulaMainFresh3/submodules/godot/core/object.cpp:933)
SceneTree::_notify_group_pause(StringName const&, int) (/home/george/SimulaMainFresh3/submodules/godot/scene/main/scene_tree.cpp:987)
SceneTree::idle(float) (/home/george/SimulaMainFresh3/submodules/godot/scene/main/scene_tree.cpp:527 (discriminator 3))
Main::iteration() (/home/george/SimulaMainFresh3/submodules/godot/main/main.cpp:2028)
OS_X11::run() (/home/george/SimulaMainFresh3/submodules/godot/platform/x11/os_x11.cpp:3264)
./submodules/godot/bin/godot.x11.tools.64(main+0xfa) [0x14e7c1c] (/home/george/SimulaMainFresh3/submodules/godot/platform/x11/godot_x11.cpp:57)
/nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib/libc.so.6(__libc_start_main+0xed) [0x7f6b5f370c7d] (??:0)
./submodules/godot/bin/godot.x11.tools.64(_start+0x2a) [0x14e7a7a] (/build/glibc-2.31/csu/../sysdeps/x86_64/start.S:122)
-- END OF BACKTRACE --

@KaneTW I made an rr trace of the bug but am having trouble uploading it to Pernosco. Here are the relevant crash points from wlr_xwayland.cpp:

bool xwm_atoms_contains(struct wlr_xwm *xwm, xcb_atom_t *atoms,
                                                size_t num_atoms, enum atom_name needle) {
    xcb_atom_t atom = xwm->atoms[needle];

    for (size_t i = 0; i < num_atoms; ++i) {
        if (atom == atoms[i]) { //<- Here
            return true;
        }
    }

    return false;
}

and

void WlrXWaylandSurface::handle_map(struct wl_listener *listener, void *data) {
    //std::cout << "WlrXWaylandSurface::handle_map(..)" << std::endl;
    WlrXWaylandSurface *xwayland_surface = wl_container_of(
        listener, xwayland_surface, map);

    bool is_splash_surface = xwm_atoms_contains(xwayland_surface->wlr_xwayland_surface->xwm,
                                                xwayland_surface->wlr_xwayland_surface->window_type,
                                                1,
                                                NET_WM_WINDOW_TYPE_SPLASH); //<-- Here

    bool is_normal_surface = xwm_atoms_contains(xwayland_surface->wlr_xwayland_surface->xwm,
                                                xwayland_surface->wlr_xwayland_surface->window_type,
                                                1,
                                                NET_WM_WINDOW_TYPE_NORMAL);

    if ( is_splash_surface ) {
        xwayland_surface->emit_signal("map", xwayland_surface);
    } else if( xwayland_surface->wlr_xwayland_surface->parent == NULL && (! is_normal_surface) ) {
        xwayland_surface->emit_signal("map_free_child", xwayland_surface);
    } else if( xwayland_surface->wlr_xwayland_surface->parent == NULL && is_normal_surface ) {
        xwayland_surface->emit_signal("map", xwayland_surface);
    } else if( xwayland_surface->wlr_xwayland_surface->parent != NULL && !is_normal_surface ) {
        xwayland_surface->emit_signal("map_child", xwayland_surface);
    } else {
        xwayland_surface->emit_signal("map_free_child", xwayland_surface);
    }
}

...so it looks like it's some sort of splash surface routing bug. Will report back when I have more info on this.
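One possible reading of the backtrace, not confirmed anywhere in this thread, is that handle_map passes a hard-coded length of 1 while the surface's window_type array is missing, so atoms[0] is read through a null pointer. The sketch below shows a guarded variant of the same loop under that assumption. The names fake_xwm and atoms_contains_guarded are hypothetical stand-ins so that it compiles without the wlroots and xcb headers; it is not the fix that was eventually merged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in types (hypothetical) so the sketch builds without wlroots/xcb.
using xcb_atom_t = std::uint32_t;
enum atom_name { NET_WM_WINDOW_TYPE_NORMAL, NET_WM_WINDOW_TYPE_SPLASH, ATOM_LAST };
struct fake_xwm {
    xcb_atom_t atoms[ATOM_LAST];
};

// Same linear search as xwm_atoms_contains, but it refuses to touch the
// array when either pointer is null instead of dereferencing it.
bool atoms_contains_guarded(const fake_xwm *xwm, const xcb_atom_t *atoms,
                            std::size_t num_atoms, atom_name needle) {
    if (xwm == nullptr || atoms == nullptr) {
        return false; // nothing to search
    }
    const xcb_atom_t atom = xwm->atoms[needle];
    for (std::size_t i = 0; i < num_atoms; ++i) {
        if (atom == atoms[i]) {
            return true;
        }
    }
    return false;
}
```

A caller would also want to pass the real element count rather than the literal 1. In wlroots, struct wlr_xwayland_surface carries a window_type_len field next to window_type, which looks like the natural value to pass here, though whether that was the actual root cause is an assumption.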

georgewsinger commented 3 years ago

Pernosco trace: https://pernos.co/debug/C1dBhz5X0rBLlZwmC6Vzcg/index.html

georgewsinger commented 3 years ago

@hellkaim We're making progress on this:

image

hellkaim commented 3 years ago

Thanks!

georgewsinger commented 3 years ago

Update: So ib-tws is currently launching its splash screen OK, but when I run the demo, it crashes (but only in Simula, not other compositors like rootston).

Updated Pernosco traces: The following traces reflect this pattern.

Pernosco notes:

  1. There are more handle_destroy's in Simula than in rootston (which makes sense, since the process survives in rootston but is prematurely killed in Simula)
  2. The ib-tws xwayland surface receives all of its interesting events from x11_event_handler (wlroots xwayland function), which ultimately bubbles up the handle_destroy calls.
  3. ib-tws process.
    • The ib-tws process is killed by the time one of the handle_destroy calls is made in Simula; in rootston, it is never killed.
    • The ib-tws process spawns an enormous number of threads (before it is ultimately destroyed). See e.g. here.
      • Both rootston and Simula traces show an enormous number of SIGCHLD and SIGSEGV signals being emitted to the java threads
    • Pernosco has a gdb feature that allows you, in theory, to step into specific processes (like the ib-tws process). According to their documentation: "Use any number of gdb sessions. Sessions can be attached to different processes by selecting a process in the 'running processes' or 'all processes' views and then starting a gdb session."
      • I attempted to use this with the ib-tws process, but couldn't think of anything useful to do? We don't have sources to inspect.

It remains completely unclear why ib-tws gets destroyed in Simula after exiting the splash screen. @kanetw Do you have any thoughts on how to tackle this?

georgewsinger commented 3 years ago

ib-tws opcodes. Here is some data on the ib-tws opcodes

  1. Launching the splash. The following is printed:

    2021-05-03 17:21:22 - [xwayland/xwm.c:1228] xcb error: op 18:0, code 3, sequence 253, value 4194431

  2. Clicking "return to demo" on the splash screen. No further opcodes printed.

  3. Clicking "Try Demo" (which induces the ib-tws process crash).

    _handle_destroy
    2021-05-03 17:24:02 - [xwayland/xwm.c:1228] xcb error: op 12:0, code 3, sequence 62804, value 6291583
    2021-05-03 17:24:02 - [xwayland/xwm.c:1228] xcb error: op 25:0, code 3, sequence 62805, value 6291583
    2021-05-03 17:24:02 - [xwayland/xwm.c:1228] xcb error: op 18:0, code 3, sequence 62806, value 6291583

Other application op-codes. Here is some opcode data from some other familiar apps:

  1. firefox. Only after closing firefox do we get

    _handle_destroy
    2021-05-03 17:17:02 - [xwayland/xwm.c:1228] xcb error: op 25:0, code 3, sequence 13328, value 4194307
    2021-05-03 17:17:02 - [xwayland/xwm.c:1228] xcb error: op 18:0, code 3, sequence 13329, value 4194307
    2021-05-03 17:17:02 - [xwayland/xwm.c:1228] xcb error: op 18:0, code 3, sequence 13330, value 4194307

    before that: nothing.

  2. xfce4-terminal. opcodes aren't printed on launch or closing.

  3. google-chrome-stable. opcodes aren't printed on launch or closing.

  4. gvim. Only after closing gvim do we get:

    _handle_destroy
    2021-05-03 17:20:06 - [xwayland/xwm.c:1228] xcb error: op 12:0, code 3, sequence 2685, value 4194311
    2021-05-03 17:20:06 - [xwayland/xwm.c:1228] xcb error: op 25:0, code 3, sequence 2686, value 4194311
    2021-05-03 17:20:06 - [xwayland/xwm.c:1228] xcb error: op 18:0, code 3, sequence 2687, value 4194311

    which is exactly what we get with ib-tws.

Sway frequently spits out opcode errors too. Here is a post which indicates that op 18:0, code 3 is frequently spammed to users of sway.
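The numeric fields in these log lines can be read against the X11 core protocol numbering: major opcodes 12, 18 and 25 are ConfigureWindow, ChangeProperty and SendEvent, and error code 3 is the Window ("BadWindow") error. The sketch below parses one such line and prints it in words. It assumes that "sequence" is the sequence number of the failed request and that "value" is the offending resource id (for a Window error, the window id), which is how the protocol defines those error fields. The struct and function names are made up for illustration and are not part of wlroots or xcb.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Fields of one "xcb error: op M:m, code C, sequence S, value V" log line.
struct XcbErrorLine {
    unsigned major_op, minor_op, code, sequence, value;
};

// Extract the numeric fields; returns false if the line has another shape.
bool parse_xcb_error(const std::string &line, XcbErrorLine &out) {
    const std::size_t pos = line.find("xcb error: op ");
    if (pos == std::string::npos) {
        return false;
    }
    return std::sscanf(line.c_str() + pos,
                       "xcb error: op %u:%u, code %u, sequence %u, value %u",
                       &out.major_op, &out.minor_op, &out.code,
                       &out.sequence, &out.value) == 5;
}

// Only the core-protocol requests that appear in the logs are named here.
std::string major_opcode_name(unsigned op) {
    switch (op) {
        case 12: return "ConfigureWindow";
        case 18: return "ChangeProperty";
        case 25: return "SendEvent";
        default: return "op " + std::to_string(op);
    }
}

// First few core-protocol error codes (names without the "Bad" prefix).
std::string error_code_name(unsigned code) {
    switch (code) {
        case 1: return "Request";
        case 2: return "Value";
        case 3: return "Window";
        case 4: return "Pixmap";
        case 5: return "Atom";
        default: return "code " + std::to_string(code);
    }
}

// X resource ids are conventionally written in hex.
std::string resource_id_hex(unsigned value) {
    char buf[16];
    std::snprintf(buf, sizeof buf, "0x%x", value);
    return buf;
}

std::string describe(const XcbErrorLine &e) {
    return major_opcode_name(e.major_op) + " -> " + error_code_name(e.code) +
           " error, resource " + resource_id_hex(e.value) +
           ", sequence " + std::to_string(e.sequence);
}
```

Read this way, the two ib-tws values are 0x40007f (splash) and 0x60007f (crash). They differ only in the upper bits, which would be consistent with the requests targeting windows owned by two different X clients, although that is an interpretation and not something the logs prove.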

georgewsinger commented 3 years ago

signal handling looks OK?

georgewsinger commented 3 years ago

English OpCodes. We used SimulaVR/libxcb-errors to better parse the op code errors:

  1. ib-tws. For documentation, see ChangeProperty, SendEvent, and ConfigureWindow. It's still not clear to me how to parse the "sequence" and "value" parameters.

    2021-05-04 12:30:45 - [xwayland/xwm.c:1220] xcb error: op ChangeProperty (no minor), code Window (no extension), sequence 278, value 4194431 # after first splash
    # ..
    2021-05-04 12:31:07 - [xwayland/xwm.c:1220] xcb error: op ConfigureWindow (no minor), code Window (no extension), sequence 26085, value 6291583 # after/during crash
    2021-05-04 12:31:07 - [xwayland/xwm.c:1220] xcb error: op SendEvent (no minor), code Window (no extension), sequence 26086, value 6291583
    2021-05-04 12:31:07 - [xwayland/xwm.c:1220] xcb error: op ChangeProperty (no minor), code Window (no extension), sequence 26087, value 6291583
  2. gvim. After/during close:

    2021-05-04 12:32:20 - [xwayland/xwm.c:1220] xcb error: op SendEvent (no minor), code Window (no extension), sequence 12521, value 4194311
    2021-05-04 12:32:20 - [xwayland/xwm.c:1220] xcb error: op ChangeProperty (no minor), code Window (no extension), sequence 12522, value 4194311
  3. firefox. After/during close:

    2021-05-04 12:33:12 - [xwayland/xwm.c:1220] xcb error: op SendEvent (no minor), code Window (no extension), sequence 8638, value 4194307
    2021-05-04 12:33:12 - [xwayland/xwm.c:1220] xcb error: op ChangeProperty (no minor), code Window (no extension), sequence 8639, value 4194307
    2021-05-04 12:33:12 - [xwayland/xwm.c:1220] xcb error: op ChangeProperty (no minor), code Window (no extension), sequence 8640, value 4194307

georgewsinger commented 3 years ago

Video documentation of crash. @kanetw Here is what the crash looks like on video.

  1. The normal crash. Here we crash after "Authenticating.." is read from the splash loading screen: https://www.youtube.com/watch?v=p7yFh06aB5M&ab_channel=GeorgeSinger
  2. The "quick" crash. This one happens much less frequently, but does occasionally happen. As you can see, we get a huge screen momentarily for a frame or two, and then it cuts out: https://www.youtube.com/watch?v=A9GmKPTHpE4
  3. What it should look like (in rootston). https://www.youtube.com/watch?v=S2U7XrwkJ8o&ab_channel=GeorgeSinger

georgewsinger commented 3 years ago

@hellkaim Well this was one hell of a bug that only took us 20 days to fix..

We finally figured out the issue, and pushed it to our dev branch, which you can access experimentally via:

git clone --recursive --branch dev --depth 1 https://github.com/SimulaVR/Simula SimulaDev
cd SimulaDev
source ./utils/Helpers.sh && installSimula # Will take a while to build
./result/bin/simula

We haven't fully stress-tested the dev branch, so there could be other bugs lurking as well. I'm planning on stress testing it today and merging with master.

hellkaim commented 3 years ago

Hey, that is awesome!

The issue is I am on a trip and do not have all my VR gear with me. I will do my best to test it as soon as I return.

That was a very, I mean very kind of you guys to fix this. Can't wait to be back and test it.

P.S. Did you try to undock the windows from the main one and create child? Also I may know other issue but let's just stick to a basic stuff.

Thank you again

georgewsinger commented 3 years ago

That was a very, I mean very kind of you guys to fix this. Can't wait to be back and test it.

100% our pleasure. Let us know about any other bugs and we'll fix them too.

I may know other issue but let's just stick to a basic stuff.

Let us know, and we'll fix it.

georgewsinger commented 3 years ago

@hellkaim We had to fix 2 more bugs before merging dev with master. Everything should work now on master. :+1: