unoplatform / uno


[WASM] Extremely slow rendering of `ComboBox` and `TextBox` controls in a `Grid` #15387

Open opsidjflksdf opened 8 months ago

opsidjflksdf commented 8 months ago

UnoAppTestLayoutTime.zip

I have a couple of pages in my application, and in the Wasm version, as soon as I have any combo boxes and/or text boxes in there, the pages load comically slowly: about 1 second from a click until the page is displayed. I first thought it might be related to my particular choice of layout, but I narrowed it down to just a simple Grid with a couple of controls in it, which still takes a lot of time to load. I put together a separate small benchmarking app to look more closely. I'm measuring the time taken by a (synchronous) call to Navigate() to navigate to my page:

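// Start timing just before the synchronous navigation call.
App._stopwatch.Start();
rootFrame.Navigate(typeof(MainPage2), args.Arguments);
// Elapsed time once Navigate() has returned.
TimeSpan timespanDoneNavigate = App._stopwatch.Elapsed;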

I also looked at the time until the Grid.Loaded event is called, getting similar results.

I'm building in Release, with <WasmShellMonoRuntimeExecutionMode>InterpreterAndAOT</WasmShellMonoRuntimeExecutionMode> for good measure.
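For reference, that property normally lives in a PropertyGroup of the Wasm head project's .csproj. A minimal sketch, assuming no build-configuration condition is needed (the attached project may place it differently):

```xml
<!-- Wasm head project file (.csproj); placement is an assumption for illustration -->
<PropertyGroup>
  <!-- Mixed mode: AOT-compiled assemblies, with the interpreter as a fallback -->
  <WasmShellMonoRuntimeExecutionMode>InterpreterAndAOT</WasmShellMonoRuntimeExecutionMode>
</PropertyGroup>
```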

With 5 TextBox and 5 ComboBox controls in the Grid, the page takes 200 ms to load. With 10 TextBox and 10 ComboBox controls, it takes 310 ms. With 50 and 50, it's a whopping 1 second.
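Read as a straight line through those three measurements (a rough estimate that assumes load time grows linearly with the number of controls, which these numbers alone do not prove), the marginal cost works out to roughly 9 to 11 ms per control:

```latex
\[
\frac{310 - 200}{20 - 10} = 11\ \text{ms per control},
\qquad
\frac{1000 - 310}{100 - 20} \approx 8.6\ \text{ms per control}
\]
```

Under the same assumption, the remainder (somewhere around 90 to 140 ms) would be fixed per-navigation overhead that does not depend on the control count.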

Needless to say, anything over 50 ms is a noticeable lag. A third of a second lag becomes annoying, let alone a full second when you actually start putting anything meaningful on your page.

The Windows WinUI app with the same code is instantaneous even with hundreds of controls, even in a Debug build, and even when running under the debugger.

I'm attaching the full project in a zip file if anyone would like to try. I tried creating the controls in both XAML and plain C# and got the same results (full code in the zip attached).

Build and run the Wasm project, then open Chrome DevTools (F12) and look at the console output, which shows the timings.

At the end of the message is the code with how I'm creating the controls in C#. In my tests, a combobox takes about twice as long as a textbox, though even with just textboxes, it's thousands of times more computation than it should be. I looked at the processor and it's all busy-busy during layout.

I tried various options for the alignment, doesn't make a difference.

It doesn't look like an issue with .NET optimization, but rather that there's some extremely intensive computation taking place in the layout. I mean what is that layout code doing? Mining bitcoin? :)

Even though this is in a browser, I can't imagine it being anything other than instantaneous even in JavaScript, let alone Wasm. For the record, the processor I'm running on is a Xeon W-11955M, with 64 gigs of RAM. I wonder if there's some gotcha that I'm not aware of, something I can tweak in code so that this extreme computation isn't triggered.

Can someone please tell me that what I see ain't real. I can't afford lags on the order of magnitude of a full second in my application. I don't think anybody can.

Later note: I just tried this on two other machines, with completely different hardware configurations, and the load times that I'm getting for the pages are proportionate to the respective power of each CPU, going by multithreaded CPU benchmarks like Passmark CPU Mark. So this confirms that indeed there is some inefficient intense CPU computation going on, rather than a hangup somewhere.

Thanks so much for your hard work!

private Grid CreateGrid()
{
    Grid grid = new();

    // Single column; the explicit star width is left commented out.
    grid.ColumnDefinitions.Add(new ColumnDefinition()
    {
        //Width = new GridLength(1, GridUnitType.Star)
    });

    // One auto-sized row per control.
    int rowCount = 20;
    for (int row = 0; row < rowCount; row++)
    {
        grid.RowDefinitions.Add(new RowDefinition() { Height = GridLength.Auto });
    }

    for (int row = 0; row < rowCount; row++)
    {
        // The first half of the rows get a ComboBox, the rest a TextBox.
        FrameworkElement ctrl;
        if (row < rowCount / 2)
        //if (row % 2 == 0)
        {
            ComboBox combo = new();
            combo.Items.Add("item1");
            ctrl = combo;
        }
        else
        {
            ctrl = new TextBox() { Text = "test text" };
        }

        // Place the control in its own row, stretched across the single column.
        Grid.SetColumn(ctrl, 0);
        Grid.SetRow(ctrl, row);
        ctrl.HorizontalAlignment = HorizontalAlignment.Stretch;
        ctrl.VerticalAlignment = VerticalAlignment.Top;
        grid.Children.Add(ctrl);
    }
    return grid;
}

Youssef1313 commented 8 months ago

cc @ebariche

opsidjflksdf commented 8 months ago

I forgot to specify: this is in Wasm. As I mentioned, in Windows I don't have the lag (pages load instantaneously), and as for iOS or Android, I haven't tried those. This definitely needed clarifying, so I updated the title and the text as well.

jeromelaban commented 8 months ago

@opsidjflksdf thanks for the details. Wasm runs by default using the interpreter. To validate the performance as a user would, you'll need to run using AOT or Profiled AOT. This other document about performance contains tips on how to improve performance in general.

That does not mean there isn't anything we can optimize in ComboBox, like https://github.com/unoplatform/uno/issues/9775, which indicates that the ComboBox items panel is not virtualized.

opsidjflksdf commented 8 months ago

@jeromelaban - thank you for your reply. I used <WasmShellMonoRuntimeExecutionMode>InterpreterAndAOT</WasmShellMonoRuntimeExecutionMode>

The documentation at https://platform.uno/docs/articles/external/uno.wasm.bootstrap/doc/runtime-execution-modes.html#mixed-interpreter-and-aot-mode says: "The possible values are:

Interpreter (the default mode)
InterpreterAndAOT"

A couple lines below, there's this text:

"Mixed Interpreter and AOT Mode

This mode enable AOT compilation for most of the assemblies, with some specific exceptions.

This mode is generally prefered to FullAOT as it allows to load arbitrary assemblies and execute their code through the interpreter."

So this implies there is another mode, FullAOT, except that when I try to use it I get:

07:35:59:321 2>C:\Users\Me1\.nuget\packages\uno.wasm.bootstrap\9.0.0-dev.27\build\Uno.Wasm.Bootstrap.targets(209,3): error : System.NotSupportedException: FullAOT mode is not supported by this version of the .NET Runtime

What is the highest level of optimization that can be specified, so I can give that a try?

Regarding ComboBoxes - from what I understand the virtualization has to do with loading the items in the combobox itself (and the performance of opening the drop-down). The combos in my test only have 1 item (and the ones in my real application have maybe 3 to 5). Also from what I understand, even when you do have items in the 10s or hundreds in a combo, virtualization would be applicable when you're loading them from e.g. a database, and for that virtualization allows you to only load them on demand. If they're in memory - correct me if I'm wrong - virtualization should be irrelevant, with the amounts of RAM and processing power we have on client devices today. Anyways, my combos for testing are only 1 item.

Also, rendering text boxes, even though about twice as fast as combos, is still orders of magnitude slower than a reasonable user experience allows. For 20 items, it's 280 ms when half are combos and half are text boxes. With 20 text boxes and no combos, I get 180 ms. All running in Release with <WasmShellMonoRuntimeExecutionMode>InterpreterAndAOT</WasmShellMonoRuntimeExecutionMode>

Thanks for your input!

jeromelaban commented 8 months ago

> What is the highest level of optimization that can be specified, so I can give that a try?

The FullAOT Mode is not supported anymore by the runtime, InterpreterAndAOT is the fastest. The jiterpreter sometimes provides better performance, depending on code paths.

> If they're in memory - correct me if I'm wrong - virtualization should be irrelevant

Virtualization is important for performance, particularly because creating UI elements is costly, as you've found out. Still, in your sample there's no need for virtualization, so it may not be relevant to this particular discussion. It may also be that Uno is doing more work than should be done for the initial state of a ComboBox (like creating the popup eagerly).

> For 20 items, it's 280 ms when half are combos, and half are textboxes. And with 20 textboxes, and no combo, I get 180ms. All running in Release with

Taking your sample untouched, I have this with WinUI:

DoneNavigate(): timespanDoneNavigate: 62 ms
MainPage2_Loaded(): timespanToLoaded: 225 ms
Idle_Handler(): timespanToIdle: 309 ms

And this with Wasm (chrome 121.0.6167.161):

MainPage2_Loaded(): timespanToLoaded: 365 ms
DoneNavigate(): timespanDoneNavigate: 394 ms
Idle_Handler(): timespanToIdle: 479 ms

The startup time is longer than WinUI, though bear in mind that WebAssembly is still not as fast as native development.

Still, we're always looking for ways to improve performance, whether by improving the underlying Wasm performance or by finding hot paths in Uno's rendering or layout subsystems.

Profiling using this option and the browser profiler can be of help to determine what may be taking more time than it should.

opsidjflksdf commented 8 months ago

main.zip

@jeromelaban I did profile it with the Chrome profiler, and it looked like the bulk of the computation was in FrameworkElement.Measure(). Either it is being called far too many times, or a single call is itself taking a lot of time.

Regarding what I said about WinUI being instantaneous: my apologies, I take that back. I must have been looking in the wrong place. I tested again, and for 20 controls (TextBox and ComboBox combined) this is what I have: with WinUI in Release, timespanToLoaded is between 74 ms and 95 ms (over several tests); with Wasm, timespanToLoaded is 280 ms.

So the timings for WinUI and Wasm are not as close to each other as the results you got, but WinUI still isn't as fast as I thought it was. Also, with 100 controls, WinUI gets to 242 ms for me in Release, and that's very significant. So clearly WinUI itself is quite inefficient, and thus it's hard to blame Uno for not being exactly snappy.

I'm just worried going forward about how I'll be able to justify lags of a half second to a second to the customer (and that's not accounting for network turnaround/backend computation).

I just did a quick test with Flutter and it renders a grid with 1,000 (one thousand) text boxes in a blink. I'm attaching the project if anyone wants to compare. More than a decade ago I wrote a line-of-business application in ExtJS with serious layout complexity, running in the browser, and I remember having zero lag issues. Running on the intranet, it was as snappy as the snappiest desktop application written in C++. I can't picture any JavaScript library these days, with today's hardware, not being just as snappy.

I'm seriously stumped as to how Uno is going to be able to compete, at least on the web. WinUI has its foothold due to being the MS-recommended way to develop Windows apps, but as far as Uno Wasm goes..? I have quite the codebase that I migrated from UWP to WinUI (with not too much effort, so I'm grateful for that). UWP on desktop was running just fine, never an issue with lags. I was hoping to have an easy transition to the web via Uno on Wasm. I'd love to keep things in C# and not have to do HTML/JavaScript or even Flutter/Dart. I like my Visual Studio just fine. Again, I wonder how this works for people, since I've seen it mentioned that there are Uno Wasm applications already in production. Did people just get used to the lag? Do users think it's the network connection?

I've been following Uno for a long time and have been enthusiastic to learn about it, and I'm grateful that you guys are putting in the tremendous work that goes into it. Perhaps it wouldn't be that much effort for someone to look into why something that should be a couple of arithmetic calculations ends up being more computation than a chess program running on today's hardware would need to beat the totality of all human chess champions?

I'd try my hand at it at some point, depending on time available. Thanks again for your hard work guys.

jeromelaban commented 8 months ago

Thanks for the update.

> I just did a quick test with Flutter and it renders a grid with 1,000 (one thousand) textboxes in a blink. I'm attaching the project if anyone wants to compare.

Comparing with Flutter is always difficult because of the way rendering is done. It's not always (mostly never) done synchronously, which throws measurements off in many cases. Using DartPad, rendering all those text boxes still takes around 600 ms. Still, point taken about the way Flutter renders, as there are significant structural differences when rendering is done by Uno using the DOM.

> I'm just worried going forward about how I'll be able to justify lags of a half second to a second to the customer (and that's not accounting for network turnaround/backend computation).

The best controls are the ones that are not created, not rendered, or not re-created. We have a feature in Uno called template pooling, which we're currently working to improve. It relies on the fact that DataTemplate and ControlTemplate instances can be reused when discarded, avoiding the cost of creation, which is most of the cost in your sample. We'll have updates on this soon.
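As a generic illustration of the pooling idea described above, here is a minimal sketch with hypothetical names. It is not Uno's template pooling code; it only shows how returning a discarded instance lets a later request skip the creation cost.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not Uno's template pooling implementation.
public sealed class SimplePool<T> where T : class
{
    private readonly Stack<T> _items = new Stack<T>();
    private readonly Func<T> _factory;

    public SimplePool(Func<T> factory)
    {
        _factory = factory ?? throw new ArgumentNullException(nameof(factory));
    }

    // Number of times the (expensive) factory actually ran.
    public int CreatedCount { get; private set; }

    // Reuse a discarded instance when one is available; otherwise create one.
    public T Rent()
    {
        if (_items.Count > 0)
        {
            return _items.Pop();
        }

        CreatedCount++;
        return _factory();
    }

    // Hand an instance back so a later Rent() can skip creation.
    public void Return(T item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        _items.Push(item);
    }
}
```

A caller would Rent() an instance when a template is materialized and Return() it when the element is discarded. A real pool would also need to reset the instance's state and cap its own size, both of which this sketch leaves out.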

> I'm seriously stumped as to how Uno is going to be able to compete, at least on the web? WinUI has its foothold due to being the MS recommended way for Windows app development, but as far as Uno Wasm goes..?

We have plenty to do with regard to performance in many different areas, and profiling your sample brought up some interesting topics for improvement. For instance, our ComboBox control eagerly creates a TextBox that is used only in editable mode, which is not the default. This causes a lot of unnecessary work, which we'll be adjusting soon.
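The eager-versus-lazy distinction can be sketched as follows. The types here are hypothetical stand-ins rather than Uno's ComboBox or TextBox, and the actual fix in Uno may take a different shape; the point is only that the editor part is not constructed until editable mode is switched on.

```csharp
using System;

// Hypothetical stand-in for the editor part; not an Uno type.
public sealed class EditPart
{
    // Counts constructions so the deferred creation is observable.
    public static int Instances { get; private set; }

    public EditPart()
    {
        Instances++;
    }
}

// Hypothetical stand-in for a combo box that defers creating its editor.
public sealed class SketchComboBox
{
    private readonly Lazy<EditPart> _editPart =
        new Lazy<EditPart>(() => new EditPart());

    private bool _isEditable;

    public bool IsEditable
    {
        get => _isEditable;
        set
        {
            _isEditable = value;
            if (value)
            {
                // Touch the lazy value so the editor exists once editing is enabled.
                _ = _editPart.Value;
            }
        }
    }

    // True only after the editor has actually been constructed.
    public bool HasEditPart => _editPart.IsValueCreated;
}
```

With this shape, a page holding many non-editable combo boxes never pays for the editor at all, and the cost moves to the first time IsEditable is set to true.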

Otherwise, interacting with the DOM is a part we're constantly optimizing, as well as taking a look at the changes in the AOT compiler, to avoid scenarios like these:

[image not preserved]

where the use of generics can quickly fall back to interpreter mode, which is quite slow because of AOT/interpreter transitions, but also because of the interpreter itself. Each version of .NET has its own variations in the AOT/interpreter rules, which can cause optimizations to fail in hot code paths.

So to answer your question more directly, there will always be portions of Uno that can be enhanced and we'll continue to iterate to address scenarios that are reported as problematic.

> I've been following Uno for a long time and have been enthusiastic to learn about it- grateful that you guys are putting in the tremendous work that goes into it. Perhaps it wouldn't be that much effort for someone to look into why something that should be a couple of arithmetic calculations ends up being more computation than a chess program running on today's hardware would need to beat the totality of all human chess champions?

Comparing hyper-specialized ASIC computations in a controlled environment to a broad set of cache-unfriendly operations running on a four-layer runtime abstraction is probably not the best :) We always welcome the help we can find, and if you have insights into the .NET AOT compiler, WebAssembly JIT compilers, specialized collections, or DOM optimization techniques, or if you find useless work being done in the XAML controls, we'll be glad to accept them to improve Uno!

> Thanks again for your hard work guys.

Thanks, on behalf of the whole team!