shoes / shoes3

a tiny graphical app kit for ruby
http://walkabout.mvmanila.com

Performance maintenance release #389

Open IanTrudel opened 6 years ago

IanTrudel commented 6 years ago

Let's talk about a hypothetical 3.3.9 release focusing on performance.

References:
https://github.com/shoes/shoes3/wiki/Profiling
https://en.wikipedia.org/wiki/Skia_Graphics_Engine
https://en.wikipedia.org/wiki/Stress_testing_(software)
https://developer.gnome.org/gdk3/stable/gdk3-Threads.html

dredknight commented 6 years ago

Could you give an idea what a stress test should look like ? I may come up with something. How do you cache sqlite db in memory?

IanTrudel commented 6 years ago

Could you give an idea what a stress test should look like ? I may come up with something.

A stress test is code that will intentionally abuse the good nature and features of Shoes. Here are a few examples of stress tests:

How do you cache sqlite db in memory?

The current cache is on-disk: files or data are temporarily saved to disk. The problem is that creating, opening, accessing and closing files are costly operations.

A single SQLite database (cache.db) would be opened at startup and closed when Shoes is closed. Anything to be cached would be inserted into some tables. It is also possible to ask SQLite to keep the database in-memory.

dredknight commented 6 years ago

@BackOrder good. Some methods from the skillwheel can be useful as they comply with those requirements. We can adjust them to the state we need. Here is an example: a block that makes an image work as a button. This is what happens when you interact with images on the skillwheel. @hovers is the custom popup, but it can be removed if not necessary.

def set (img, options={}, &block )
    img.hover { @hovers.show text: options[:text], header: options[:header] , size: 9,  text2: options[:text2], width: options[:width], height: options[:height]; img.scale 1.25, 1.25 }
    img.leave { @hovers.hide; img.scale 0.8, 0.8 }
    img.click { @hovers.hide; block.call if block_given? } 
end

set ( image "pics/misc/s_damage.png", left: 80, top: 11, width: 50 ), text: pane_text[4], width: 500, height: 40

A program that will keep Shoes UI busy to the point user input is kinda ignored.

This happens when Shoes is drawing. Unless you had anything else in mind cycling through a drawing pattern will be good enough.

Display several animations concurrently with fast FPS. (Shoes)

How can this be achieved ? Threading is not something that works flawlessly in Shoes.

IanTrudel commented 6 years ago

This happens when Shoes is drawing. Unless you had something else in mind cycling through a drawing pattern will be good enough.

Many things keep Shoes busy. We just need sure way(s) to reproduce the problem.

Display several animations concurrently with fast FPS. (Shoes)

How can this be achieved ? Threading is not something that works flawlessly in Shoes.

Threading is a problem, RE: Challenge: Ruby uses GIL, GTK is not thread-safe.

Maybe have many arts being animated in one single animate(FPS) do ... end would do. We could consider a large FPS number (60, 120, 1000). We could consider multiple Shoes.apps with their own animate. Trial and error might get you to figure out what would be the best way to stress test this.

path-animation does clearly demonstrate how slow animation is. Now if one would want to write a game in Shoes, let's say a Super Mario World clone, that wouldn't go too well according to path-animation. It might inspire you to come up with a stress test.

dredknight commented 6 years ago

Here are multiple animations at once.

 Shoes.app do

    def anim i
        animate(24) do |frame|
             @counter[i].replace "FRAME #{i+1} #{frame*(i+1)}"  
        end
    end

    @counter = []
    @counter << para("STARTING")
    @counter << para("SECOND", left: 0, top: 30)
    @counter << para("THIRD", left: 0, top: 60)
    i = 0
    3.times do
        anim i
        i+=1
    end
 end

@BackOrder regarding threads and drawings (probably animations too): some time ago I found a way to cheat Shoes, though I am not sure how to explain it.

This does not work: putting everything inside a thread, because, as we know, threads interfere with Shoes processes for some reason.

This works: putting the drawing elements in a start block inside the thread, because the thread launches the Shoes native process, which does the drawing on behalf of the thread.

My second app has 3 tabs. One of the tabs is called "online store". When you click on "Update package list", the app makes an HTTP query, downloads the new stuff (if any) and shows it to the user. While the transaction is going on, a "Loading" animation is shown to the user.

Usually the app is frozen while the transaction is in progress, but in this case it is threaded, so the user keeps control over other app functions. Here is a screenshot. image

Here is the code sample responsible for it:

button("Update package list", left: 30, top: 10, width: 360, height: 20) do
    check_dl == 0? nil : (messages 0; next)
    @pack_contain.clear { spinner left: 113, top: 90, start: true, tooltip: "Waiting for something?" }
    Thread.new do
        repo_data = get_url @server_url
        start do
            repo_data.nil? ? ( messages 3 ) : ( File.open('NCF_repository/package_list.txt', "w") { |f| f.write repo_data } )
            main_pack_block_online
        end
    end
end

IanTrudel commented 6 years ago

@BackOrder regarding threads and drawings (probably animations too). Some time ago I found a way to cheat Shoes. Not sure how to explain it though.

Interesting approach. Also, this is not a stress test, considering that Shoes displays it without any effort. There is something interesting though. I increased the FPS to one million and it's clear that it's peaking way before that. Displaying the actual FPS (Frames Per Second) would tell us what the actual peak is.

NOTE: Shoes internal might be able to provide an FPS counter enabled in debugging mode.
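
Until something like that exists inside Shoes, a counter can be sketched in plain Ruby. The FpsCounter class below and its window and clock parameters are hypothetical names, not part of Shoes; the assumption is that tick gets called once per animate frame and that the returned value is what gets displayed.

```ruby
# Hypothetical helper, not part of Shoes: counts frames and reports the
# measured rate once per time window.
class FpsCounter
  attr_reader :fps

  # window: seconds between two measurements
  # clock:  callable returning the current time in seconds (injectable for tests)
  def initialize(window: 1.0, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @window = window
    @clock = clock
    @frames = 0
    @started = @clock.call
    @fps = nil
  end

  # Call once per rendered frame. Returns the last measured FPS,
  # or nil until the first window has elapsed.
  def tick
    @frames += 1
    now = @clock.call
    span = now - @started
    return @fps if span < @window

    @fps = @frames / span
    @frames = 0
    @started = now
    @fps
  end
end

# Assumed usage inside a Shoes app (left as a comment, it needs Shoes to run):
#   counter = FpsCounter.new
#   label = para "FPS ?"
#   animate(1_000_000) { label.text = "FPS #{counter.tick}" }
```

Measuring over a window rather than per frame keeps the displayed number stable and keeps the cost of updating the text out of most frames.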

This does not work -> Putting a thread with all things inside would not work as we know because threads interfere with shoes processes for some reason. This works -> Putting a thread with drawing elements in start block will work because the thread launches shoes native process which does the drawing on behalf of the thread.

Threads created in Ruby suffer from the GIL (Global Interpreter Lock). It means they are in fact executed one at a time. No matter what you do it won't really work. The Ruby team plans to fix this in Ruby 3.0. Might take a while. haha

RE: online store

You are correct. Downloading data usually freezes the Shoes UI. It is normally not noticeable for small chunks of data, but our tests on larger data (an ISO file) demonstrated that it will completely blank the UI until the download is finished. Also the thread doesn't do anything useful here (GIL again).

IanTrudel commented 6 years ago

Additional reference: https://en.wikipedia.org/wiki/Stress_testing_(software)

IanTrudel commented 6 years ago

Inspired by your code. The more counters, the lower the actual FPS. The higher the FPS, the less it seems to make a difference? Sounds like there is frame dropping. You can profile and enjoy the results.

You should obtain the FPS you initially set up when you lower NUMBER_OF_COUNTERS to, say, 10 or 15. Probably starts to lower significantly about 30 counters (animate) or so.

NUMBER_OF_COUNTERS = 150
FPS = 24

Shoes.app do
   @counter = []
   @text = []
   NUMBER_OF_COUNTERS.times do |n|
      @counter << Time.now
      @text << para
      animate(FPS) do |frames|
         if (0 == (frames % FPS))
            @text[n].text = "FPS #{ FPS / ((frames >= FPS) ? (Time.now - @counter[n]) : 1.0)}\n"
            @counter[n] = Time.now
         end
      end
   end
end

IanTrudel commented 6 years ago

Hmm. Simple things have a crazy amount of to_s and draw calls. Roughly 8 times more calls than buttons.

NUMBER_OF_BUTTONS = 2000

Shoes.app do
   NUMBER_OF_BUTTONS.times do |n|
      button "[#{n}]"
   end
end

image

An empty Shoes app looks like this: image

dredknight commented 6 years ago

Awesome :). This is about 9 to_s per cycle. But why? :/

IanTrudel commented 6 years ago

We will know when 3.3.9 comes. For now we should slowly write stress tests and collect data.

It might be as simple as refreshing the window even when it doesn't need it. Or as complicated as adding widgets causing underlying hidden elements to refresh. Scrolling may also refresh everything even when not visible.

Hours of pleasure guaranteed.

ccoupe commented 6 years ago

GTK is not thread-safe.

Please be careful claiming this as a truth. It's not safe in certain situations (gthreads, and it's nuanced). Does Shoes use those special situations? I don't think we do. Shoes has one GTK thread (the main loop). Explain how samples/simple/download.rb works when calling lib/shoes/download.rb if threading is unsafe. Ruby threads, however, do have the GIL locking issue. If you really care about performance, then using newer Rubies is what you want.

ccoupe commented 6 years ago

A single SQLite database (cache.db) would be opened at startup and closed when Shoes is closed. Anything to be cached would be inserted into some tables. It is also possible to ask SQLite to keep the database in-memory.

Please - benchmark the simple sdbm key-value store used for Shoes external image caches versus sqlite3, both inserts and fetches. Most images are cached in memory and never get loaded from the external cache (and then only once). Optimization requires knowledge, not speculation.

Hours of pleasure guaranteed.

Or you could do that.

IanTrudel commented 6 years ago

RE: GTK thread safety

My mistake. I remember now that we simply need to tell GLib when we enter/leave the GTK thread.

If you really care about performance, then using newer Rubies is what you want.

Ruby performance is not the core of the problem and the difference in performance when upgrading from x.x.x to x.x.z is generally small. GIL is however a big problem. Threaded applications are considerably faster in jRuby.

Wouldn't you agree that most of the bottleneck is in Shoes? It wasn't built with performance in mind.

Optimization requires knowledge, not speculation.

This issue is exactly where we build the said knowledge, but it all starts from hypotheses. Based on your feedback, a representative sdbm/sqlite benchmark would be the very thing to do.
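
A possible starting point for such a benchmark, as a sketch only: the bench_store helper is a hypothetical name, and a plain Hash stands in for the real backends. The assumption is that each backend (an sdbm handle, or a small wrapper around an SQLite table) can be exposed through []= and [] with String keys and values.

```ruby
require "benchmark"

# Hypothetical helper: times `count` inserts followed by `count` fetches
# against any store that responds to []= and [] with String keys/values.
def bench_store(store, count: 10_000)
  keys = Array.new(count) { |i| "key-#{i}" }

  insert = Benchmark.realtime { keys.each { |k| store[k] = "value for #{k}" } }
  fetch  = Benchmark.realtime { keys.each { |k| store[k] } }

  { insert: insert, fetch: fetch }
end

# A Hash is only a stand-in here; swap in the real cache backends to compare.
result = bench_store({})
puts format("insert: %.4fs  fetch: %.4fs", result[:insert], result[:fetch])
```

Timing inserts and fetches separately matters here because, per the comment above, most images are served from memory and the external cache is written far more rarely than it is read.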

ccoupe commented 6 years ago

Threaded applications are considerably faster in jRuby.

You are missing the important thing: are Shoes 4 applications faster than Shoes 3.3.x? That's your benchmark - not ruby vs jruby - we run Shoes. You probably know that java swt uses gtk3 and cocoa just like Shoes 3, so they use cairo and pango too. You would also know they have to translate drawing from Shoes4 -> swt -> cairo/pango, and they have been fighting off-by-one errors since the project started. You would also know that Shoes 4 & jRuby is a lot closer to the bleeding edge of Ruby versions than Shoes 3 is.

Benchmark properly with context that matters to you.

dredknight commented 6 years ago

Is there a way to simulate hover without actually hovering with the mouse? It will be good for automation scripts.

IanTrudel commented 6 years ago

I am not saying Shoes 4 is faster (or better) than Shoes 3. I am not suggesting to move to jRuby. I am saying threading in Ruby is useless. Any GUI application needs a way to effectively balance visual components and its tasks.

Benchmark properly with context that matters to you.

I am not expecting you to agree with everything I say. You say sdbm is fine? Alright, we can profile and benchmark the thing and see how it turns out. Maybe it is fine!

Has anyone ever extensively profiled and benchmarked Shoes3? If nobody has, then nobody knows exactly, and anything we say is speculative. So I set up this issue as a conversation starter and to investigate in order to get all the answers that we need.

IanTrudel commented 6 years ago

Is there a way to simulate hover without actually hovering with the mouse? It will be good for automation scripts.

@ccoupe has suggested to implement the ability to generate events #383. This would make it possible to programmatically move the mouse amongst other things.

ccoupe commented 6 years ago

I did build and profile Shoes 3 with the -gprof flag once; it might still be an option in the Linux rakefiles. For the script I ran, most of the CPU time was inside Ruby and not Shoes/cairo/pango/gtk3 (you need a Ruby with debugging info to do that). I also know that graphics performance is hardware constrained - by both the CPU and the GPU that gtk was built to support on that platform - not to mention disk speed if loading things.

I am saying threading in Ruby is useless

Perhaps you could look at samples/simple/download.rb and lib/shoes/download.rb and see if threading is useless. It may not be all you want but it is working. Finding bottlenecks in performance is multi-dimensional - it's not easy to do properly and it's not easy to fix in code unless done properly. Wholesale code changes because you think something is slow and have a better idea are not a proper evaluation. Remember Typhoeus?

dredknight commented 6 years ago

Threading works for me though. My app does not freeze during the http requests when it is threaded.

IanTrudel commented 6 years ago

Threading works for me though. My app does not freeze during the http requests when it is threaded.

@dredknight when a task performed by a thread is ever so small, you won't notice that it actually had to finish before moving to the next thread. You can test by yourself but nothing runs in parallel in Ruby.
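
One way to run that test, as a minimal sketch assuming CRuby (MRI). The elapsed helper is a hypothetical name, and sleep is only a stand-in for a blocking call such as an HTTP request. On CRuby the interpreter lock is released while a thread waits, so waiting can overlap even though Ruby code itself executes one thread at a time; the CPU-bound case is included for comparison.

```ruby
require "benchmark"

# Hypothetical helper: runs `count` copies of the block, either one after
# another or in separate threads, and returns wall-clock seconds.
def elapsed(count, threaded:, &work)
  Benchmark.realtime do
    if threaded
      Array.new(count) { Thread.new(&work) }.each(&:join)
    else
      count.times(&work)
    end
  end
end

io_like  = proc { sleep 0.2 }                           # stand-in for a blocking request
cpu_like = proc { 300_000.times { |i| Math.sqrt(i) } }  # pure Ruby computation

puts format("waiting, serial:   %.2fs", elapsed(4, threaded: false, &io_like))
puts format("waiting, threaded: %.2fs", elapsed(4, threaded: true,  &io_like))
puts format("cpu, serial:       %.2fs", elapsed(4, threaded: false, &cpu_like))
puts format("cpu, threaded:     %.2fs", elapsed(4, threaded: true,  &cpu_like))
```

If the threaded waiting case finishes in roughly the time of a single wait while the CPU-bound cases stay close to each other, both observations in this thread are consistent: a threaded request need not freeze the UI, yet no Ruby code runs in parallel.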

RE: gprof flag

We should definitely make it happen again on all platforms.

RE: Ruby vs Shoes & friends

I get your point about time spent in Ruby versus Shoes & friends. Though let's not forget some of the Ruby classes and methods are defined in Shoes/C. Those we can improve on.

For example, one thing that is clear is that there are too many draw calls. It's impossible to need that many calls. If we could somehow reduce the calls by, say, 20%, it might make a whole lot of difference to Shoes users.

I also know that graphics performance is hardware constrained - by both the cpu and the gpu that gtk was built to support on that platform - not to mention disk speed if loading things.

What would be your take on Cairo/Pango vs Skia? Skia is developed by Google and used on major web browsers and widely used applications. The C API is fairly similar to Cairo but is also more wholesome and actively developed.

RE: Typhoeus

Shoes directly using CURL in an independent C thread might have been a better solution but more work. To be fair, the tests with Typhoeus on Ruby alone were extremely promising. In the Shoes ecosystem? Not so much but still an improvement over the previous method (including dealing with https, right?) and advanced users get Typhoeus gem included with Shoes.

It should be noted that the difference between tests on Ruby alone versus Shoes should also tell us there is room for improvement in Shoes.

Finding bottlenecks in performance is multi-dimensional - it's not easy to do properly and it's not easy to fix in code unless done properly.

Absolutely true. You might have your own suspicions about what the bottlenecks are. How about you share with us and @dredknight and I work on some stress tests for those?

Listen, reading my initial post again clearly shows that things like caching were worded in a way that does not imply anything more than an investigation, e.g. "review, may consider". Maybe you misunderstood the purpose of this issue, but this is really an investigative process. We get the tools we need, write the tests we need and investigate the bottlenecks before anything else happens.

Hopefully it sounds reasonable to you.

dredknight commented 6 years ago

@BackOrder I believe it is not threaded, but for some reason there is a visible difference with and without the thread. Without the Thread.new ... end lines, the code above simply does not execute the rotation animation (or at least this is what it looks like, because the user does not see the spinner).

I am currently finishing a few scripts for the app. After that I will have even more time to dedicate to building performance tools. This is something I am very keen to learn and get better at!

ccoupe commented 6 years ago

I get your point about time spent in Ruby versus Shoes & friends. Though let's not forget some of the Ruby classes and methods are defined in Shoes/C. Those we can improve on.

Improve on one or two of the lines of rb_call_something. That's not the problem.

For example, one thing is clear is that there are too many draw calls. It's impossible to need that many calls.

You should look at time in method , not counts. Windows/X/Cocoa compress multiple draws - have done so almost forever. Fascinating topic but nothing Shoes can or should touch.

What would be your take on Cairo/Pango vs Skia?

C++ ? Not me, I've suffered enough.

Hopefully it sounds reasonable to you.

No problem unless you are asking me to do the coding so you can explore. Low priority for me.

IanTrudel commented 6 years ago

Improve on one or two of the lines of rb_call_something. That's not the problem.

Excellent!

You should look at time in method , not counts. Windows/X/Cocoa compress multiple draws - have done so almost forever. Fascinating topic but nothing Shoes can or should touch.

This is only true for system calls but not for Ruby calls (such as draw).

What would be your take on Cairo/Pango vs Skia?

C++ ? Not me, I've suffered enough.

Haha! It is written in C++ but it has a C API.

image

No problem unless you are asking me to do the coding so you can explore. Low priority for me.

Right. That's why it's suggested for a future release. It's not surprising that it is low priority for you because you also spend most of your time on Shoes/C. We can accept the performance of Ruby because Shoes was never meant to be the fastest around, but there are a few things that need to be addressed, such as UI responsiveness, slow display and animate.

IanTrudel commented 6 years ago

The threshold on my machine seems to be 117, 118 images. Profiled for 5 seconds.

IMAGES = [
   "shoes-icon.png",
   "shoes-icon-blue.png",
   "shoes-icon-federales.png",
   "shoes-icon-red.png"
]

NUMBER_OF_IMAGES = 117

Shoes.app do
   @images = []
   @interpolator = (tmp = (0..50).collect { |n| -n }) + tmp.reverse

   NUMBER_OF_IMAGES.times do
      @images << image("#{DIR}/static/#{IMAGES.sample}")
   end

   @counter = 1
   animate(@fps = 60) do |frame|
      @images.each { |img| img.rotate(@interpolator.first) }
      if ((frame / @fps) == @counter)
         @counter += 1
         @interpolator.push @interpolator.shift
      end
   end
end

image image image

IanTrudel commented 6 years ago

Posting some more code testing the limits of Shoes. Brownian motion. 190 animated ovals is fine, 192 is not. Wondering why path-animation sample is performing so poorly.

NUMBER_OF_SHAPES = 192

Shoes.app do
   @shapes = []

   NUMBER_OF_SHAPES.times do
      fill rgb(rand(255), rand(255), rand(255))
      @shapes << oval(rand(self.width), rand(self.height), rand(100))
   end

   animate(60) do
      @shapes.each do |shape|
         mx = rand > 0.5 ? +1 : -1
         my = rand > 0.5 ? +1 : -1
         shape.move shape.left + mx, shape.top + my
      end
   end
end

IanTrudel commented 6 years ago

@dredknight I created a branch performance on Shoes repo to avoid polluting this thread with too much stuff. Don't be shy to add stuff in Tests/performance.

More about Shoes and branches: https://github.com/shoes/shoes3/wiki/Git,-Github-and-Shoes

@ccoupe branch related instructions are working well. Good job.

IanTrudel commented 6 years ago

Slot manipulations are surprisingly fast on widgets, texts and images. Significant performance decrease when introducing Shoes arts (cairo-based).

ccoupe commented 6 years ago

Be aware that Shoes decides how much time to give to Ruby vs Gtk event handling, and it differs on Linux vs Windows. See shoes_app_g_poll() in shoes/native/gtk.c for Linux/BSD and shoes_native_loop() (same file), which happens to be the adjustment I made for dredknight's CPU-hogging bug. As @BackOrder remembers, the Shoes 2 and 3.1 code took 100% of a core doing nothing and no one knew why, so I moved Shoes on Windows to use Gtk3. The Linux polling is _why's, with some comments from me. Mysterious place.

Slot manipulations are surprisingly fast

My head hurts figuring out the statistics for each of the rand() combinations in that script. You might need to run that for a minute or two. Also, I believe animate has a small memory leak that I never found. There is also Ruby GC at play.

IanTrudel commented 6 years ago

My head hurts figuring out the statistics for each of the rand() combinations in that script. You might need to run that for a minute or two. Also, I believe animate has a small memory leak that I never found. There is also Ruby GC at play.

I am open to alternatives and suggestions. Right now I am trying to figure out the tipping points of Shoes. Things that are abusive enough to cause disruption in Shoes but not enough to have it show a blank window. Later on we can come up with a clean set of stress tests.

Would you prefer caching random values in a YAML file then load up in an array? It's difficult to abuse Shoes without randomness but this approach could work. I could otherwise set a number of iterations for each operation. It is also possible to create animate-less code but it will show a different aspect, namely startup and setting UI. Sometimes Shoes takes some time to show up. That could be interesting.
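
The YAML idea could look roughly like this. It is a sketch under assumptions: load_or_create_moves, the file name and the seed are all hypothetical, and the values are the +1/-1 steps used by the Brownian motion test. The file is generated once and reused afterwards, so every run replays the same sequence.

```ruby
require "yaml"
require "tmpdir"

# Hypothetical helper: returns `count` [dx, dy] steps, each -1 or +1.
# The steps are generated once, saved as YAML, and reloaded on later runs.
def load_or_create_moves(path, count:, seed: 42)
  unless File.exist?(path)
    rng = Random.new(seed)
    moves = Array.new(count) do
      [[-1, 1].sample(random: rng), [-1, 1].sample(random: rng)]
    end
    File.write(path, moves.to_yaml)
  end
  YAML.load_file(path)
end

# Demonstration in a temporary directory so nothing is left behind.
Dir.mktmpdir do |dir|
  path = File.join(dir, "moves.yml")   # illustrative file name
  first  = load_or_create_moves(path, count: 5)
  second = load_or_create_moves(path, count: 5)
  puts "same sequence on reload: #{first == second}"
end
```

Inside animate, the test would then cycle through the loaded array instead of calling rand, which makes two profiling runs directly comparable.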

All the cool things in Shoes go through animate. Eventually we need to fix this. What a coincidence that is issue #1 !

Running slot_manipulation for 5 minutes

image image image

IanTrudel commented 6 years ago

The profiler is missing a few features. It would be good to have the time the profiler ran. Perhaps even the ability to set a timer to auto-stop the profiler. Also, we should consider a GUI-less profiler (similar to the packing app) where we could write a main app for stress tests or something.

ccoupe commented 6 years ago

It's just Ruby and most of that is just display/reporting - hack away. There is a $ shoes -e <script_to_profile> option

ccoupe commented 6 years ago

The terminal view is much more useful for decision making. Be aware, Windows/Ruby doesn't provide cpu-time so you can't make decisions based on that - just clock time for Windows.
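
To make the clock-time versus cpu-time distinction concrete, here is a small sketch in plain Ruby. The measure and cpu_clock helpers are hypothetical names; whether a process CPU clock is available depends on the platform and Ruby build, so the sketch reports nil for CPU time when it cannot be read.

```ruby
# Hypothetical helper: process CPU time in seconds, or nil where the
# platform does not provide such a clock.
def cpu_clock
  Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)
rescue Errno::EINVAL, NameError
  nil
end

# Hypothetical helper: runs the block and returns wall-clock and CPU seconds.
def measure
  wall_start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  cpu_start = cpu_clock
  yield
  cpu_end = cpu_clock
  wall = Process.clock_gettime(Process::CLOCK_MONOTONIC) - wall_start
  cpu = cpu_start && cpu_end ? cpu_end - cpu_start : nil
  { wall: wall, cpu: cpu }
end

waiting = measure { sleep 0.2 }                          # mostly idle
working = measure { 500_000.times { |i| Math.sqrt(i) } } # mostly computing

puts "waiting: #{waiting.inspect}"
puts "working: #{working.inspect}"
```

Where CPU time is available, the waiting case shows a large gap between the two numbers and the working case a small one, which is the information lost when only clock time can be reported.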

IanTrudel commented 6 years ago

Thanks for the feedback.

dredknight commented 6 years ago

Hello everyone,

I am not sure how the radio button is made but it generates an absurd amount of draws. Here is some test code. Initially it was a more complicated test (10000 iterations and some paras) but the startup time and profiling numbers were through the roof, so I went with just 100.

Shoes.app do

    def set_checkers stuff, place
        stuff.each_with_index do |x, i|
            place.append do
                flow do 
                    radio :item;
                end
            end
        end
    end

    ITEMS = Array.new( 100, 1 )
    desktop = stack left: 20, top: 40, width: 200;
    set_checkers ITEMS, desktop
end

image

While experimenting on how to make things worse for shoes I kind of broke it #391

IanTrudel commented 6 years ago

So far it seems that draw is being called an awful lot of times in just about everything. Flickerfest!

dredknight commented 6 years ago

Btw how does Shoes draw things exactly? If it draws them dot by dot it kind of makes sense to call the function that many times.

IanTrudel commented 6 years ago

Btw how does Shoes draw things exactly? If it draws them dot by dot it kind of makes sense to call the function that many times.

There are some parts of Shoes that I do not fully understand. @ccoupe knows best.

One thing is for sure: we are pushing Shoes to the limits in this thread and we are going to uncover a lot of defects. :)

dredknight commented 6 years ago

Holy moly, I found something!!! Why are there so many draws when nothing is drawn 1000 times?

Shoes.app do    
    ITEMS = Array.new( 1000, 1 )
    ITEMS.each_with_index do |x, i|
            flow;
    end
end

image

IanTrudel commented 6 years ago

I'm getting a little over 3000 draw calls on an empty Shoes.app run for one minute. It's probably not all that bad to be called so many times but it doesn't hurt to investigate and make sure.

IanTrudel commented 6 years ago

3000 draw in 60 seconds means 50 FPS, which is fair (could be a bit faster) and would explain the number of calls on an empty Shoes.app.

ccoupe commented 6 years ago

Why are there so many draws when nothing is drawn 1000 times?

Each flow has an internal canvas (drawing surface) and it has a background to possibly draw. Add the second flow and now you have 2 canvases in the default slot, so their positions have to be computed and drawn. So, we have done 3 draws already. Add a third flow and we have 3+2+1 total draws. Then 4+3+2+1 in total. You can do the math for your thousand slots.

Counts need context - how fast are those draws (in cpu time and clock time) and the knowledge of what your test script is really testing.
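
Doing that math: following the pattern described above (1, then 2+1, then 3+2+1), the total after n flows is the triangular number n(n+1)/2. The sketch below assumes that every newly added flow triggers one draw for each flow present at that point; that assumption comes from the description here, not from reading the Shoes source, and the method names are illustrative.

```ruby
# Closed form for 1 + 2 + ... + flows.
def total_draws(flows)
  flows * (flows + 1) / 2
end

# Same total by simulation: adding a flow draws every flow present so far.
def simulated_draws(flows)
  draws = 0
  flows.times { |already_present| draws += already_present + 1 }
  draws
end

puts total_draws(3)     # 6, the 3+2+1 case above
puts total_draws(1000)  # 500500
```

The quadratic growth explains why 1000 empty flows already produce draw counts in the hundreds of thousands, even though nothing visible is drawn.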

dredknight commented 6 years ago

Can you tell me where I can look for what makes the drawing in Shoes? I want to poke inside. May be some ideas will materialize.