JuliaPluto / PlutoUI.jl


High Memory Usage with High Resolution Slider #212

Closed: Firionus closed this issue 1 year ago

Firionus commented 2 years ago

When creating a slider with inappropriately high resolution, the memory usage becomes very high, slowing down the computer or even killing the Julia process.

Steps to Reproduce

  1. Open Task Manager, htop, or a similar tool and sort the processes by RAM usage.
  2. Create a new Pluto notebook with the code

     begin
         using PlutoUI
         @bind x Slider(range(0, 1, length=10^8))
     end

  3. Observe the memory usage of the Julia process going up massively; on my PC it reached around 12 GB before the process was killed.

Expected Behavior

  1. High resolutions should not end up killing the process. Is there a good reason why this much allocation, proportional to the slider resolution, is needed when a range already specifies the possible values?
  2. An option for a maximum slider resolution would be most helpful. The HTML slider seems to support this with the "any" step size (https://developer.mozilla.org/en-US/docs/Web/HTML/Element/input/range#setting_step_to_any).
  3. In the meantime, better guidance on how to choose an appropriate resolution for "continuous" variables should be given in examples and documentation. It seems to me that a length roughly corresponding to typical screen resolutions, like 10^3 or 2^11 = 2048, would be a good starting point (see the sketch after this list).
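For instance, a minimal sketch of the kind of example such guidance could give (the value 1001 here is an arbitrary choice on the order of a screen width):

begin
    using PlutoUI
    @bind x Slider(range(0, 1, length=1001))
end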

A word on what my use case was: I was toying around with the plot of an ODE solution and wanted to continuously vary a parameter and some boundary conditions. Without much thinking, I set the slider step size to some arbitrarily small number like 1e-6 and had no idea why my notebook was suddenly running so slowly.

System Info

Pluto v0.19.5
PlutoUI v0.7.39
julia> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)
$ lsb_release -d
Description:    Ubuntu 21.10
Firionus commented 2 years ago

Unfortunately, the high memory usage is flaky for me :bowl_with_spoon: :disappointed: I reproduced this multiple times on the day I opened the issue, but now I can't. Same versions of everything.

Firionus commented 2 years ago

I toyed around with it some more, and I can now reproduce the issue when I add show_value=true:

begin
    using PlutoUI
    @bind x Slider(range(0, 1, length=10^8), show_value=true)
end

Executing the cell quickly leads to an out of memory condition and the notebook process is killed.

The Pluto terminal shows the following error afterwards:

Worker 2 terminated.
Unhandled Task ERROR: EOFError: read end of file
Stacktrace:
 [1] (::Base.var"#wait_locked#645")(s::Sockets.TCPSocket, buf::IOBuffer, nb::Int64)
   @ Base ./stream.jl:892
 [2] unsafe_read(s::Sockets.TCPSocket, p::Ptr{UInt8}, nb::UInt64)
   @ Base ./stream.jl:900
 [3] unsafe_read
   @ ./io.jl:724 [inlined]
 [4] unsafe_read(s::Sockets.TCPSocket, p::Base.RefValue{NTuple{4, Int64}}, n::Int64)
   @ Base ./io.jl:723
 [5] read!
   @ ./io.jl:725 [inlined]
 [6] deserialize_hdr_raw
   @ ~/.julia/juliaup/julia-1.7.2+0~x64/share/julia/stdlib/v1.7/Distributed/src/messages.jl:167 [inlined]
 [7] message_handler_loop(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed ~/.julia/juliaup/julia-1.7.2+0~x64/share/julia/stdlib/v1.7/Distributed/src/process_messages.jl:165
 [8] process_tcp_streams(r_stream::Sockets.TCPSocket, w_stream::Sockets.TCPSocket, incoming::Bool)
   @ Distributed ~/.julia/juliaup/julia-1.7.2+0~x64/share/julia/stdlib/v1.7/Distributed/src/process_messages.jl:126
 [9] (::Distributed.var"#99#100"{Sockets.TCPSocket, Sockets.TCPSocket, Bool})()
   @ Distributed ./task.jl:423
Distributed.ProcessExitedException(2)
Stacktrace:
  [1] worker_from_id(pg::Distributed.ProcessGroup, i::Int64)
    @ Distributed ~/.julia/juliaup/julia-1.7.2+0~x64/share/julia/stdlib/v1.7/Distributed/src/cluster.jl:1089
  [2] worker_from_id
    @ ~/.julia/juliaup/julia-1.7.2+0~x64/share/julia/stdlib/v1.7/Distributed/src/cluster.jl:1086 [inlined]
  [3] #remotecall_fetch#158
    @ ~/.julia/juliaup/julia-1.7.2+0~x64/share/julia/stdlib/v1.7/Distributed/src/remotecall.jl:496 [inlined]
  [4] remotecall_fetch
    @ ~/.julia/juliaup/julia-1.7.2+0~x64/share/julia/stdlib/v1.7/Distributed/src/remotecall.jl:496 [inlined]
  [5] remotecall_eval
    @ ~/.julia/juliaup/julia-1.7.2+0~x64/share/julia/stdlib/v1.7/Distributed/src/macros.jl:242 [inlined]
  [6] collect_soft_definitions(session_notebook::Tuple{Pluto.ServerSession, Pluto.Notebook}, modules::Set{Expr})
    @ Pluto.WorkspaceManager ~/.julia/packages/Pluto/AgKPu/src/evaluation/WorkspaceManager.jl:476
  [7] run_reactive_core!(session::Pluto.ServerSession, notebook::Pluto.Notebook, old_topology::Pluto.NotebookTopology, new_topology::Pluto.NotebookTopology, roots::Vector{Pluto.Cell}; deletion_hook::Function, user_requested_run::Bool, already_run::Vector{Pluto.Cell}, bond_value_pairs::Base.Iterators.Zip{Tuple{Vector{Symbol}, Vector{Any}}})
    @ Pluto ~/.julia/packages/Pluto/AgKPu/src/evaluation/Run.jl:216
  [8] run_reactive_core!
    @ ~/.julia/packages/Pluto/AgKPu/src/evaluation/Run.jl:70 [inlined]
  [9] (::Pluto.var"#255#259"{Bool, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Pluto.ServerSession, Pluto.Notebook})()
    @ Pluto ~/.julia/packages/Pluto/AgKPu/src/evaluation/Run.jl:591
 [10] withtoken(f::Pluto.var"#255#259"{Bool, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Pluto.ServerSession, Pluto.Notebook}, token::Pluto.Token)
    @ Pluto ~/.julia/packages/Pluto/AgKPu/src/evaluation/Tokens.jl:19
 [11] #254
    @ ~/.julia/packages/Pluto/AgKPu/src/evaluation/Run.jl:587 [inlined]
 [12] macro expansion
    @ ~/.julia/packages/Pluto/AgKPu/src/evaluation/Tokens.jl:58 [inlined]
 [13] (::Pluto.var"#225#226"{Pluto.var"#254#258"{Bool, Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Pluto.ServerSession, Pluto.Notebook}})()
    @ Pluto ./task.jl:423

But I guess that's just a result of the process shutting down, not the cause of the high memory usage.

fonsp commented 2 years ago

Thanks! This is indeed caused by show_value=true, because we call string.(values) and send those rendered values to JavaScript to display: https://github.com/JuliaPluto/PlutoUI.jl/blob/47167739ec772924469d3528977e8d6a688b3319/src/Builtins.jl#L192

This is how we support something like Slider([sin, cos, tan]; show_value=true).

I think the best solution here is to resample the slider values on the Julia side. E.g. if there are 1,000,000 possible values, then it is fine to just slide over every 1000th value. That would also fix #215.
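A rough sketch of that idea (not the actual implementation), assuming values holds the slider's possible values:

# keep at most ~1000 of the original values by taking every k-th one
k = cld(length(values), 1000)
coarse_values = values[begin:k:end]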

fonsp commented 2 years ago

It would be helpful if someone wrote a function that does the following 💕

function downsample(x::AbstractVector, max_size::Integer)
    return something...
end

using Test

x1 = [1,2,3]
x2 = rand(500)

@test downsample(x1, 3) == [1,2,3]
@test downsample(x1, 30) == [1,2,3]
@test downsample(x1, 2) == [1,3]

@test downsample(x2, 500) == x2
y2 = downsample(x2, 400)
@test 250 <= length(y2) <= 400
@test y2[begin] == x2[begin]
@test y2[end] == x2[end] 

x3 = rand(50_000_000)
# this should take less than 0.1ms
downsample(x3, 100)
wilsongoode commented 1 year ago

@fonsp I created a Pluto notebook with my code for a downsample(x::AbstractVector, max_size::Integer) function and all the tests pass. What are my next steps for contributing this code? I'm unclear where in PlutoUI's source this function would live, but I assume I will need to fork PlutoUI in order to add this function and create a pull request. Is that correct?

Firionus commented 1 year ago

@wilsongoode That is great to hear! I've actually also been working on an algorithm, but progress has been slow because I don't know much about discrete math and other things in my life have come up. I've got a little interactive notebook of my version sitting here: https://gist.github.com/Firionus/ccd45b2daf8d305cd6f1ee9e4535e9e5

While it mostly does what it should, it sometimes misses better solutions that would fit within the output length budget because it does not try out larger step sizes (see the "Problematic Examples" section and the TODO in the function). Would you mind sharing your solution in a Gist as well so we can compare our approaches?

wilsongoode commented 1 year ago

@Firionus Your algorithm is much more involved than what I created. I'd say I took the naive approach: https://gist.github.com/wilsongoode/3fd25d5f3a731ecb1037bd4098eb1645

Firionus commented 1 year ago

Wow! That is an incredibly concise nearest-neighbor algorithm; I'm impressed.

I've been thinking some more about my approach. The basic idea is that when you downsample from an array of length N to something of at most length n_max, you can sometimes achieve a really even downsampling, where almost all downsampling steps are the same size, by choosing an output length somewhat smaller than n_max.
For example, when length(x) == 50_000_000 and n_max = 100, you can downsample to a length of 100 with a nearest-neighbor algorithm like yours, but you'll get constant switching between steps of size 505050 and steps of size 505051. Wouldn't it be nice if all steps were (nearly) the same size, so the output stays close to linear for a linear input? That would be possible with an output size of n = 81, resulting in 79 steps of size 625000 and only 1 step in the middle of size 624999.

The problem is that I'm not sure how to find these "nice" numbers quickly in all cases. A problem statement might look something like this:

Given integers N >= 2, n_max >= 1, n_min >= 1 with N > n_max >= n_min, find a tuple of non-negative integers (n_s, n_l, s) subject to s >= 1, N = n_s*s + n_l*(s+1), and n_min <= n_s + n_l <= n_max, such that either n_s or n_l is minimized. Among multiple solutions of the same merit, prefer one where n_s + n_l is maximized.

Here N = length(x) - 1 (the number of steps in the input) and n_max is the maximum number of steps in the output (so one less than the max_size in Fons's problem statement). n_s is the number of small steps and n_l the number of large steps. s is the size of a small step; the large steps are one bigger.
Unfortunately I just don't have the chops in integer optimization to solve this properly, but I have a feeling a general solution that is fast in all cases would be hard to find.
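Purely to make that problem statement concrete, here is a brute-force sketch (not proposed for the code base; it reads "either n_s or n_l is minimized" as minimizing min(n_s, n_l), the lower bound of 50 steps in the example call is an arbitrary choice, and whether this captures the intended behaviour in all cases is exactly the open question):

# For each allowed number of output steps n, split the N input steps into n_s steps
# of size s and n_l steps of size s + 1. Keep the split with the fewest
# minority-sized steps, preferring larger n on ties.
function even_split(N::Integer, n_max::Integer, n_min::Integer)
    best = nothing
    for n in n_max:-1:n_min
        s, n_l = divrem(N, n)            # N == (n - n_l)*s + n_l*(s + 1)
        n_s = n - n_l
        if best === nothing || min(n_s, n_l) < min(best.n_s, best.n_l)
            best = (n_s = n_s, n_l = n_l, s = s)
        end
    end
    return best
end

even_split(49_999_999, 99, 50)  # (n_s = 1, n_l = 79, s = 624999), i.e. 81 output values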

Therefore, I'd suggest we stick with a simple nearest neighbor algorithm for now where the output length is just n_max. That way, we don't end up with half-baked solutions in the code base that no one understands.

Finding a simpler solution

As I said, I really like the brevity of @wilsongoode's solution, but it is about half as fast as what I currently have. The main issue is that round.(...) allocates an array for the indices. This is easily fixed by switching to an array comprehension, which brings performance to the level of my solution with much less code:

[screenshot: benchmark comparing the round.(...) version with the comprehension version]
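(The screenshot itself is not reproduced here. A sketch of the kind of comparison it showed, using BenchmarkTools, might look like the following; the function names are made up for the comparison.)

using BenchmarkTools

# broadcasting round.() materializes a full index vector before indexing
downsample_broadcast(x, n) =
    x[round.(Int, range(firstindex(x), lastindex(x), length=n))]

# the comprehension rounds one index at a time, allocating only the output
downsample_comprehension(x, n) =
    [x[round(Int, i)] for i in range(firstindex(x), lastindex(x), length=n)]

x = rand(50_000_000)
@btime downsample_broadcast($x, 100)
@btime downsample_comprehension($x, 100)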

So, my suggestion based on @wilsongoode's solution is the following:

function downsample(x::AbstractVector, max_size::Integer)
    if max_size >= length(x)
        x
    else
        [x[round(Int, i)] for i in range(firstindex(x), lastindex(x), length=max_size)]
    end
end

How to contribute

As for contributing this change, I think we just need to put something like values = downsample(values, max_size) before https://github.com/JuliaPluto/PlutoUI.jl/blob/47167739ec772924469d3528977e8d6a688b3319/src/Builtins.jl#L161

I guess max_size should be configurable in the Slider constructor and documented. A default of 1_000 wouldn't be completely wrong. A test that hits the downsampling in CI would probably be a good idea as well; for example, I can't test right now whether the range syntax I used is supported on Julia 1.4. The suggested tests from fons should also be put somewhere. Unfortunately, I have no idea right now how tests are handled in this package, so you're going to have to look around a bit.
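For what it's worth, a minimal sketch of such a test, exercising only the downsample function itself (where test files live in this package is exactly the open question above):

using Test

@testset "downsample" begin
    @test downsample([1, 2, 3], 2) == [1, 3]          # shrink when over budget
    @test downsample(0:0.1:1, 100) == 0:0.1:1         # short inputs pass through unchanged
    y = downsample(range(0, 1, length=10^6), 1_000)   # hit the downsampling path
    @test length(y) == 1_000
    @test first(y) == 0 && last(y) == 1               # endpoints are preserved
end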

I assume I will need to fork PlutoUI in order to add this function and create a pull request. Is that correct?

Yeah. I'd leave that to you, @wilsongoode, if that is okay? Feel free to open it as a draft and mention this issue if you want me to take a look at your code.