OxygenFramework / Oxygen.jl

💨 A breath of fresh air for programming web apps in Julia
https://oxygenframework.github.io/Oxygen.jl/
MIT License

Unexpected multithreading behavior #207

Open t0ralf opened 1 week ago

t0ralf commented 1 week ago

I'm not sure if this is a bug or expected behavior. If the latter is the case I would like to hear an explanation.

Consider the following sample code:


using Oxygen
using HTTP

example_vector = rand(1:100,500000000)

function single_thread_calc(v)
  s = 0.0
  for i = 1:length(v)
      s += sin(v[i])
  end
  return s
end

@get "/calc" function(req::HTTP.Request) 
    @show Threads.threadid()
    sum=single_thread_calc(example_vector)
    return "hello, your sum is $sum"
end

serveparallel()

Executing this endpoint takes about 7 s on my PC (Windows 11, 8 cores, 16 threads, Julia 1.10.3, Oxygen 1.5.11, HTTP 1.10.8).

Now, if I run this code in a Julia session with 4 threads assigned (i.e. julia --threads 4) and call the endpoint 4 times in quick succession (for example via Swagger in 4 different browser tabs), I would expect the 4 requests to execute in parallel. Instead, the first request runs to completion on its own, and only then are the other 3 requests run in parallel. This is repeatable. Why does the server always wait for the first request to finish before it starts executing in parallel? This would be a problem in a production scenario with a decent amount of traffic.
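To make runs comparable, it may help to print the thread configuration of the session alongside the per-request thread id. A minimal sketch, assuming Julia 1.9 or newer (where thread pools exist); thread_report is a hypothetical helper name, not part of Oxygen or HTTP:

```julia
# Summarise the thread configuration of the current Julia session.
# `thread_report` is a hypothetical helper used only for illustration.
function thread_report()
    return (
        default_threads     = Threads.nthreads(:default),      # worker threads in the default pool
        interactive_threads = Threads.nthreads(:interactive),  # threads in the interactive pool
        current_thread      = Threads.threadid(),              # id of the thread running this call
        current_pool        = Threads.threadpool(),            # pool of the thread running this call
    )
end

println(thread_report())
```

Calling thread_report() once at startup and once inside the handler would show whether the handler body and the main task share a thread.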

Thanks for the help!

ndortega commented 1 week ago

Hi @t0ralf,

Thanks for the sample. I've noticed that the behavior depends on how you test it: I personally saw different behavior when using a load testing tool than when testing manually.

The code to support multithreading is so embarrassingly simple that I'm not sure where it could be going wrong. I have a nagging suspicion that it might just be the scheduler doing its thing - but I'm not 100% sure.

Regardless, this isn't the intended behavior. Below is the code where all the parallelism happens. Let me know if anything here stands out as weird or incorrect:

"""
    parallel_stream_handler(handle_stream::Function)

This function uses `Threads.@spawn` to schedule a new task on any available thread. 
Inside this task, `@async` is used for cooperative multitasking, allowing the task to yield during I/O operations. 
"""
function parallel_stream_handler(handle_stream::Function)
    function (stream::HTTP.Stream)
        task = Threads.@spawn begin
            handle = @async handle_stream(stream)
            wait(handle)
        end
        wait(task)
    end
end

t0ralf commented 1 week ago

Hi @ndortega,

thanks for your reply.

Can you tell me which load testing tool you used?

t0ralf commented 1 week ago

Hi again,

Given the implementation you posted, I started experimenting a little with "plain" Julia code (i.e. without any possible Oxygen overhead), and especially with the @async macro. I read this Stack Overflow thread, and the answer there made me wonder whether @async might be the "issue".

So, I wrote 2 scripts:

Threads.@spawn begin
  println("version 1") 
  single_thread_calc(example_vector)
  println("finish 1")
end 

Threads.@spawn begin
  println("version 2")
  single_thread_calc(example_vector)
  println("finish 2")
end

Threads.@spawn begin
  println("version 3")  
  single_thread_calc(example_vector)
  println("finish 3")
end 

as well as

Threads.@spawn begin
  println("version 1") 
  @async single_thread_calc(example_vector)
  println("finish 1")
end 

Threads.@spawn begin
  println("version 2")
  @async single_thread_calc(example_vector)
  println("finish 2")
end

Threads.@spawn begin
  println("version 3")  
  @async single_thread_calc(example_vector)
  println("finish 3")
end 

Now, running the first script shows expected multithreading behavior (at least it met my expectation) while the second script again showed what I consider weird multithreading behavior where the execution clearly waits for something to be finished before continuing.

Since I am a far cry from a Julia expert, I can't really explain what I'm seeing here. But maybe that is of help to anyone with a deeper knowledge of Julia than me.

ndortega commented 1 week ago

Thanks for creating these examples!

I'm also not a Julia expert (I just learn as I go), but I noticed that HTTP.jl also executes handlers with @async internally. If I had to guess, the nested @async calls are causing this weird scheduling behavior (between parent and child tasks).
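One way to probe the parent/child interaction in isolation: the Julia documentation for @async warns that using it disables migration of the parent task across worker threads in the current implementation. The sketch below inspects that from inside a spawned task. It reads the internal `sticky` field of `Task`, which is an implementation detail and may change between Julia versions; sticky_probe is a hypothetical helper name.

```julia
# Inspect how `@async` affects the task that calls it.
# NOTE: `sticky` is an internal `Task` field, read here only for inspection.
function sticky_probe()
    outer = Threads.@spawn begin
        before = current_task().sticky       # stickiness right after Threads.@spawn
        inner = @async Threads.threadid()    # child task created with @async
        inner_thread = fetch(inner)          # thread the child actually ran on
        (
            outer_sticky_before = before,
            outer_sticky_after  = current_task().sticky,
            inner_sticky        = inner.sticky,
            same_thread         = inner_thread == Threads.threadid(),
        )
    end
    return fetch(outer)
end

println(sticky_probe())
```

If the parent reports a different stickiness before and after the @async call, the spawned task can no longer migrate once it has started the child, which would be consistent with the warning in the docs. Whether that explains the behavior in this issue is still a guess.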

Check out this branch and let me know if the issue is still there. You can try out this branch in your app by adding the Oxygen dependency like this:

add https://github.com/OxygenFramework/Oxygen.jl.git#bugfix/weird-multithreading-behavior

t0ralf commented 5 days ago

Thanks for the branch.

Unfortunately, it does not solve the issue. I also have to backpedal on my previous post regarding the @async macro. When I change the code to

Threads.@spawn begin
  println("version 1")
  task = @async single_thread_calc(example_vector)
  wait(task)
  println("finish 1")
end

Threads.@spawn begin
  println("version 2")
  task = @async single_thread_calc(example_vector)
  wait(task)
  println("finish 2")
end

Threads.@spawn begin
  println("version 3")
  task = @async single_thread_calc(example_vector)
  wait(task)
  println("finish 3")
end

it yields the correct multithreading behavior.

So I took a more systematic approach and automated the API calls via a script. There I observed what I think you described in your first post: when I call the endpoint multiple times simultaneously, it executes the requests in a multithreaded manner, as I would expect. So no issues there. That also tells me that, at least in general, the multithreading implementation itself is not the issue here.

However, if, in the same script, I simulate the manual calling of the endpoint by introducing a small delay of 0.5s between each call, it then again behaves in the undesired way I described in my initial post. It never executes all calls in parallel like I would expect.
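The staggered scenario can also be approximated without HTTP at all. The sketch below is an in-process analogue, not the client script described above: it launches CPU-bound tasks with a delay between launches and records when each one actually started and on which thread. The names busy_sum and staggered_run, and the default sizes, are assumptions chosen for illustration.

```julia
# CPU-bound stand-in for the endpoint body; it never yields to the scheduler.
function busy_sum(n)
    s = 0.0
    for i in 1:n
        s += sin(i)
    end
    return s
end

# Launch `ncalls` tasks, sleeping `delay` seconds between launches,
# and report the thread id plus start/finish times (seconds since launch began).
function staggered_run(ncalls; delay = 0.5, n = 50_000_000)
    t0 = time()
    tasks = map(1:ncalls) do i
        i > 1 && sleep(delay)            # simulate the gap between manual calls
        Threads.@spawn begin
            started = time() - t0
            busy_sum(n)
            (call = i, thread = Threads.threadid(),
             started = started, finished = time() - t0)
        end
    end
    return fetch.(tasks)
end

for r in staggered_run(4)
    println(r)
end
```

If the recorded start times do not follow the requested delay, the launching task itself was held up, presumably because a non-yielding computation occupied its thread. Comparing a session started with julia --threads 4 against one started with julia --threads 4,1 (one interactive thread, available since Julia 1.9) might show whether that matters here; this is a suggestion for narrowing things down, not a confirmed diagnosis.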