In a high-contention situation (96 threads, many small files) I am getting the following output when downloading files via AWS/AWSS3.jl:
┌ Error: curl_multi_socket_action: 8
└ @ Downloads.Curl ~/.julia/juliaup/julia-1.9.4+0.x64.linux.gnu/share/julia/stdlib/v1.9/Downloads/src/Curl/utils.jl:57
[... the two lines above repeated 14 more times ...]
ERROR: TaskFailedException
Stacktrace:
[1] wait
@ ./task.jl:349 [inlined]
[2] fetch
@ ./task.jl:369 [inlined]
[3] fetch
@ ~/.julia/packages/StableTasks/3CrzR/src/internals.jl:9 [inlined]
[4] macro expansion
@ ./reduce.jl:260 [inlined]
[5] macro expansion
@ ./simdloop.jl:77 [inlined]
[6] mapreduce_impl(f::typeof(fetch), op::OhMyThreads.Implementation.var"#99#100", A::Vector{StableTasks.StableTask{Nothing}}, ifirst::Int64, ilast::Int64, blksize::Int64)
@ Base ./reduce.jl:258
[7] mapreduce_impl
@ ./reduce.jl:272 [inlined]
[8] _mapreduce(f::typeof(fetch), op::OhMyThreads.Implementation.var"#99#100", #unused#::IndexLinear, A::Vector{StableTasks.StableTask{Nothing}})
@ Base ./reduce.jl:442
[9] _mapreduce_dim(f::Function, op::Function, #unused#::Base._InitialValue, A::Vector{StableTasks.StableTask{Nothing}}, #unused#::Colon)
@ Base ./reducedim.jl:365
[10] #mapreduce#801
@ ./reducedim.jl:357 [inlined]
[11] mapreduce(f::Function, op::Function, A::Vector{StableTasks.StableTask{Nothing}})
@ Base ./reducedim.jl:357
[12] _tmapreduce(f::Function, op::Function, Arrs::Tuple{Vector{AWSS3.S3Path{Nothing}}, Vector{DataFrames.AbstractDataFrame}}, #unused#::Type{Nothing}, scheduler::OhMyThreads.Schedulers.DynamicScheduler{OhMyThreads.Schedulers.FixedCount}, mapreduce_kwargs::NamedTuple{(:init,), Tuple{Nothing}})
@ OhMyThreads.Implementation ~/.julia/packages/OhMyThreads/V13wc/src/implementation.jl:96
[13] tmapreduce(::Function, ::Function, ::Vector{AWSS3.S3Path{Nothing}}, ::Vararg{Any}; scheduler::OhMyThreads.Schedulers.NotGiven, outputtype::Type, init::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ OhMyThreads.Implementation ~/.julia/packages/OhMyThreads/V13wc/src/implementation.jl:68
[14] tforeach(::Function, ::Vector{AWSS3.S3Path{Nothing}}, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
@ OhMyThreads.Implementation ~/.julia/packages/OhMyThreads/V13wc/src/implementation.jl:294
[15] tforeach
@ ~/.julia/packages/OhMyThreads/V13wc/src/implementation.jl:293 [inlined]
...more stacktrace...
nested task error: AWS.AWSExceptions.AWSException: RequestTimeout -- Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.
HTTP.Exceptions.StatusError(400, "PUT", "...[elided]...", HTTP.Messages.Response:
"""
HTTP/1.1 400 Bad Request
x-amz-request-id: N1J1KZKTENGH5DNK
x-amz-id-2: l3Wg7BNYvxe4fLCWkK12DYVBaFK1USQHL6rGjzTdGbNnU7LnhF2TWL/XDjpjUjZvJAXVKvPnc+4=
content-type: application/xml
transfer-encoding: chunked
date: Wed, 29 May 2024 15:34:22 GMT
server: AmazonS3
connection: close
[Message Body was streamed]""")
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>RequestTimeout</Code><Message>Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.</Message><RequestId>N1J1KZKTENGH5DNK</RequestId><HostId>l3Wg7BNYvxe4fLCWkK12DYVBaFK1USQHL6rGjzTdGbNnU7LnhF2TWL/XDjpjUjZvJAXVKvPnc+4=</HostId></Error>
Unfortunately the code and some parts of the stack trace include proprietary info that I cannot post publicly. However, at a guess, a simple use of Threads.@spawn over many files uploaded/downloaded via AWS.jl/AWSS3.jl (or even Downloads.jl directly), with many threads running concurrently, would also trigger this, since the step where the failure occurs is not very complicated. I will try to create an MWE when I have the bandwidth.
Of course, I quickly learned that one can easily work around this via asyncmap etc., but it would be nice if that weren't necessary.
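For reference, here is a minimal offline sketch of the two access patterns involved. `fake_download` is a hypothetical stand-in for the real AWSS3.jl / Downloads.jl call (the actual failure of course only reproduces with real network traffic), so this only illustrates the task structure, not the bug itself:

```julia
# `fake_download` is a hypothetical stand-in for the real
# AWSS3.jl / Downloads.jl request; sleep() simulates the I/O wait.
function fake_download(path)
    sleep(0.001)
    return path
end

paths = ["file$i" for i in 1:64]

# Pattern that hit the error: one task per file, all in flight at once.
tasks = [Threads.@spawn fake_download(p) for p in paths]
naive = fetch.(tasks)

# Workaround: asyncmap caps the number of concurrent requests via
# `ntasks`, so far fewer handles are live at any moment.
capped = asyncmap(fake_download, paths; ntasks=8)

@assert naive == capped == paths
```

With the real download call substituted in, the first pattern launches one task per file across all threads, while the `asyncmap` version bounds the in-flight request count, which is presumably why it avoids the `curl_multi_socket_action` errors.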