Open mario-amazing opened 1 year ago
I'd say try running without yjit and see if that is better.
Otherwise reproduce in a minimal example to see if it's really parallel causing that or just fork.
@grosser, without yjit I have the same error.
/app/vendor/bundle/ruby/3.2.0/gems/parallel-1.23.0/lib/parallel.rb:568: [BUG] Segmentation fault at 0x0000000000000010
ruby 3.2.2 (2023-03-30 revision e51014f9c0) [x86_64-linux-musl]
-- Control frame information -----------------------------------------------
c:0078 p:---- s:0432 e:000431 CFUNC :fork
c:0077 p:0027 s:0428 e:000427 METHOD /app/vendor/bundle/ruby/3.2.0/gems/parallel-1.23.0/lib/parallel.rb:568
c:0076 p:0019 s:0416 e:000414 BLOCK /app/vendor/bundle/ruby/3.2.0/gems/parallel-1.23.0/lib/parallel.rb:559 [FINISH]
c:0075 p:---- s:0410 e:000409 IFUNC
c:0074 p:---- s:0407 e:000406 CFUNC :each
c:0073 p:---- s:0404 e:000403 CFUNC :each_with_index
c:0072 p:0017 s:0400 e:000399 METHOD /app/vendor/bundle/ruby/3.2.0/gems/parallel-1.23.0/lib/parallel.rb:558
c:0071 p:0011 s:0392 e:000391 METHOD /app/vendor/bundle/ruby/3.2.0/gems/parallel-1.23.0/lib/parallel.rb:497
c:0070 p:0317 s:0381 e:000380 METHOD /app/vendor/bundle/ruby/3.2.0/gems/parallel-1.23.0/lib/parallel.rb:291
c:0069 p:0019 s:0370 e:000369 METHOD /app/vendor/bundle/ruby/3.2.0/gems/parallel-1.23.0/lib/parallel.rb:235
With threads everything is fine
{ in_threads: Etc.nprocessors }
Try running the same code with just fork to see if this is a ruby bug or a parallel bug, and try to build a minimal example so it's easy to reproduce/debug.
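A fork-only stress loop along these lines might be a starting point for such a minimal example. It is only a sketch: `work` and `fork_map` are hypothetical names, the real job body would replace `work`, and the iteration count is arbitrary.

```ruby
# Fork-only stress loop, no parallel gem involved.
# `work` is a hypothetical stand-in for the real job body.
def work(item)
  item * 2
end

# Run `work` for each item in its own forked child and send the
# result back to the parent through a pipe.
def fork_map(items)
  children = items.map do |item|
    reader, writer = IO.pipe
    reader.binmode
    writer.binmode
    pid = fork do
      reader.close
      writer.write(Marshal.dump(work(item)))
      writer.close
      exit!(0) # skip at_exit hooks inherited from the parent
    end
    writer.close
    [pid, reader]
  end

  children.map do |pid, reader|
    payload = reader.read
    reader.close
    _, status = Process.wait2(pid)
    raise "child #{pid} died: #{status.inspect}" unless status.success?
    Marshal.load(payload)
  end
end

iterations = 20 # arbitrary; raise it when hunting a rare crash
results = nil
iterations.times { results = fork_map([1, 2, 3, 4]) }
```

If this loop also crashes under the production image, parallel is probably not the cause. Calling `fork_map` from several threads at once would be closer to what happens inside sidekiq.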
I've tried parallel's in_processes and fork in the console. Everything is fine. But this issue is rare on the production server and I don't know how to catch it; all I get are the errors in the log file.
It looks like this is running in sidekiq, which means it's also running in parallel threads already, right?
Maybe a solution is to use Thread.exclusive to eliminate that?
... Another idea would be to "dumb down" whatever it does inside: instead of writing to disk/db, just return a string and then write that in serial after the block. Basically, reduce the block to the essential slow part and do the rest outside.
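As a rough illustration of that idea, the workers below only compute and return strings, and all writing happens serially afterwards. `slow_part` and `parallel_map` are hypothetical: `parallel_map` is a thread-based stand-in so the sketch needs no gems, and the temp file stands in for the real disk/db write.

```ruby
require "tempfile"

# Hypothetical stand-in for the essential slow part of the job.
def slow_part(id)
  "record-#{id}"
end

# Hypothetical stand-in for Parallel.map; keeps input order.
def parallel_map(items, &block)
  items.map { |item| Thread.new { block.call(item) } }.map(&:value)
end

ids = [1, 2, 3]

# Inside the block: nothing but the slow part, returning plain strings.
lines = parallel_map(ids) { |id| slow_part(id) }

# Outside the block: all writing is done serially by a single caller.
output = Tempfile.new("results")
lines.each { |line| output.puts(line) }
output.flush
written = File.read(output.path)
output.close!
```

Keeping the block free of side effects also makes it easier to swap threads for processes later, since only plain strings cross the worker boundary.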
This is a separate sidekiq docker container. Inside the parallel execution there is an external API request and handling of the data from it.
maybe the api requests could be done in parallel threads and then data processing in forks
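A two-phase version of that split could look roughly like the following. `fetch` and `process` are hypothetical placeholders for the real API call and the data handling, and whether this avoids the crash is untested.

```ruby
# Hypothetical stand-ins for the API request and the CPU-heavy step.
def fetch(id)
  "payload-#{id}"
end

def process(payload)
  payload.upcase
end

ids = [1, 2, 3]

# Phase 1: I/O-bound requests in threads; all threads are joined
# (via #value) before any fork happens.
payloads = ids.map { |id| Thread.new { fetch(id) } }.map(&:value)

# Phase 2: CPU-bound processing in forked children, results via pipes.
jobs = payloads.map do |payload|
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write(process(payload))
    writer.close
    exit!(0) # skip at_exit hooks inherited from the parent
  end
  writer.close
  [pid, reader]
end

processed = jobs.map do |pid, reader|
  result = reader.read
  reader.close
  Process.wait(pid)
  result
end
```

Inside sidekiq the other worker threads would still be alive at fork time, so this only removes the threads the job itself started.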
> it looks like this is running in sidekiq, which means it's also running in parallel threads already, right?
I'm also seeing this from time to time, but wasn't successful yet to cleanly reproduce this.
> maybe a solution is to use Thread.exclusive to eliminate that?
I think Thread.exclusive was deprecated a long time ago and has since been removed (at least in recent Rubies it is no longer available). Where would you apply a mutex exactly?
In our case, we are using processes, as the task we're doing is pretty CPU intensive and benefits greatly from true parallelism. I suspect that some combination of Sidekiq's threads + forking from parallel triggers an issue in some cases, especially if there are multiple sidekiq threads doing Parallel.map(in_processes: …). This also seems to cause parallel's processes to hang sometimes, but again, I wasn't able to reproduce this outside of production, and troubleshooting this is pretty messy currently 😢
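On the question of where a mutex would go: one possible placement, purely as an assumption, is a single process-wide lock around the section that forks, so that only one thread forks at a time. In the sketch below `run_in_fork` is a hypothetical stand-in for the forking call, and the threads simulate sidekiq workers.

```ruby
# One process-wide lock shared by every thread that wants to fork.
fork_lock = Mutex.new

# Hypothetical CPU-heavy job run in a forked child; the result comes
# back to the parent through a pipe.
def run_in_fork(n)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write((n * n).to_s)
    writer.close
    exit!(0) # skip at_exit hooks inherited from the parent
  end
  writer.close
  result = reader.read
  reader.close
  Process.wait(pid)
  Integer(result)
end

# Simulated worker threads: each takes the lock before forking.
threads = [1, 2, 3, 4].map do |n|
  Thread.new { fork_lock.synchronize { run_in_fork(n) } }
end
squares = threads.map(&:value)
```

In the real job, the call to parallel would sit inside `synchronize` in place of `run_in_fork`. The cost is that jobs in the same sidekiq process no longer fork concurrently, and it does nothing about threads that are busy with other work at fork time.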
I have a segmentation fault error in the sidekiq container:
Docker image:
FROM ruby:3.2.2-alpine
Any ideas how to fix it?