JuliaParallel / ClusterManagers.jl

Error in `rmprocs` SGE #152

Closed by nantonel 3 years ago

nantonel commented 3 years ago

After successfully running some jobs on the grid, I call:

rmprocs(workers())

This produces a Warning: Forcibly interrupting busy workers, followed by an error.

The error appears to come from the kill function in qsub.jl:

function kill(manager::Union{PBSManager, SGEManager, QRSHManager}, id::Int64, config::WorkerConfig)
    remotecall(exit,id)
    close(get(config.io))

    kill(get(config.userdata)[:process],15)

    isa(manager, QRSHManager) && return

    if isfile(get(config.userdata)[:iofile])
        rm(get(config.userdata)[:iofile])
    end
end

It seems close(get(config.io)) is causing the problem, probably because calling get on the config fields is deprecated Julia syntax.

I fixed the problem using the following:

function kill(manager::Union{PBSManager, SGEManager, QRSHManager}, id::Int64, config::WorkerConfig)
    remotecall(exit,id)
    close(config.io)

    if isa(manager, QRSHManager)
        kill(config.userdata[:process],15)
        return
    end

    if isfile(config.userdata[:iofile])
        rm(config.userdata[:iofile])
    end
end

~Notice that I removed kill(get(config.userdata)[:process],15). I believe this is taken care of by close(config.io) as config.io is Process(tail -f path/to/julia-28290.o5670753.3, ProcessRunning).~ Edit: apparently that's related to QRSHManager.
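
The change in access pattern can be tried in isolation with the minimal sketch below. It assumes that in Julia 0.7 and later the WorkerConfig fields are plain values or nothing rather than Nullable wrappers, so they are read directly instead of through get. SketchConfig and cleanup! are hypothetical names used only for this illustration; they are not part of Distributed or ClusterManagers.jl.

```julia
# Hypothetical stand-in for Distributed.WorkerConfig, reduced to the two
# fields the kill method touches. This is not the real type.
mutable struct SketchConfig
    io::Union{IO, Nothing}
    userdata::Any
end

# Close the worker's IO stream and delete its output file if it exists.
# Fields are read directly; there is no Nullable wrapper to unwrap with get.
function cleanup!(config::SketchConfig)
    config.io === nothing || close(config.io)
    iofile = config.userdata[:iofile]
    isfile(iofile) && rm(iofile)
    return nothing
end

# A temporary file stands in for the job's output file.
path, handle = mktemp()
close(handle)

cfg = SketchConfig(IOBuffer(), Dict(:iofile => path))
cleanup!(cfg)
```

Checking config.io against nothing first keeps the sketch safe for a config whose stream was never set, which the original kill method does not guard against.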

bjarthur commented 3 years ago

closed by https://github.com/JuliaParallel/ClusterManagers.jl/pull/153