Closed blake-riley closed 4 months ago
So I know all tests are failing here (though it runs on our test set), but I am also getting some really weird run behavior where qFit will just quit halfway through a protein. For example, it will run 40 of 150 residues and then just die with no error message, no output, etc. Any thoughts, @blake-riley?
Failing tests
Yeah, solver tests are currently failing because CVXOpt is ... not solving to the tolerance we request of it in the options? Not ideal, but not the end of the world.
Quitting halfway
Very weird. The only thing that I can think of is in qfit_protein.py at L564-565:
# #TODO If a task crashes or is OOM killed, then there is no result.
# f.wait waits forever. It would be good to handle this case.
In essence, if a PoolWorker dies (because it runs out of memory and the kernel kills it), the Pool manager never finds out, never tries to resurrect it, and so you end up with an empty Pool. This might be what looks like qFit just quitting half way through a protein? [^1]
Can you re-run those residues with --nproc 1? That way, each residue gets run in the main process (no PoolWorkers, just MainProcess), so you'll see all the crash info (and you can debug it).
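For what it's worth, a worker that gets killed by a signal does leave one trace behind: its exit code. Here is a minimal sketch of how that could be checked. It is not qFit code: fit_residue and run_in_child are made-up names, residue 41 is an arbitrary stand-in for "the one that blows up", and the sketch assumes a POSIX system where the fork start method is available.

```python
import multiprocessing as mp
import os
import signal


def fit_residue(resid):
    """Hypothetical stand-in for one residue's worth of work."""
    if resid == 41:
        # Simulate the kernel OOM killer: SIGKILL cannot be caught,
        # so the child gets no chance to print a traceback.
        os.kill(os.getpid(), signal.SIGKILL)


def run_in_child(func, arg, timeout=60.0):
    """Run func(arg) in a child process and return its exit code.

    0    -> clean exit
    > 0  -> the child exited with an error (e.g. an uncaught exception)
    < 0  -> the child was killed by signal number abs(exit code)
    None -> the child was still running after `timeout` seconds
    """
    # Assumption: POSIX. The "fork" start method does not exist on Windows.
    ctx = mp.get_context("fork")
    proc = ctx.Process(target=func, args=(arg,))
    proc.start()
    proc.join(timeout)
    if proc.is_alive():
        # Do not wait forever on a stuck child: stop it and report None.
        proc.terminate()
        proc.join()
        return None
    return proc.exitcode


if __name__ == "__main__":
    for resid in (40, 41):
        print(resid, run_in_child(fit_residue, resid))
```

A negative exit code equal to -signal.SIGKILL for a residue would be consistent with the OOM-killer theory, though other things can send SIGKILL too, so the kernel log is still worth checking.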
[^1]: > 8 little PoolWorkers went out one day
> Processing jobs on cores 1 through 8
> Mother Duck said quack quack quack quack...,
> ... quack quack quack quack ... quack ...
Pull Request Checklist
[...] dev branch? Exceptions will be made for urgent bugfixes.
[...] dev? If not, please rebase your PR onto the most recent dev tip.
Explain to a new user by completing the sentence: 'This PR will: ...'
Description of the Change
Fixes #374 by making osqp and miosqp install by default with pip. Unsure about conda: @stephaniewankowicz might need to check this one with a fresh install.
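With more than one solver installable at once, the default comes down to a preference order over whatever is importable. A minimal sketch of that kind of fallback, which is not the code in src/qfit/solvers.py: first_available and QP_PREFERENCE are hypothetical names, and the only detail taken from this PR is that cvxopt ranks ahead of osqp.

```python
from importlib.util import find_spec


def first_available(preferred):
    """Return the first module name in `preferred` that can be imported, else None."""
    for name in preferred:
        # find_spec returns None for a top-level module that is not installed.
        if find_spec(name) is not None:
            return name
    return None


# Hypothetical preference order for the QP case: cvxopt first, then osqp.
QP_PREFERENCE = ("cvxopt", "osqp")

if __name__ == "__main__":
    # Prints whichever of the two is installed, preferring cvxopt; None if neither is.
    print(first_available(QP_PREFERENCE))
```

An explicit command-line choice would simply bypass this lookup.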
NB: If the user doesn't specify --qp-solver or --miqp-solver, cvxopt and cplex are still preferred if they are available (the order of defaults is the order the solvers are defined in src/qfit/solvers.py).
Release Notes