Closed mruffalo closed 3 years ago
Stitched pixel dimensions: 12665 ⨉ 7481
@mruffalo - do we have a target date for resolving this issue affecting CODEX dataset processing?
@pecan88 We have a fix that seems to resolve the issue, and we're testing it on a full-size dataset now. This SPRM invocation has been running for about a day and a half, and the failure usually manifested much earlier, so this seems very promising, but we'd still like to see this run succeed before tagging a release. (Testing on a small dataset is much faster but isn't informative: this issue never manifested in small test datasets.)
Processing of a CODEX data set failed at the SPRM step, due to an apparent memory allocation failure in the relevant BLAS library:
The `docker` command line was built by `cwltool`, and is just included for lack of a reason not to. The `BLAS : Program is Terminated. Because you tried to allocate too many memory regions.` message was repeated more than 30 times before execution stopped.
This was run on the `l002` compute node, which has almost 3TB memory.
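For context, this message is the one OpenBLAS prints when it runs out of its fixed table of buffer slots, which tends to happen on nodes with many cores rather than on nodes that are short of RAM. A common workaround is to cap the BLAS thread count before the numerical libraries are loaded. The sketch below is not necessarily the fix adopted for SPRM: `cap_blas_threads` and its default of 16 are hypothetical, and it assumes the BLAS in the container is OpenBLAS (optionally built with OpenMP), which reads these environment variables at load time.

```python
import os

# Environment variables read by OpenBLAS / OpenMP when the library is loaded.
# Assumption: the container's BLAS is OpenBLAS; other backends use other names.
THREAD_ENV_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS")


def cap_blas_threads(max_threads=16, environ=None):
    """Hypothetical helper: limit BLAS threads to at most `max_threads`.

    Must be called before importing numpy/scipy, because the limits are
    only read when the BLAS shared library is first loaded.
    """
    env = os.environ if environ is None else environ
    cpu_count = os.cpu_count() or 1
    # Never ask for more threads than cores, and never fewer than one.
    n = max(1, min(max_threads, cpu_count))
    for name in THREAD_ENV_VARS:
        env[name] = str(n)
    return n


if __name__ == "__main__":
    used = cap_blas_threads()
    print(f"BLAS thread cap set to {used}")
    # import numpy  # would go here, after the cap is in place
```

The same variables could instead be set in the container environment, so the pipeline code stays unchanged.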