sys-bio / roadrunner

libRoadRunner: A high-performance SBML simulator
http://libroadrunner.org/

Segmentation Fault error when running in cluster #1183

Open dalbabur opened 5 months ago

dalbabur commented 5 months ago

I'm trying to run many models in parallel for data fitting on a cluster (Hyak) using ipyparallel. However, I keep running into cryptic segmentation faults when loading model binaries on the remote nodes. Loading a new SBML file produces no errors.

Checking the logs, the main differences I see between loading a new SBML and loadState are:
1) with loadState, the log doesn’t contain any of the “Selection Record” and “Created default TimeCourse selection” traces
2) “NamedArrayObject_alloc” spits “finalizing object self: 0x14ef79d84570; args 0x55a907b77d00”, which must be what is throwing the Segmentation Fault!

luciansmith commented 5 months ago

Thank you for the report!

Do you have a script you could upload? Don't worry about it being too complicated; we can pare it down when testing. If something in it is proprietary, we'll try to recreate it on our own, but it'd be nice to have something to start from.

dalbabur commented 5 months ago

this is the minimal code that throws the error:

def load_sbml(model_file):
    import tellurium as te  
    r = te.loadSBMLModel(model_file)
    r.simulate(0,10,100)
    return

def load_binary(model_file):
    import tellurium as te  
    r = te.roadrunner.ExtendedRoadRunner()
    r.loadState(model_file)
    r.simulate(0,10,100)
    return

import ipyparallel as ipp
# setup cluster and view
# pass the function and its argument separately so the call runs on the engine
lbview.apply_sync(load_sbml, 'model.sbml') # this works
lbview.apply_sync(load_binary, 'model.b') # this does not work
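
A bare segmentation fault from a remote engine gives little to go on. One possible way to narrow it down, sketched here as an assumption rather than anything taken from the roadrunner documentation, is to run the failing load in a separate Python process with the standard-library `faulthandler` enabled. A crash then kills only the child process, and the Python-level traceback at the moment of the crash is written to stderr. `run_isolated` is a hypothetical helper name, and the usage shown in the comments just reuses the `loadState` call from the report.

```python
import subprocess
import sys


def run_isolated(snippet, timeout=60):
    """Run `snippet` in a fresh Python process with faulthandler enabled.

    Returns (returncode, stderr). If native code crashes, only the child
    process dies, and faulthandler dumps the Python traceback to stderr.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


# Hypothetical usage with the binary state file from the report:
#
#   snippet = (
#       "import tellurium as te\n"
#       "r = te.roadrunner.ExtendedRoadRunner()\n"
#       "r.loadState('model.b')\n"
#   )
#   code, err = run_isolated(snippet)
#   print(code)  # non-zero if the child process crashed
#   print(err)   # faulthandler traceback, if any
```

Calling the helper from inside the function sent to the engines would keep the engine itself alive and return the child's stderr to the client, which may be easier to read than the engine logs.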