POETSII / Orchestrator

The Orchestrator is the configuration and run-time management system for POETS platforms.

Orchestrator Segfaults if libsupervisor.so fails to build/is not found #136

Closed heliosfa closed 3 years ago

heliosfa commented 4 years ago

Logging so that I don't forget about this down the line.

On Development (8c629a97d71313c11aa2dcd9f1a9b1d34735319e), if the Orchestrator cannot find libsupervisor.so (say, because the build failed), executing task /run on the task (say, automatically from a call file) causes a segmentation fault.

task /init correctly throws an error:

POETS> 16:09:48.57: 532(E) Supervisor /home/gmb/.orchestrator/task_binaries/plate_5x5/libSupervisor.so could not be dynamically loaded on mothership 3 because of error /home/gmb/.orchestrator/task_binaries/plate_5x5/libSupervisor.so: cannot open shared object file: No such file or directory

But task /run is much less happy:

 16:04:00.02: 801(D) P_builder::Add(name=plate_5x5,file=/home/gmb/Orchestrator/application_staging/xml/plate_5x5.xml)
POETS> 16:04:00.02: 805(E) P_builder:compile failed. Errors dumped to make_errs.txt
POETS>Sending a distribution message to mothership with 3 cores
 16:04:13.84: 532(E) Supervisor /home/gmb/.orchestrator/task_binaries/plate_5x5/libSupervisor.so could not be dynamically loaded on mothership 3 because of error /home/gmb/.orchestrator/task_binaries/plate_5x5/libSupervisor.so: cannot open shared object file: No such file or directory
POETS>
===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 17729 RUNNING AT ayres
=   EXIT CODE: 139
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
gmb@ayres:~/Orchestrator$

A broken XML file that triggers this is attached: plate_5x5.xml.txt
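
One plausible cause, not confirmed by the logs above, is that the result of the failed load is used later without being checked. A minimal sketch of the guard, assuming POSIX dlopen/dlsym; SupervisorLib, load_supervisor and find_entry are hypothetical names for illustration, not the Orchestrator's actual code:

```cpp
#include <dlfcn.h>
#include <cassert>
#include <string>

// Hypothetical wrapper around a dynamically loaded supervisor library.
struct SupervisorLib {
    void* handle = nullptr;  // stays null if the load fails
    std::string error;       // dlerror() text from the failed load

    bool loaded() const { return handle != nullptr; }
};

// Try to load the shared object, recording the error instead of aborting.
SupervisorLib load_supervisor(const std::string& path) {
    SupervisorLib lib;
    dlerror();  // clear any stale error state
    lib.handle = dlopen(path.c_str(), RTLD_NOW);
    if (!lib.handle) {
        const char* msg = dlerror();
        lib.error = msg ? msg : "unknown dlopen error";
    }
    return lib;
}

// Look up a symbol only if the library was loaded; returns nullptr otherwise.
void* find_entry(const SupervisorLib& lib, const char* symbol) {
    assert(symbol != nullptr);
    if (!lib.loaded()) return nullptr;  // guard: no lookup on a failed load
    return dlsym(lib.handle, symbol);
}
```

A caller would check loaded() (or a null result from find_entry) and report lib.error rather than calling through the pointer. On older glibc versions this needs linking with -ldl.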

mvousden commented 3 years ago

deploy /app (formerly task /run) now exits early if the application has not been composed, and composing without libsupervisor.so always produces an error.
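
The behaviour described here amounts to a state check before deployment. A rough sketch of that pattern, with App, compose and deploy as hypothetical stand-ins rather than the Orchestrator's real types or commands:

```cpp
#include <cassert>
#include <string>

// Hypothetical application record; names are illustrative only.
struct App {
    std::string name;
    bool composed = false;  // set only when composition succeeded
};

enum class DeployResult { Ok, NotComposed };

// Composition succeeds only if the supervisor library was built.
bool compose(App& app, bool supervisor_built) {
    app.composed = supervisor_built;
    return app.composed;
}

// Deployment refuses to proceed for an application that was never composed.
DeployResult deploy(const App& app) {
    assert(!app.name.empty());
    if (!app.composed) return DeployResult::NotComposed;  // exit early
    return DeployResult::Ok;
}
```

With this shape, a failed build surfaces as an error at compose time, and a later deploy returns NotComposed instead of touching a supervisor that was never loaded.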