SlipStream distributed job engine
Apache License 2.0

SlipStreamJobEngine

The SlipStream job engine uses the CIMI job resource, with ZooKeeper as a locking queue. It is designed to scale horizontally across multiple nodes.
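To make the locking-queue idea concrete, here is a minimal in-memory sketch of the semantics: a worker takes a job and holds it locked, then either consumes it (finished) or releases it (back in the queue for another worker). The class name, method names, and worker names are illustrative assumptions and not the engine's code. The real engine relies on ZooKeeper so that the lock is shared across nodes.

```python
import threading
from collections import deque


class InMemoryLockingQueue:
    """Toy stand-in for a ZooKeeper-backed locking queue (hypothetical names)."""

    def __init__(self):
        self._pending = deque()   # job ids waiting for a worker
        self._locked = {}         # worker name -> job id it currently holds
        self._mutex = threading.Lock()

    def put(self, job_id):
        with self._mutex:
            self._pending.append(job_id)

    def get(self, worker):
        """Lock the next job for `worker`; return None if nothing is pending."""
        with self._mutex:
            if worker in self._locked:
                return self._locked[worker]
            if not self._pending:
                return None
            job_id = self._pending.popleft()
            self._locked[worker] = job_id
            return job_id

    def consume(self, worker):
        """Mark the held job as done and drop it from the queue."""
        with self._mutex:
            return self._locked.pop(worker, None) is not None

    def release(self, worker):
        """Give the held job back so another worker can pick it up."""
        with self._mutex:
            job_id = self._locked.pop(worker, None)
            if job_id is None:
                return False
            self._pending.appendleft(job_id)
            return True


queue = InMemoryLockingQueue()
queue.put("job/1")
queue.put("job/2")

first = queue.get("executor-a")    # executor-a now holds job/1
second = queue.get("executor-b")   # executor-b gets job/2, never job/1
queue.release("executor-a")        # job/1 returns to the queue
queue.consume("executor-b")        # job/2 is finished
retry = queue.get("executor-c")    # executor-c picks up the released job/1
```

Because a held job is invisible to other workers until it is released, adding more executors increases throughput without two of them running the same job.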

Facts:

Run the slipstream executor

Install the SlipStreamJobEngine RPM.

Create the file /etc/default/slipstream-job-executor with the following content:

DAEMON_ARGS='--ss-url=https://<CIMI_ENDPOINT>:<CIMI_PORT> --ss-user=super --ss-pass=<SUPER_PASS> --zk-hosts=<ZOOKEEPER_ENDPOINT>:<ZOOKEEPER_PORT> --threads=8 --es-hosts-list=<ELASTICSEARCH_ENDPOINTS>'

Start the service with systemctl start slipstream-job-executor
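If the defaults file is generated by a provisioning script, the line above can be rendered from a dictionary. The following is a minimal sketch: the helper name `build_daemon_args` and the example host names and ports are assumptions for illustration, while the flag names are the ones shown above.

```python
def build_daemon_args(options):
    """Render a DAEMON_ARGS='...' line from a dict of flag name -> value."""
    parts = []
    for name, value in options.items():
        text = str(value)
        # The whole value is wrapped in single quotes and split on spaces,
        # so neither character may appear inside an individual value.
        if "'" in text or " " in text:
            raise ValueError("unsupported character in value for --" + name)
        parts.append("--{}={}".format(name, text))
    return "DAEMON_ARGS='{}'".format(" ".join(parts))


# Placeholder endpoints; replace them with the real CIMI, ZooKeeper
# and Elasticsearch addresses of your deployment.
executor_options = {
    "ss-url": "https://cimi.example.org:443",
    "ss-user": "super",
    "ss-pass": "<SUPER_PASS>",
    "zk-hosts": "zk.example.org:2181",
    "threads": 8,
    "es-hosts-list": "es.example.org",
}

line = build_daemon_args(executor_options)
print(line)
```

The printed line can then be written to /etc/default/slipstream-job-executor before starting the service.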

Run the slipstream distributors

Install the SlipStreamJobEngine RPM.

Create the file /etc/default/slipstream-job-distributor with the following content:

DAEMON_ARGS='--ss-url=https://<CIMI_ENDPOINT>:<CIMI_PORT> --ss-user=super --ss-pass=<SUPER_PASS> --zk-hosts=<ZOOKEEPER_ENDPOINT>:<ZOOKEEPER_PORT>'

Start the service with systemctl start slipstream-job-distributor@<DISTRIBUTOR_SCRIPT_FILENAME_LAST_PART>

e.g. systemctl start slipstream-job-distributor@jobs_cleanup.service
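The unit name can be derived from the distributor script's filename. The sketch below assumes that distributor scripts are named `job_distributor_<LAST_PART>.py`, a pattern inferred from scripts/job_distributor_dummy_test_action.py. The helper name `distributor_unit` and the script path used in the demo are hypothetical.

```python
import os

# Assumed filename prefix, inferred from scripts/job_distributor_dummy_test_action.py
SCRIPT_PREFIX = "job_distributor_"


def distributor_unit(script_path):
    """Return the systemd unit name for a distributor script path."""
    name, ext = os.path.splitext(os.path.basename(script_path))
    last_part = name[len(SCRIPT_PREFIX):]
    if ext != ".py" or not name.startswith(SCRIPT_PREFIX) or not last_part:
        raise ValueError("not a distributor script: {}".format(script_path))
    return "slipstream-job-distributor@{}.service".format(last_part)


unit = distributor_unit("scripts/job_distributor_jobs_cleanup.py")
print("systemctl start " + unit)
```

With the assumed path, this prints the same command as the jobs_cleanup example above.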

Implement new actions

To implement a new action for the job executor, create a class modeled on actions/dummy_test_action.py. Restart the job executor afterwards so that it reloads the implemented actions.
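As a rough illustration of the shape such a class might take, here is a self-contained sketch. It assumes that actions are registered under a name when their module is imported, which would be consistent with the restart requirement. The `ACTIONS` registry, the `action` decorator, the `do_work` method, and the job dictionary layout are all hypothetical; the authoritative interface is the one in actions/dummy_test_action.py.

```python
ACTIONS = {}  # action name -> class, filled in when the defining module is imported


def action(name):
    """Class decorator that registers an action under `name` (hypothetical helper)."""
    def register(cls):
        ACTIONS[name] = cls
        return cls
    return register


@action("dummy_test_action")
class DummyTestAction:
    def __init__(self, job):
        self.job = job

    def do_work(self):
        # A real action would do its actual work here and update the job resource.
        return "processed {}".format(self.job["id"])


def run(job):
    """Look up the class registered for the job's action and execute it."""
    cls = ACTIONS.get(job["action"])
    if cls is None:
        raise LookupError("no action registered for {!r}".format(job["action"]))
    return cls(job).do_work()


result = run({"id": "job/42", "action": "dummy_test_action"})
print(result)
```

In this sketch a class defined after the process has started is unknown to `run` until its module is imported, which is why a long-running executor would need a restart to see it.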

To create a new action distributor, which creates a CIMI job at a regular interval, create a class modeled on scripts/job_distributor_dummy_test_action.py.
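The core of a distributor is a loop that submits one job per interval. The sketch below shows only that loop, under stated assumptions: the function name `distribute`, the minimal job body, and the injected `create_job` callable are hypothetical, and the demo uses stand-ins so it runs without a CIMI server or ZooKeeper. A real distributor would run indefinitely and follow scripts/job_distributor_dummy_test_action.py.

```python
import time


def distribute(action_name, interval_s, create_job, rounds, sleep=time.sleep):
    """Call create_job once per interval, `rounds` times, and return the results."""
    created = []
    for _ in range(rounds):
        job = {"action": action_name}  # hypothetical minimal job body
        created.append(create_job(job))
        sleep(interval_s)
    return created


# Stand-ins for the demo: record submitted jobs and requested waits
# instead of calling a server or actually sleeping.
submitted = []
waits = []


def fake_create_job(job):
    submitted.append(job)
    return "job/{}".format(len(submitted))


ids = distribute("dummy_test_action", 60, fake_create_job, rounds=3, sleep=waits.append)
print(ids)
```

Passing `sleep` and `create_job` as parameters keeps the timing logic testable without waiting for real intervals.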

Logging

Check the /var/log/slipstream/log/ folder.

Debugging

You can get a traceback of all running threads using a tool such as pyrasite (https://pyrasite.readthedocs.io):

  1. pip install pyrasite

  2. Get python process PID of the executor e.g.

  3. Switch to a shell session as the slipstream user: su - slipstream

  4. Attach to the executor process: pyrasite-shell <PID>

  5. Print the tracebacks by entering the code below into the pyrasite REPL:

    
    import sys
    import threading
    import traceback

    # Print the current stack of every live thread.
    for th in threading.enumerate():
        print(th)
        traceback.print_stack(sys._current_frames()[th.ident])
        print()