:toc: macro :toclevels: 3 :toc-title:
Written in Java and scripted using JavaScript.
Using a general-purpose scripting language for defining workflows removes most configuration boilerplate.
Simple and robust architecture that keeps running even when Hadoop is down and picks up where it left off when it's back up.
Support for continuous delivery of workflows.
Parsimonious user interface that nevertheless gives a good overview over hundreds of workflows.
Battle-tested for years at link:http://www.collective.com[Collective, Inc] and found practically defect-free.
'''
toc::[]
'''
== Overview
image::etc/images/celos-ui.png[Celos UI]
The Celos UI
=== Oozie Overview
Celos works in conjunction with the link:https://oozie.apache.org/[_Apache Oozie_] execution engine, so we'll quickly describe Oozie before diving into Celos proper.
Oozie provides services such as distribution of jobs across a cluster, log aggregation, and command-line and Web UI tools for inspecting running jobs.
To run a job in Oozie it is packaged into a workflow directory and put into HDFS, as the following diagram shows:
image::etc/images/oozie.png[Oozie workflow directory]
Oozie requires a small XML file that describes what the job should do
(in this case call a Java main with some arguments), and then runs it
on a processing node in the cluster. Artefacts in the lib
directory
are automatically placed on the classpath. Arguments (such as
inputPath
and outputPath
in the above example) can be passed to
the job.
Oozie XML files support link:https://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html[many different kinds of actions] besides calling Java, such as manipulating HDFS, loading data into databases, or sending emails. This is mostly out of scope for this document, but in general, Celos workflows can use all features of Oozie.
You can view all currently running jobs in the Oozie Web UI, Hue:
image::etc/images/hue-index.png[Hue index]
You can also view details of a particular job in Hue:
image::etc/images/hue-details.png[Hue details]
=== Celos Concepts
==== Workflows
If we have the Oozie workflow directory to run in HDFS at
/wordcount
, hourly input data in /input/YYYY-MM-DD/HH00
, and want
to write output data to /output/YYYY-MM-DD/HH00
, we can set up a
simple Celos workflow with the ID wordcount
like this:
[source,javascript] .... celos.defineWorkflow({ "id": "wordcount", "schedule": celos.hourlySchedule(), "schedulingStrategy": celos.serialSchedulingStrategy(), "trigger": celos.hdfsCheckTrigger("/input/${year}-${month}-${day}/${hour}00/_READY"), "externalService": celos.oozieExternalService({ "oozie.wf.application.path": "/wordcount/workflow.xml", "inputPath": "/input/${year}-${month}-${day}/${hour}00/", "outputPath": "/output/${year}-${month}-${day}/${hour}00/", }) }); ....
A Celos workflow always has an Oozie workflow
(/wordcount/workflow.xml
in this case) as its "meat".
If we were to receive data from two sources, say two datacenters in
/input/nyc
and /input/lax
, we can define a helper function and use
that to quickly define two workflows with the IDs wordcount-nyc
and
wordcount-lax
:
[source,javascript] .... function defineWordCountWorkflow(dc) { celos.defineWorkflow({ "id": "wordcount-" + dc, "schedule": celos.hourlySchedule(), "schedulingStrategy": celos.serialSchedulingStrategy(), "trigger": celos.hdfsCheckTrigger("/input/" + dc + "/${year}-${month}-${day}/${hour}00/_READY"), "externalService": celos.oozieExternalService({ "oozie.wf.application.path": "/wordcount/workflow.xml", "inputPath": "/input/" + dc + "/${year}-${month}-${day}/${hour}00/", "outputPath": "/output/" + dc + "/${year}-${month}-${day}/${hour}00/", }) }); } defineWordCountWorkflow("nyc"); defineWordCountWorkflow("lax"); ....
Here's an overview over schedules, triggers, and scheduling strategies, described below:
image::etc/images/slots.png[Celos concepts]
==== Schedules
Each workflow has a schedule that determines the points in time (called slots) at which the workflow should run.
Celos supports cron
-like schedules with <
[source,javascript] .... // A workflow using this schedule will run every hour. celos.cronSchedule("0 0 ?"); // A workflow using this schedule will run every day at midnight. celos.cronSchedule("0 0 0 ?"); // A workflow using this schedule will run every day at 5am. celos.cronSchedule("0 0 5 * ?"); ....
Another type of schedule is <
==== Triggers
For each slot of a workflow, a trigger is used to determine whether it's ready to run, or needs to wait.
===== Simple Triggers
Let's look at some commonly used simple triggers.
<
[source,javascript] .... // A slot at time T will wait for the file /logs/YYYY-MM-DD/HH00/_READY in HDFS. celos.hdfsCheckTrigger("/logs/${year}-${month}-${day}/${hour}00/_READY"); ....
<
[source,javascript] .... // A slot at time T will wait until the slot at time T of // the workflow with the ID "workflow-foo" is successful. celos.successTrigger("workflow-foo") ....
<
[source,javascript] .... // A slot at time T will wait until the current time is one hour after T. celos.delayTrigger(60 * 60) ....
<
[source,javascript] .... // A slot at time T will wait until the next hour's file is available in HDFS. celos.offsetTrigger(60 * 60, celos.hdfsCheckTrigger("/logs/${year}-${month}-${day}/${hour}00/_READY")); ....
===== Combined Triggers
We can also combine triggers with <
[source,javascript] .... // A slot at time T will wait until both /input-a/YYYY-MM-DD/HH00/_READY // and /input-b/YYYY-MM-DD/HH00/_READY is in HDFS. celos.andTrigger(celos.hdfsCheckTrigger("/input-a/${year}-${month}-${day}/${hour}00/_READY"), celos.hdfsCheckTrigger("/input-b/${year}-${month}-${day}/${hour}00/_READY")); ....
[source,javascript] .... // A slot at time T will wait until the current hour's file, the next hour's file, // and the file for the hour after that are in HDFS. var hdfsCheck = celos.hdfsCheckTrigger("/logs/${year}-${month}-${day}/${hour}00/_READY"); celos.andTrigger(hdfsCheck, celos.offsetTrigger(60 60 1, hdfsCheck), celos.offsetTrigger(60 60 2, hdfsCheck)); ....
[source,javascript] .... // A slot at time T will be ready if, after one hour, the slot at time T // of the other workflow "workflow-bar" is not successful. // This can be used to send an alert for example. celos.andTrigger(celos.delayTrigger(60 * 60), celos.notTrigger(celos.successTrigger("workflow-bar")); ....
This last trigger should be used in conjunction with a
<<celos.dependentSchedule,celos.dependentSchedule("workflow-bar")
>>.
==== Scheduling Strategies
A workflow's scheduling strategy determines when and in which order the ready slots of the workflow should be run.
There's only one scheduling strategy at the moment,
<
[source,javascript] .... // A workflow using this scheduling strategy will run three slots in parallel. celos.serialSchedulingStrategy(3); ....
=== How it Works
The main data sources Celos uses are:
==== Workflows Directory
The workflows directory contains JavaScript files that define workflows.
It may look like this:
.... workflows/ wordcount.js some-other-workflow.js yet-another-workflow.js ....
==== State Database
The state database directory contains the state of each slot as a small JSON file.
.... db/ state/ wordcount-lax/ 2015-09-15/ 00:00:00.000Z 01:00:00.000Z 02:00:00.000Z ... wordcount-nyc/ 2015-09-15/ 00:00:00.000Z 01:00:00.000Z 02:00:00.000Z ... ....
An individual slot file in the state database,
e.g. db/state/wordcount-lax/2015-09-15/01:00:00.000Z
, looks like
this:
.... { "status": "SUCCESS", "externalID": "0008681-150911205802478-oozie-oozi-W", "retryCount": 0 } ....
The status
field records the state the slot is in.
The externalID
field contains the Oozie ID of the corresponding
Oozie workflow execution if the slot is running, successful, or failed
(otherwise it's null).
The retryCount
records how many times the slot has already been
retried after failure.
==== Scheduler Step
On each scheduler step (typically triggered once per minute from
cron
), Celos evaluates all JavaScript files in the workflows
directory, yielding a set of uniquely identified workflows.
Then, for each workflow, Celos fetches all slot files within a sliding window of 7 days before the current date from the state database.
Each slot is a state machine with the following states:
image::etc/images/states.png[Slot states]
Celos takes the following action, depending on the state of the slot:
[options="header"] |=== |State|Action |WAITING|Call the workflow's trigger to determine whether the slot is ready. If the trigger signals readiness, put the slot into the READY state. If the slot has been waiting for too long, put the slot into the WAIT_TIMEOUT state. Otherwise, keep the slot in the WAITING state. |READY|Pass the slot as a candidate for scheduling to the workflow's scheduling strategy. If the strategy chooses to execute the slot, submit it to Oozie, and put it into the RUNNING state. Otherwise, keep the slot in the READY state. |RUNNING|Ask Oozie for the status of the execution. If the slot is still executing, keep it in the RUNNING state. If the slot has succeeded, put it into the SUCCESS state. If the slot has failed, but there are retries left, put the slot into the WAITING state again. If the slot has failed, and there are no more retries left, put the slot into the FAILURE state. |SUCCESS|Do nothing. |FAILURE|Do nothing. |WAIT_TIMEOUT|Do nothing. |KILLED|Do nothing. |===
The state database contains additional information about slots that
have been manually rerun with the <
In the following example, the slots 2015-08-01T01:00Z
and
2015-08-01T02:00Z
of the workflow wordcount-nyc
have been rerun.
They are outside the sliding window, so the above scheduling algorithm
would not look at the slots.
However, rerunning a slot touches an additional file in the rerun
subdirectory of the state database, and slots for which such a file
exists are fed into the scheduling algorithm in addition to the slots
from the 7 day sliding window.
.... db/ state/ ... as above ... rerun/ wordcount-nyc/ 2015-08-01/ 01:00:00.000Z 02:00:00.000Z ....
Rerunning thus serves two purposes: besides the main use of rerunning a slot, it can also be used to backfill data, by using it to mark slots outside the sliding window that the scheduler should care about.
==== Further Directories
Celos has a defaults directory that contains JavaScript files that
can be imported into a workflow JavaScript file with
<
Celos writes daily-rotating logs to a logs directory.
All directories (workflows, defaults, logs, and database) are
configurable via <
=== Continuous Delivery of Workflows
Changing a workflow definition in Celos is as simple as updating the workflow JavaScript file and/or the Oozie workflow definition in HDFS. On the next scheduler step, Celos will pick up the changes.
Bundled with Celos comes a tool called Celos CI (see <
For each group of related workflows, we have a GitHub repository and a Jenkins job that deploys the workflows on push to master using Celos CI. Celos CI copies the JavaScript files to the Celos host with SFTP, and uploads the Oozie workflow directory to HDFS.
image::etc/images/arch.png[Architecture]
=== Experience
As of September 2015, Celos has been in use at link:http://www.collective.com[Collective] for about two years, and is currently running all of our Hadoop processing (hundreds of individual workflows across dozens of projects).
Celos is productively used by people from different backgrounds, such as data science, operations, software engineering, and database administration, and has proven to be a welcome improvement on our previous Oozie coordinator-based scheduling.
We're proud that in two years of use, not a single bug in Celos has caused any downtime, which is attributable to the small codebase (about 2500 non-blank, non-comment lines of code for core Celos, as measured by link:http://cloc.sourceforge.net/[cloc] 1.56) and the rigorous test suite (hundreds of unit tests and an extensive integration test).
== Building & Running Celos
=== Prerequisites
You can probably get away with slightly older Hadoop and Oozie versions.
=== Building Celos
[source,shell] .... scripts/build.sh ....
This will build the following JARs:
=== Getting Started
Head over to link:samples/quickstart[samples/quickstart].
== Get in Touch
We'd love to help you try out and use Celos!
For now, please use the link:https://github.com/collectivemedia/celos/issues[Issue Tracker] if you have questions or comments.
=== People
Developers, developers, developers:
Head honcho: link:http://github.com/andry1[Chris Ingrassia]
== JavaScript API Reference
=== Workflows Reference
==== celos.defineWorkflow
===== Description
This is the main API call that registers a new workflow.
===== Syntax
[source,javascript] .... celos.defineWorkflow(options) ....
===== Parameters
The options
argument is an object with the following fields:
[options="header"]
|===
|Name|Type|Required|Description
|id
|String|Yes|The identifier string for the workflow, must be unique.
|trigger
|<<triggers-reference,Trigger>>|Yes|The trigger that determines data availability for the workflow.
|schedule
|<<schedules-reference,Schedule>>|Yes|The schedule that determines the points in time at which the workflow should run.
|schedulingStrategy
|<<scheduling-strategies-reference,SchedulingStrategy>>|Yes|The scheduling strategy that determines when and in which order ready slots should be run.
|externalService
|<<external-services-reference,ExternalService>>|Yes|The external service actually responsible for executing the job.
|startTime
|String (ISO 8601, UTC)|No|The date when the workflow should start executing (default: "1970-01-01T00:00Z").
|maxRetryCount
|Number|No|The number of times a slot of this workflow should be automatically retried if it fails (default: 0).
|waitTimeoutSeconds
|Number|No|The number of seconds a workflow should stay waiting until it times out (default: Integer.MAX_VALUE
(68 years)).
|===
===== Examples
[source,javascript] .... celos.defineWorkflow({ "id": "my-workflow", "schedule": celos.hourlySchedule(), "schedulingStrategy": celos.serialSchedulingStrategy(), "trigger": celos.alwaysTrigger(), "externalService": celos.oozieExternalService({ "oozie.wf.application.path": "/my-workflow/workflow.xml", "param1": "Hello", "param2": "World" }) }); ....
==== celos.importDefaults
===== Description
Evaluates a file from the defaults directory in the current scope, so all variables and functions from the file become available in the current file.
===== Syntax
[source,javascript] .... celos.importDefaults(name) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|name
|String|Yes|The name of the defaults file to import, without the ".js" suffix.
|===
===== Examples
[source,javascript] .... // Loads the file foo.js from the defaults directory celos.importDefaults("foo"); ....
=== Triggers Reference
A trigger determines (for each point in time at which a workflow runs) whether the preconditions for running the workflow (such as data availability, or success of upstream workflows are met).
==== celos.hdfsCheckTrigger
===== Description
Makes a workflow wait for a file or directory in HDFS. Often used to wait for _READY or _SUCCESS files.
===== Syntax
[source,javascript] .... celos.hdfsCheckTrigger(path, fs?) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|path
|String|Yes|The HDFS path to wait for. May include the variables ${year}
, ${month}
, ${day}
, ${hour}
, ${minute}
, and ${second}
, which will be replaced by the zero-padded values from the slot's scheduled time.
|fs
|String|No|The hdfs://
URI of the HDFS filesystem to use. If not specified, the value of the <
===== Examples
[source,javascript] .... celos.hdfsCheckTrigger("/logs/${year}-${month}-${day}/${hour}-00/_READY"); ....
==== celos.successTrigger
===== Description
Makes a workflow wait for the success of another workflow at the same time. This is used to define dependencies among workflows.
===== Syntax
[source,javascript] .... celos.successTrigger(workflowID) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|workflowID
|String|Yes|The ID of the other workflow to wait for.
|===
===== Examples
[source,javascript] .... // A workflow using this trigger will run at time T only after the // workflow "bar" has succeeded at time T. celos.successTrigger("bar"); ....
==== celos.andTrigger
===== Description
Logical AND of nested triggers.
===== Syntax
[source,javascript] .... celos.andTrigger(trigger1, ..., triggerN) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|trigger1, ..., triggerN
|<<triggers-reference,Trigger>>|No|The nested triggers. If no nested triggers are specified, the trigger is always ready.
|===
===== Examples
[source,javascript] .... // Wait for the HDFS paths /foo and /bar celos.andTrigger(celos.hdfsCheckTrigger("/foo"), celos.hdfsCheckTrigger("/bar")); ....
==== celos.orTrigger
===== Description
Logical OR of nested triggers.
===== Syntax
[source,javascript] .... celos.orTrigger(trigger1, ..., triggerN) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|trigger1, ..., triggerN
|<<triggers-reference,Trigger>>|No|The nested triggers. If no nested triggers are specified, the trigger is never ready.
|===
===== Examples
[source,javascript] .... // Wait for the HDFS paths /foo or /bar celos.orTrigger(celos.hdfsCheckTrigger("/foo"), celos.hdfsCheckTrigger("/bar")); ....
==== celos.notTrigger
===== Description
Logical NOT of a nested trigger.
===== Syntax
[source,javascript] .... celos.notTrigger(trigger) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|trigger
|<<triggers-reference,Trigger>>|Yes|The nested trigger to negate.
|===
===== Examples
[source,javascript] .... // Wait until HDFS path /foo doesn't exist. celos.notTrigger(celos.hdfsCheckTrigger("/foo")); ....
==== celos.offsetTrigger
===== Description
Offset a nested trigger into the future or past.
===== Syntax
[source,javascript] .... celos.offsetTrigger(seconds, trigger) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|seconds
|Number|Yes|The number of seconds to offset into the future (if positive) or past (if negative).
|trigger
|<<triggers-reference,Trigger>>|Yes|The nested trigger to offset.
|===
===== Examples
[source,javascript] .... // Wait for this hour's and next hour's HDFS file. var trigger = celos.hdfsCheckTrigger("/${year}-${month}-${day}/${hour}-00/_READY"); celos.andTrigger(trigger, celos.offsetTrigger(60 * 60, trigger); ....
==== celos.delayTrigger
===== Description
Waits until a specified amount of time has passed between the slot's scheduled time and the current wallclock time.
===== Syntax
[source,javascript] .... celos.delayTrigger(seconds) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|seconds
|Number|Yes|The number of seconds to wait.
|===
===== Examples
[source,javascript] .... // Will become ready one hour after its scheduled time. celos.delayTrigger(60 * 60);
// Can also be used for e.g. alerting: will trigger if, after 1 hour, // workflow "foo" is not successful. celos.andTrigger(celos.delayTrigger(60 * 60), celos.notTrigger(celos.successTrigger("foo"))); ....
==== celos.alwaysTrigger
===== Description
A trigger that's always ready, to be used when a workflow has no preconditions and should simply run at any scheduled time.
===== Syntax
[source,javascript] .... celos.alwaysTrigger() ....
===== Examples
[source,javascript] .... celos.alwaysTrigger(); ....
=== Schedules Reference
A schedule determines the points in time (slots) at which a workflow should run.
==== celos.cronSchedule
===== Description
A cron-like schedule.
The full cron syntax is described here: http://www.quartz-scheduler.org/documentation/quartz-1.x/tutorials/crontrigger
===== Syntax
[source,javascript] .... celos.cronSchedule(cronExpr) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|cronExpr
|String|Yes|The link:http://www.quartz-scheduler.org/documentation/quartz-1.x/tutorials/crontrigger[cron expression].
|===
===== Examples
[source,javascript] .... // Runs a workflow at 10:15am every day. celos.cronSchedule("0 15 10 ?"); ....
==== celos.hourlySchedule
===== Description
Runs a workflow every hour.
A shortcut for celos.cronSchedule("0 0 * * * ?")
.
===== Syntax
[source,javascript] .... celos.hourlySchedule() ....
===== Examples
[source,javascript] .... celos.hourlySchedule(); ....
==== celos.minutelySchedule
===== Description
Runs a workflow every minute.
A shortcut for celos.cronSchedule("0 * * * * ?")
.
===== Syntax
[source,javascript] .... celos.minutelySchedule() ....
===== Examples
[source,javascript] .... celos.minutelySchedule(); ....
==== celos.dependentSchedule
===== Description
Runs a workflow with the same schedule as another workflow.
===== Syntax
[source,javascript] .... celos.dependentSchedule(workflowID) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|workflowID
|String|Yes|The workflow ID of the other workflow.
|===
===== Examples
[source,javascript] .... // A workflow using this schedule will run with the same schedule as // the workflow with the ID "foo". celos.dependentSchedule("foo"); ....
=== Scheduling Strategies Reference
A scheduling strategy determines the order in which the ready slots of a workflow are executed.
==== celos.serialSchedulingStrategy
===== Description
Executes slots oldest first, with a configurable concurrency level.
===== Syntax
[source,javascript] .... celos.serialSchedulingStrategy(concurrency?) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|concurrency
|Number|No|The number of slots to execute at the same time (defaults to 1).
|===
===== Examples
[source,javascript] .... // A workflow using this scheduling strategy will have // at most three slots executing concurrently. celos.serialSchedulingStrategy(3); ....
=== External Services Reference
An external service actually executes a workflow.
==== celos.oozieExternalService
===== Description
Executes slots with Oozie.
===== Syntax
[source,javascript] .... celos.oozieExternalService(properties, oozieURL?) ....
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|properties
|Object|Yes|Properties to pass to Oozie.
|oozieURL
|String|No|The HTTP URL of the Oozie API. If not specified, the value of the <
Inside property values, the variables ${year}
, ${month}
, ${day}
,
${hour}
, ${minute}
, and ${second}
, will be replaced by the
zero-padded values from the slot's scheduled time.
year
, month
, day
, hour
, minute
, and second
will also be
set as Oozie properties, so they can be used in the Oozie workflow XML
file.
Additionally, Celos will set the Oozie property celosWorkflowName
to
a string like "my-workflow@2015-09-12T20:00Z", useful for display.
oozie.wf.application.path
is the only property required by Oozie.
It points to a
link:https://oozie.apache.org/docs/4.2.0/WorkflowFunctionalSpec.html[Oozie
workflow XML file] within an Oozie workflow directory. There can be
multiple XML files within a single Oozie workflow directory.
If <
===== Examples
[source,javascript] .... celos.oozieExternalService({ "oozie.wf.application.path": "/workflow-dir/workflow.xml", "prop1": "Hello. It is the year ${year}!", "prop2": "Just another property." }); ....
=== Variables
If defined, these global variables influence some API functions. It's a good idea to set them in a common defaults file imported by all workflows.
==== CELOS_DEFAULT_HDFS
The String value of this variable will be used as the default HDFS
name node URI by <
==== CELOS_DEFAULT_OOZIE
The String value of this variable will be used as the default Oozie
API URL by <
==== CELOS_DEFAULT_OOZIE_PROPERTIES
The members of this Object will be added (before other properties, so
they can be overridden) to the Oozie properties of a workflow by
<
== Celos Server Reference
The celos-server.jar launches Celos.
The celos-server.jar must be run in the following way, due to the need to put the Hadoop configuration on the classpath:
[source,shell]
....
java -cp celos-server.jar:/etc/hadoop/conf com.collective.celos.server.Main
=== Server Command-Line Arguments
[options="header"]
|===
|Name|Type|Required|Description
|--port
|Integer|Yes|HTTP port for server.
|--workflows
|Path|No|Workflows directory (defaults to /etc/celos/workflows).
|--defaults
|Path|No|Defaults directory (defaults to /etc/celos/defaults).
|--logs
|Path|No|Logs directory (defaults to /var/log/celos).
|--db
|Path|No|State database directory (defaults to /var/lib/celos/db).
|--autoSchedule
|Integer|No|Interval (in seconds) between scheduler steps. If not supplied, Celos will not automatically step the scheduler, and wait for POSTs to the /scheduler servlet instead.
|===
=== Server HTTP API
==== /scheduler
Doing a POST to this servlet initiates a scheduler step.
In production we do this once a minute from cron
.
===== Example
[source,shell] .... curl -X POST localhost:1234/scheduler ....
==== /workflow-list
Doing a GET to this servlet returns the list of workflows loaded into Celos.
===== Example
[source,shell] .... curl "localhost:1234/workflow-list" ....
prints:
.... { "ids" : [ "workflow-1", "workflow-2", "workflow-3" ] } ....
==== /workflow-slots
Doing a GET to this servlet returns the slots of a workflow within a time range.
It also returns other information about the workflow, such as its
paused state (see the <
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|id
|String|Yes|ID of the workflow.
|end
|String (ISO 8601, UTC)|No|Time (exclusive) of most recent slot to return. Defaults to current time.
|start
|String (ISO 8601, UTC)|No|Time (inclusive) of earliest slot to return. Defaults to 1 week before end
.
|===
===== Example
[source,shell] .... curl "localhost:1234/workflow-slots?id=workflow-1" ....
prints:
.... { "paused": false, "slots" : [ { "time" : "2015-09-13T13:50:00.000Z", "status" : "READY", "externalID" : null, "retryCount" : 0 }, { "time" : "2015-09-13T13:45:00.000Z", "status" : "SUCCESS", "externalID" : "0004806-150911205802478-oozie-oozi-W", "retryCount" : 0 }, { "time" : "2015-09-13T13:40:00.000Z", "status" : "SUCCESS", "externalID" : "0004804-150911205802478-oozie-oozi-W", "retryCount" : 0 }, ... ] } ....
==== /trigger-status
Doing a GET to this servlet returns human-readable information about why a slot is waiting.
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|id
|String|Yes|ID of the workflow.
|time
|String (ISO 8601, UTC)|Yes|Scheduled time of slot to check.
|===
===== Example
[source,shell] .... curl "localhost:1234/trigger-status?id=workflow-1&time=2015-09-13T13:00Z" ....
prints:
.... { "type" : "AndTrigger", "ready" : false, "description" : "Not all nested triggers are ready", "subStatuses" : [ { "type" : "DelayTrigger", "ready" : false, "description" : "Delayed until 2015-09-14T16:00:00.000Z", "subStatuses" : [ ] }, { "type" : "HDFSCheckTrigger", "ready" : true, "description" : "HDFS path hdfs://nameservice1/logs/dc3/2015-09-14/1500 is ready", "subStatuses" : [ ] } ] } ....
==== /rerun
Doing a POST to this servlet instructs Celos to mark a slot for rerun.
The slot's state will be reset to waiting and its retry count will be reset to 0.
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|id
|String|Yes|ID of the workflow.
|time
|String (ISO 8601, UTC)|Yes|Scheduled time of slot to rerun.
|===
===== Example
[source,shell] .... curl -X POST "localhost:1234/rerun?id=workflow-1&time=2015-09-13T13:40Z" ....
==== /kill
Doing a POST to this servlet marks a slot as killed and also kills its underlying Oozie job, if any.
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|id
|String|Yes|ID of the workflow.
|time
|String (ISO 8601, UTC)|Yes|Scheduled time of slot to kill.
|===
===== Example
[source,shell] .... curl -X POST "localhost:1234/kill?id=workflow-1&time=2015-09-13T13:40Z" ....
==== /pause
Doing a POST to this servlet pauses or unpauses a workflow. While a workflow is paused, the scheduler will simply ignore it. This means it doesn't check any triggers for the workflow, doesn't submit new jobs to the workflow's external service, and doesn't perform any other action related to the workflow.
You can check whether a workflow is paused by looking at the paused
field of the result of the <
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|id
|String|Yes|ID of the workflow.
|paused
|Boolean|Yes|Whether to pause (true
) or unpause (false
) the workflow.
|===
===== Example
[source,shell] ....
curl -X POST "localhost:1234/pause?id=workflow-1&paused=true"
curl -X POST "localhost:1234/pause?id=workflow-1&paused=false" ....
== Celos CI Reference
The celos-ci-fat.jar can be used to deploy workflow and defaults JavaScript files, as well as Oozie workflow directories automatically.
[source,shell]
....
java -jar celos-ci-fat.jar
=== CI Command-Line Arguments
[options="header"]
|===
|Name|Type|Required|Description
|--mode
|String|Yes|deploy
or undeploy
|--workflowName
|String|Yes|Name of workflow (or rather, project).
|--deployDir
|Path|Yes|The deployment directory (not needed for undeploy
).
|--target
|URL|Yes|The target file (file: or sftp: URL).
|--hdfsRoot
|Path|No|HDFS data will be placed under this root (defaults to /user/celos/app).
|===
=== Deployment Directory
A deployment directory must follow a canonical directory layout:
.... workflow.js defaults.js hdfs/ workflow.xml ... lib/ ... ....
workflow.js
is the JavaScript file containing workflow definitions.defaults.js
is a JavaScript defaults file containing variables and utility functions.hdfs
is the Oozie workflow directory.If WORKFLOW_NAME
is the value of the --workflowName
argument, and
HDFS_ROOT
is the value of the --hdfsRoot
argument, and
WORKFLOWS_DIR
and DEFAULTS_DIR
are the Celos workflows and
defaults directories specified in the target file, respectively, then
the files will be deployed to the following locations:
.... workflow.js -> $WORKFLOWS_DIR/$WORKFLOW_NAME.js defaults.js -> $DEFAULTS_DIR/$WORKFLOW_NAME.js hdfs/ -> $HDFS_ROOT/$WORKFLOW_NAME ....
=== Target File
A target file is a JSON file that describes a Celos and HDFS setup.
[options="header"]
|===
|Name|Type|Required|Description
|hadoop.hdfs-site.xml
|URL|Yes|URL of Hadoop hdfs-site.xml File
|hadoop.core-site.xml
|URL|Yes|URL of Hadoop core-site.xml File
|defaults.dir.uri
|URL|Yes|URL of Celos defaults directory.
|workflows.dir.uri
|URL|Yes|URL of Celos workflows directory.
|===
All fields must be file: or sftp: URLs.
Example:
[source,json] .... { "hadoop.hdfs-site.xml": "sftp://celos002.ewr004.collective-media.net/etc/hadoop/conf/hdfs-site.xml", "hadoop.core-site.xml": "sftp://celos002.ewr004.collective-media.net/etc/hadoop/conf/core-site.xml", "defaults.dir.uri": "sftp://celos002.ewr004.collective-media.net/etc/celos/defaults", "workflows.dir.uri": "sftp://celos002.ewr004.collective-media.net/etc/celos/workflows", } ....
The best practice for using Celos CI is putting a target file for each
Celos installation (e.g. production and staging) in a central,
SFTP-accessible location, and storing the target file's SFTP URL in an
environment variable (e.g. PRODUCTION_TARGET
and STAGING_TARGET
).
Deploy scripts using Celos CI should then pass this variable as the
--target
argument to Celos CI, making them independent of the Celos
installation to which the workflow is to be deployed. See
link:samples/quickstart[] for an example.
=== CI Environment Variables
==== CELOS_CI_USERNAME
If defined, overrides the username in sftp: URLs in the target file.
== Celos UI Reference
The celos-ui.jar runs the Celos user interface.
[source,shell]
....
java -jar celos-ui.jar
=== UI Command-Line Arguments
[options="header"]
|===
|Name|Type|Required|Description
|--port
|Integer|Yes|HTTP port for UI.
|--celos
|URL|Yes|Celos URL.
|--hue
|URL|No|Hue (Oozie UI) URL. If specified, slots become linked to Hue.
|--config
|Path|No|JSON <
=== UI Config File
If the --config
argument is not supplied to the UI, it simply
renders one long list of all loaded workflows.
By supplying a JSON file with the following format to --config
, the
workflows can be hierarchically grouped (one level):
[source,javascript] .... { "groups": [ { "name": "Group 1", "workflows": [ "workflow-1", "workflow-2" ] }, { "name": "Group 2", "workflows": [ "workflow-3" ] } ] } ....
=== UI HTTP API
==== /ui
Doing a GET to this servlet displays the Celos UI.
===== Parameters
[options="header"]
|===
|Name|Type|Required|Description
|time
|String (ISO 8601, UTC)|No|Time of most recent slot to display (defaults to current time).
|zoom
|String (Number)|No|Zoom level in minutes (defaults to 60).
|===
== Miscellaneous
=== Related Tools
Two similar, programmable schedulers:
link:https://github.com/airbnb/airflow[Airbnb Airflow]
link:https://github.com/spotify/luigi[Spotify Luigi]
=== The Celos Name
The link:http://www.quicksilver899.com/Tolkien/LOTR/LOTR_AC.html[_Lord of the Rings Dictionary_] defines it as:
.... Celos S; also Kelos; freshet; kel- flow away [Sil; *kelu-]; one would want to choose los snow [Sil] for the final element, but the text of Unfinished Tales, Index, entry Celos states the final form derives from Q -sse, -ssa, a form of emphasis [some say locative], making the definition 'much flowing' or 'freshet', often resulting from melting snow; perhaps 'snow' is then implied from the ending; a river in Gondor ....
Alternatively, the link:http://programmingisterrible.com/post/65781074112/devils-dictionary-of-programming[_Devil’s Dictionary of Programming_] defines it as:
.... Configurable: It’s your job to make it usable. Elegant: The only use case is making me feel smart. Lightweight: I don’t understand the use-cases the alternatives solve. Opinionated: I don’t believe that your use case exists. Simple: It solves my use case. ....
=== Acknowledgements
Thanks to our in-house users and to the developers of the many fine open source libraries we're able to use, including but not limited to Oozie, Hadoop, Jetty, Rhino, Joda, Jackson, Quartz, and Gradle.