darach / eep-js

eep.js - Embedded Event Processing
MIT License
217 stars 33 forks

CEP Language #10

Open kkwekkeboom opened 11 years ago

kkwekkeboom commented 11 years ago

I think that to make this useful, a declarative language and parser are needed (compare Esper, for example). In any case, that is the proper way to start making such a framework generic. Of course, the non-trivial part is making stream joins efficient.

darach commented 11 years ago

EEP currently focuses on window operations with aggregate functions. Branching and combinators (such as join) will likely be a separate project. I've investigated a DSL based on Node.js streams and pipes (beam-js repository) and this provides branch, union with filter and transformation functions. I've also looked into obvious omissions in EEP such as group by and order by - but I disliked the results. I found it easier to arrive at nice abstractions in OO languages and harder in FP languages. Grouping and ordering will be added 'soon'.

As it stands EEP (& beam-js) are useful to me. What I wanted was simple, lightweight, embeddable. If you are looking for a full blown CEP engine, with language, parser, tools and IDE integration then there are many good open source (eg: Esper) and commercial (StreamBase, Apama, Oracle CEP) solutions out there. Arguably one of the Rx implementations would suffice for many algorithms. RxJs and Netflix's Rx for Java are both great projects. I use them myself.

With EEP I wanted something small with a simple model. Something I can run on a server, or run on an embedded chip such as the AVR or an Arduino or mBed. Something easy to port, and easy to fork.

About 80% of the features of a modern CEP engine are used rarely. The more exotic the more baggage and the more varied they are across CEP engines. About 20% is generally useful.

kkwekkeboom commented 11 years ago

I will check the BeamJS project. Unfortunately my connection to http://doc.beamjs.org/ gets interrupted at the moment (Error 101).

Having a simple and lightweight thing like EEP can already help many programmers. The only thing is that a DSL may help with thinking in an abstract and consistent way (one already has to think in terms of a 'rich' language). On the other side are modern CEP engines with many unused features.

Regarding the simple model: to make it suitable for embedded usage (e.g. beaglebone), memory consumption may be an issue. A few months ago I did something similar with an average calculation over a fixed (but moving) window, but I did not include time explicitly. Once an average was available (after 3600 numbers), a callback was fired.
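One possible reading of the count-based average described above, as a minimal sketch in plain JavaScript. It does not use the EEP API; `makeWindowAverage` and its parameters are hypothetical names chosen for illustration. It assumes the window restarts after each callback (a tumbling window), which means only a running sum and a counter need to be kept rather than the samples themselves.

```javascript
// Hypothetical helper, not part of eep-js: averages every `size` samples
// and fires `onAverage` once per full window. No timestamps are involved.
function makeWindowAverage(size, onAverage) {
  let sum = 0;   // running sum of the current window
  let count = 0; // samples seen in the current window

  return function push(value) {
    sum += value;
    count += 1;
    if (count === size) {
      onAverage(sum / size); // window is full: emit the mean
      sum = 0;               // start the next window
      count = 0;
    }
  };
}

// Usage: one callback per 3600 samples, as in the comment above.
const pushReading = makeWindowAverage(3600, (avg) => {
  console.log("hourly average:", avg);
});
pushReading(21.5); // call once per incoming sample
```

A sliding window that emits on every sample would need to retain the last `size` values, which is where the memory concern comes from.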

That part is straightforward to implement, but the trouble comes when having multiple streams at once. I like your idea of having composite functions to perform calculations efficiently (how much better or worse than plain object calculations?). However, storing numbers consumes a serious amount of memory (e.g. for 24 hours at a sample rate of 1 number per second: 3600 * 24 = 86400 numbers). So I decided to optimize this by having one function that calculates the 1 minute average, another that calculates the 1 hour average, and one that calculates the 24 hour average. Then I only need to store 144 numbers (60 + 60 + 24).
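The cascade described above can be sketched roughly as follows, in plain JavaScript rather than the EEP API. `makeStage` and `makeCascade` are hypothetical names, and the sketch assumes each stage buffers its inputs (60 seconds, 60 minutes, 24 hours) and hands its mean to the next stage when full, which matches the 144 stored numbers. Because every stage averages equal-sized groups, the final value equals the mean of all 86400 samples, up to floating-point rounding.

```javascript
// Hypothetical stage: buffers `size` values, then emits their mean
// to `onAverage` and starts filling again.
function makeStage(size, onAverage) {
  const buf = new Array(size).fill(0); // the only storage this stage needs
  let count = 0;
  return {
    size,
    push(value) {
      buf[count++] = value;
      if (count === size) {
        let sum = 0;
        for (let i = 0; i < size; i++) sum += buf[i];
        count = 0;
        onAverage(sum / size);
      }
    },
  };
}

// Chains second -> minute -> hour -> day averages.
function makeCascade(onDayAverage) {
  const hours = makeStage(24, onDayAverage);
  const minutes = makeStage(60, (avg) => hours.push(avg));
  const seconds = makeStage(60, (avg) => minutes.push(avg));
  return {
    push: (value) => seconds.push(value),
    stored: seconds.size + minutes.size + hours.size, // 60 + 60 + 24
  };
}

// Usage: feed one sample per second for a day.
const dayAverages = [];
const cascade = makeCascade((avg) => dayAverages.push(avg));
for (let s = 0; s < 86400; s++) cascade.push(20);
console.log(cascade.stored, dayAverages.length); // 144 slots, one daily average
```

The trade-off is resolution: the 24 hour figure only updates once per day in this form, and intermediate per-minute and per-hour averages are available only at their own boundaries.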

darach commented 11 years ago

Oh, I didn't realize there was a beamjs project and domain, I meant: https://github.com/darach/beam-js. On embedded usage memory will be an issue yes, but memory is always an issue on embedded devices. I might see what makes sense on an AVR or ATTiny, for example, as I use those regularly. Also, you don't need to store the numbers in memory, you could dump/log events and/or use a persistent window backed by flash storage, for example.

darach commented 10 years ago

Update. Ordering is now a 'solved problem' in EEP windows.