this is a small flink job (and some documentation on how to run a flink job) that fixes #45 (or at least begins to). with #51 @kdiluca removed the temporary redis setup we had for windowing a given vehicle's trace; that job is now handled by flink. for now the idea is that when the flink job's window functions fire, they make http requests to the reporter (because we don't have jni bindings into the traffic segment matcher). this makes the scaling issues mentioned in #45 even more complex, as we'll need to scale reporter farms in terms of both the flink job and the segment matcher pieces.
in the future we may want to look at getting a jni interface into the segment matcher and calling it directly from flink, but there are several unknowns, such as how to give flink access to data in a cloud system like aws. for now we'll punt on these issues and use http from flink to a traffic segment matcher cluster.
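a rough sketch of the shape of that window-to-http hop, assuming a flink 1.3+ ProcessWindowFunction over a keyed window of raw trace lines; the class name, endpoint url, and newline-joined payload are placeholders, not the real reporter api:

```java
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// fires once per vehicle per window: joins the buffered points and POSTs them
// to a reporter instance for segment matching
public class ReportTrace extends ProcessWindowFunction<String, String, String, TimeWindow> {
  @Override
  public void process(String vehicleId, Context ctx, Iterable<String> points,
                      Collector<String> out) throws Exception {
    // join this vehicle's buffered points into one request body
    StringBuilder body = new StringBuilder();
    for (String p : points) {
      body.append(p).append('\n');
    }
    // POST to the reporter; the url is a placeholder for wherever the cluster lives
    HttpURLConnection conn =
        (HttpURLConnection) new URL("http://reporter:8002/report").openConnection();
    conn.setRequestMethod("POST");
    conn.setDoOutput(true);
    try (OutputStream os = conn.getOutputStream()) {
      os.write(body.toString().getBytes(StandardCharsets.UTF_8));
    }
    // emit the status downstream so failures are visible in the job
    out.collect(vehicleId + " -> http " + conn.getResponseCode());
  }
}
```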
the flink job has two modes: the first reads from a file, which is handy for testing, and the second reads from a kafka stream.
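in sketch form it looks something like the following; the file path, broker, group id, and topic are placeholders, and the kafka connector class name varies by flink/kafka version:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

import java.util.Properties;

public class TraceSource {
  // pick the source based on a flag: file for testing, kafka for real runs
  public static DataStream<String> open(StreamExecutionEnvironment env, boolean fromFile) {
    if (fromFile) {
      // file mode: one message per line, handy for local testing
      return env.readTextFile("/tmp/traces.txt"); // placeholder path
    }
    // kafka mode: consume the same line format from a topic
    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
    props.setProperty("group.id", "reporter");                // placeholder group
    return env.addSource(
        new FlinkKafkaConsumer<>("traces", new SimpleStringSchema(), props)); // placeholder topic
  }
}
```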
note that the format of the messages (or lines from a file) the flink job expects is hardcoded to:
```
date_string|id|X|X|X|accuracy|X|X|X|lat|lon|...
```
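a minimal sketch of pulling out the fields the job actually uses; the X columns are ignored and anything after lon is dropped, and the class and field choices are illustrative, not the job's actual parser:

```java
// one gps point from a pipe-delimited message; only the named columns are kept
public final class TracePoint {
  public final String date;     // column 0: date_string
  public final String id;       // column 1: vehicle id
  public final double accuracy; // column 5
  public final double lat;      // column 9
  public final double lon;      // column 10

  public TracePoint(String date, String id, double accuracy, double lat, double lon) {
    this.date = date;
    this.id = id;
    this.accuracy = accuracy;
    this.lat = lat;
    this.lon = lon;
  }

  public static TracePoint parse(String line) {
    // split on the literal pipe; positions follow the hardcoded format above
    String[] f = line.split("\\|");
    return new TracePoint(f[0], f[1], Double.parseDouble(f[5]),
        Double.parseDouble(f[9]), Double.parseDouble(f[10]));
  }
}
```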
there's still a bunch of work to do to tie this in, plus some open issues, so i'll go off and write those up now.