harbby / sylph

Stream computing platform for bigdata
https://harbby.github.io/project/sylph/index.html
Apache License 2.0
404 stars 173 forks

StreamSql adds eventtime and watermark support #1

Closed harbby closed 6 years ago

harbby commented 6 years ago

The proposed syntax looks like this:

create source table (
time2 bigint,
...
) with(
....
)
WATERMARK server_time FOR eventTime BY ROWMAX_OFFSET(5000);
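
To make the intent of ROWMAX_OFFSET(5000) concrete, here is a minimal Python sketch. It assumes the watermark is the largest event time seen so far minus the offset (5000 ms), which is how bounded-out-of-orderness watermarks commonly behave; the thread does not spell out the exact semantics. The function name `rowmax_offset_watermarks` is hypothetical and not part of Sylph.

```python
from typing import Iterable, Iterator, Tuple


def rowmax_offset_watermarks(
    event_times_ms: Iterable[int], offset_ms: int = 5000
) -> Iterator[Tuple[int, int]]:
    """Yield (event_time, watermark) pairs.

    Assumption: watermark = max event time seen so far - offset_ms.
    This is an illustration only, not Sylph's actual implementation.
    """
    max_seen = None
    for ts in event_times_ms:
        # Track the largest event time observed so far.
        max_seen = ts if max_seen is None else max(max_seen, ts)
        # The watermark trails the maximum by the configured offset,
        # so it never moves backwards when a late row arrives.
        yield ts, max_seen - offset_ms


if __name__ == "__main__":
    for ts, wm in rowmax_offset_watermarks([10000, 12000, 9000, 20000]):
        print(f"event_time={ts} watermark={wm}")
```

Under this assumption, a row that arrives out of order (9000 after 12000) leaves the watermark unchanged, and rows up to 5 seconds behind the maximum are still considered on time.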

harbby commented 6 years ago

Progress has been fairly smooth; see the example below:

create source table topic1(
    key varchar,
    value varchar,
    event_time bigint
) with (
    type = 'ideal.sylph.plugins.flink.source.TestSource'
);

-- Define where the data stream is written
create sink table print_table_sink(
    key varchar,
    cnt long,
    window_time varchar
) with (
    type = 'ideal.sylph.plugins.flink.sink.PrintSink',   -- prints to the console
    other = 'demo001'
);

-- Define the WATERMARK; usually you should parse the event_time field out of the Kafka message
create view TABLE foo
WATERMARK event_time FOR rowtime BY ROWMAX_OFFSET(5000)  -- event_time is the time your data was actually produced
AS
with tb1 as (select * from topic1)  -- usually the Kafka message is parsed here
select * from tb1;

-- Describe the stream computation
insert into print_table_sink
select key,
count(1),
cast(TUMBLE_START(rowtime,INTERVAL '5' SECOND) as varchar)|| '-->' 
|| cast(TUMBLE_END(rowtime,INTERVAL '5' SECOND) as varchar) AS window_time
from foo where key is not null
group by key,TUMBLE(rowtime,INTERVAL '5' SECOND)
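
As a rough illustration of what the query above computes, the following Python sketch groups rows into 5-second tumbling windows and counts them per key. It is a batch approximation under stated assumptions: windows are aligned to multiples of the window size starting at epoch 0, timestamps are in milliseconds, and watermarks and late data are ignored. The helper names `tumble_start` and `count_per_key_and_window` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

WINDOW_MS = 5000  # mirrors INTERVAL '5' SECOND in the query


def tumble_start(ts_ms: int, size_ms: int = WINDOW_MS) -> int:
    """Start of the tumbling window containing ts_ms (epoch-aligned)."""
    return ts_ms - ts_ms % size_ms


def count_per_key_and_window(
    rows: Iterable[Tuple[Optional[str], int]], size_ms: int = WINDOW_MS
) -> Dict[Tuple[str, int, int], int]:
    """Count rows per (key, window_start, window_end).

    Rows with a None key are skipped, like `where key is not null`.
    """
    counts: Counter = Counter()
    for key, ts in rows:
        if key is None:
            continue
        start = tumble_start(ts, size_ms)
        # The window covers [start, start + size_ms).
        counts[(key, start, start + size_ms)] += 1
    return dict(counts)


if __name__ == "__main__":
    rows = [("a", 1000), ("a", 4999), ("a", 5000), (None, 1000), ("b", 7000)]
    for (key, start, end), cnt in sorted(count_per_key_and_window(rows).items()):
        print(key, cnt, f"{start}-->{end}")
```

In the streaming job, each window's result would only be emitted once the watermark passes the window end; this sketch skips that step and processes all rows at once.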

harbby commented 6 years ago

Does anyone have ideas or suggestions? Discussion is welcome.

harbby commented 6 years ago

This is now supported.

harbby commented 5 years ago

The proctime demo:

create source table topic1(
    key varchar,
    message varchar,     -- json
    event_time bigint,
    proctime as proctime()
) with (
    type = 'test'
);