apache / seatunnel

SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
https://seatunnel.apache.org/
Apache License 2.0
8.06k stars, 1.83k forks

[Feature][ALL] Global add grok expression support #8103

Open wuchunfu opened 13 hours ago

wuchunfu commented 13 hours ago

Description

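Suppose you want to select only the files under a date-based directory path that meet certain conditions. Regular expressions can do this, but writing and maintaining them is relatively cumbersome. Global support for grok expressions would make such path matching simpler and would expose the matched parts as named variables.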

Example file paths

/data/input_data/beijing/20241111/103_20241110141500_000.csv
/data/input_data/beijing/20241112/103_20241120141500_000.csv

The input path would be given as a grok expression, such as

/data/input_data/%{NOTSPACE:area}/%{YEAR:folderYear}%{MONTHNUM:folderMonth}%{MONTHDAY:folderDay}/%{BASE16NUM:prefix}_%{YEAR:year}%{MONTHNUM:month}%{MONTHDAY:day}%{HOUR:hour}%{MINUTE:minute}%{SECOND:second}_%{BASE16NUM:suffix}.csv
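To illustrate how such an expression could be evaluated, here is a minimal Python sketch that expands each `%{PATTERN:name}` token into a named regex group and matches the first example path. The `GROK_PATTERNS` table and the `grok_to_regex` helper are hypothetical and written only for this illustration; the patterns are simplified, fixed-width stand-ins for the standard grok definitions, and none of this is an existing SeaTunnel API.

```python
import re

# Simplified stand-ins for the grok patterns named in the expression.
# Assumption: fixed-width date/time fields; the real grok definitions are looser.
GROK_PATTERNS = {
    "NOTSPACE": r"\S+",
    "YEAR": r"\d{4}",
    "MONTHNUM": r"0[1-9]|1[0-2]",
    "MONTHDAY": r"0[1-9]|[12]\d|3[01]",
    "HOUR": r"[01]\d|2[0-3]",
    "MINUTE": r"[0-5]\d",
    "SECOND": r"[0-5]\d|60",
    "BASE16NUM": r"[0-9A-Fa-f]+",
}

# Matches one grok token of the form %{PATTERN:field}.
GROK_TOKEN = re.compile(r"%\{(\w+):(\w+)\}")


def grok_to_regex(expression):
    """Hypothetical helper: compile a grok-style expression into a regex."""
    parts = []
    pos = 0
    for token in GROK_TOKEN.finditer(expression):
        # Literal text between tokens is escaped so '.' matches only a dot.
        parts.append(re.escape(expression[pos:token.start()]))
        pattern_name, field = token.group(1), token.group(2)
        parts.append(f"(?P<{field}>{GROK_PATTERNS[pattern_name]})")
        pos = token.end()
    parts.append(re.escape(expression[pos:]))
    return re.compile("".join(parts))


INPUT_GROK = (
    "/data/input_data/%{NOTSPACE:area}/"
    "%{YEAR:folderYear}%{MONTHNUM:folderMonth}%{MONTHDAY:folderDay}/"
    "%{BASE16NUM:prefix}_%{YEAR:year}%{MONTHNUM:month}%{MONTHDAY:day}"
    "%{HOUR:hour}%{MINUTE:minute}%{SECOND:second}_%{BASE16NUM:suffix}.csv"
)

path = "/data/input_data/beijing/20241111/103_20241110141500_000.csv"
match = grok_to_regex(INPUT_GROK).fullmatch(path)
# fields maps each grok field name to the text it captured, or is empty on no match.
fields = match.groupdict() if match else {}
```

A path that does not fit the expression yields no match, so the same expression serves as both a file filter and a variable extractor.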

The output path can then reference the variables captured from the input path, such as

/data/output_data/${area}/${folderYear}${folderMonth}${folderDay}/${prefix}_${year}${month}${day}${hour}${minute}${second}_${suffix}.csv
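As a rough sketch of the substitution step, the output template can be rendered with Python's standard `string.Template`, which uses the same `${name}` syntax. The `captured` dictionary below is written by hand for illustration and assumes the values a grok match on the first example path would produce; how SeaTunnel would actually pass these variables between source and sink is not specified in this issue.

```python
from string import Template

# Assumed capture result for the first example input path (hand-written here).
captured = {
    "area": "beijing",
    "folderYear": "2024", "folderMonth": "11", "folderDay": "11",
    "prefix": "103",
    "year": "2024", "month": "11", "day": "10",
    "hour": "14", "minute": "15", "second": "00",
    "suffix": "000",
}

OUTPUT_TEMPLATE = Template(
    "/data/output_data/${area}/${folderYear}${folderMonth}${folderDay}/"
    "${prefix}_${year}${month}${day}${hour}${minute}${second}_${suffix}.csv"
)

# substitute() raises KeyError if the template names a variable that was not captured.
output_path = OUTPUT_TEMPLATE.substitute(captured)
print(output_path)
```

Using `substitute` rather than `safe_substitute` means a typo in a variable name fails loudly instead of leaving a literal `${...}` in the written path.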

Usage Scenario

No response

Related issues

No response
