BitSail is a distributed, high-performance data integration engine that supports batch, streaming, and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of records every day.
Description
The HDFS InputFormat accepts path parameters and imports all files under the given path. In incremental scenarios, however, the files need to be filtered by file name or file creation time.
In the implementation, a custom PathFilter strategy can filter files by a regular-expression match on the file name or by file creation time, and it can be set through the setInputPathFilter function.
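To make the proposal concrete, here is a minimal sketch of the filtering decision such a PathFilter could make. It is not BitSail code. The class name `IncrementalFileFilter`, its constructor parameters, and the half-open time window are all hypothetical. It takes the file name and a timestamp as plain values so that it runs without Hadoop on the classpath. In a real Hadoop `PathFilter`, the name would come from `Path.getName()`, and the timestamp would probably come from `FileStatus.getModificationTime()`; using modification time as a stand-in for creation time is an assumption.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical sketch: accepts a file only if its name matches a regular
 * expression and its timestamp falls inside a half-open time window.
 */
final class IncrementalFileFilter {
    private final Pattern namePattern;
    private final long startMillisInclusive;
    private final long endMillisExclusive;

    IncrementalFileFilter(String nameRegex, long startMillisInclusive, long endMillisExclusive) {
        // Compile once; the filter is evaluated for every candidate file.
        this.namePattern = Pattern.compile(nameRegex);
        this.startMillisInclusive = startMillisInclusive;
        this.endMillisExclusive = endMillisExclusive;
    }

    /**
     * @param fileName       last path component, e.g. what Path.getName() returns
     * @param fileTimeMillis file timestamp in epoch milliseconds (assumed here to
     *                       be the modification time reported by the file system)
     */
    boolean accept(String fileName, long fileTimeMillis) {
        // matches() requires the whole name to match, not just a substring.
        boolean nameMatches = namePattern.matcher(fileName).matches();
        boolean inWindow = fileTimeMillis >= startMillisInclusive
                && fileTimeMillis < endMillisExclusive;
        return nameMatches && inWindow;
    }
}
```

A class implementing Hadoop's `PathFilter` could delegate its `accept(Path)` method to logic like this and then be registered through `setInputPathFilter`. Hadoop may also pass directory paths to the filter, so a real implementation would likely need to let directories through rather than testing them against the file-name pattern.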
BitSail Component or Code Module
BitSail Connector
Are you willing to submit PR?
Code of Conduct