Problem
When running the JDBC source connector in bulk mode (mode=bulk), the JdbcSourceTask fetches the whole table in one go with SELECT * FROM {table}. poll.interval.ms determines the speed at which the records are sent to the target topic.
Once all records are committed to the topic, the next SELECT query is sent to the database. The only way to reduce the frequency of requests sent to the DB is to increase poll.interval.ms, which also slows down the process of committing the messages to the target topic.
This leads to a paradoxical situation if, for example, you want to poll a table only once a day.
Solution
This pull request suggests adding a poll.sleep.ms setting that lets the JdbcSourceTask sleep after the whole result set of a table has been consumed.
This setting can be used to limit the frequency at which the SQL server is queried without limiting the processing speed of result sets that have already been obtained.
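To illustrate the intended behaviour, here is a minimal sketch of the timing logic, not the code from this pull request. The class BulkPollSleepSketch, its methods, and the injectable clock are hypothetical names introduced only for this example; the only thing taken from the description above is that the task sleeps for poll.sleep.ms once a table's result set is fully consumed.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of kafka-connect-jdbc) that tracks when the
 * next bulk SELECT may be issued. Rows of an already fetched result set are
 * never delayed; only the start of the next query is.
 */
final class BulkPollSleepSketch {
    private final long pollSleepMs;   // assumed meaning of poll.sleep.ms
    private final LongSupplier clock; // injectable time source, in milliseconds
    private long nextQueryAtMs = 0L;  // earliest time the next SELECT may run

    BulkPollSleepSketch(long pollSleepMs, LongSupplier clock) {
        this.pollSleepMs = pollSleepMs;
        this.clock = clock;
    }

    /** Call once the last row of the current result set has been handed over. */
    void onResultSetConsumed() {
        nextQueryAtMs = clock.getAsLong() + pollSleepMs;
    }

    /** Milliseconds the task should still wait before issuing the next bulk SELECT. */
    long remainingSleepMs() {
        return Math.max(0L, nextQueryAtMs - clock.getAsLong());
    }
}
```

Under these assumptions, a once-a-day reload would set poll.sleep.ms to 86400000 while leaving poll.interval.ms small, so that records from a fetched table still reach the topic quickly.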
Does this solution apply anywhere else?
[ ] yes
[x] no
If yes, where?
Test Strategy
Test implemented: io.confluent.connect.jdbc.source.JdbcSourceTaskUpdateTest.testBulkPeriodicLoadWithPollSleep()
Integration tests are implemented in another branch; they use the Filemaker dialect, for which this fork was created:
Manual testing so far consists of running this feature in production on our side.
Testing done:
Release Plan