GoogleCloudPlatform / DataflowTemplates

Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
https://cloud.google.com/dataflow/docs/guides/templates/provided-templates
Apache License 2.0

Allowing Jdbc FetchSize Configuration for handling large rows #2028

Open VardhanThigle opened 4 days ago

VardhanThigle commented 4 days ago


Overview

The default fetch size used by JdbcIO is 50,000 rows. When individual rows are large in memory, fetching that many rows at once can cause out-of-memory errors, and the JdbcIO documentation recommends tuning the fetch size when such errors occur. This change lets the user tune the fetch size via a pipeline parameter. Automatic inference of the fetch size will be handled as a separate task, since it needs careful scale testing.
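To make the memory concern concrete, here is a minimal sketch of how a user-supplied fetch size might fall back to the default, together with a rough estimate of the bytes buffered per fetch. The class and method names (`FetchSizeSketch`, `resolveFetchSize`, `estimatedBytesPerFetch`) are hypothetical and are not taken from this PR. The estimate simply multiplies rows per fetch by an assumed average row size and ignores driver and object overhead.

```java
/**
 * Sketch only: the class, method, and parameter names are hypothetical
 * and are not taken from the DataflowTemplates code base.
 */
final class FetchSizeSketch {

  /** Default fetch size used by JdbcIO, as stated in the overview. */
  static final int DEFAULT_FETCH_SIZE = 50_000;

  /**
   * Returns the user-supplied fetch size, or the default when none is given.
   * Rejecting non-positive values is an assumption of this sketch.
   */
  static int resolveFetchSize(Integer userFetchSize) {
    if (userFetchSize == null) {
      return DEFAULT_FETCH_SIZE;
    }
    if (userFetchSize <= 0) {
      throw new IllegalArgumentException(
          "fetchSize must be positive, got " + userFetchSize);
    }
    return userFetchSize;
  }

  /**
   * Rough estimate of bytes buffered per fetch: rows per fetch times the
   * average row size. Throws ArithmeticException on long overflow.
   */
  static long estimatedBytesPerFetch(int fetchSize, long avgRowBytes) {
    return Math.multiplyExact((long) fetchSize, avgRowBytes);
  }

  public static void main(String[] args) {
    long oneMiB = 1024L * 1024L; // assumed average row size for illustration

    // Default fetch size with 1 MiB rows: 52428800000 bytes (about 48.8 GiB).
    System.out.println(estimatedBytesPerFetch(resolveFetchSize(null), oneMiB));

    // A tuned fetch size of 500 with the same rows: 524288000 bytes (500 MiB).
    System.out.println(estimatedBytesPerFetch(resolveFetchSize(500), oneMiB));
  }
}
```

With the assumed 1 MiB rows, the default of 50,000 rows per fetch would need tens of gigabytes per reader, while a fetch size of 500 stays at roughly 500 MiB. This is why a user-tunable value helps for tables with large rows.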

codecov[bot] commented 4 days ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 52.97%. Comparing base (94a7e9f) to head (7d3a556).

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##               main    #2028      +/-   ##
============================================
+ Coverage     45.42%   52.97%   +7.55%
+ Complexity     3678     1371    -2307
============================================
  Files           842      378     -464
  Lines         49970    20680   -29290
  Branches       5261     2092    -3169
============================================
- Hits          22697    10955   -11742
+ Misses        25605     9045   -16560
+ Partials       1668      680     -988
```

| [Components](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028/components?src=pr&el=components&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | Coverage Δ | |
|---|---|---|
| [spanner-templates](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `67.99% <100.00%> (+1.26%)` | :arrow_up: |
| [spanner-import-export](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `∅ <ø> (∅)` | |
| [spanner-live-forward-migration](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `75.88% <ø> (ø)` | |
| [spanner-live-reverse-replication](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `76.65% <ø> (ø)` | |
| [spanner-bulk-migration](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `86.41% <100.00%> (+0.04%)` | :arrow_up: |

| [Files with missing lines](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | Coverage Δ | |
|---|---|---|
| [...ud/teleport/v2/options/OptionsToConfigBuilder.java](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028?src=pr&el=tree&filepath=v2%2Fsourcedb-to-spanner%2Fsrc%2Fmain%2Fjava%2Fcom%2Fgoogle%2Fcloud%2Fteleport%2Fv2%2Foptions%2FOptionsToConfigBuilder.java&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform#diff-djIvc291cmNlZGItdG8tc3Bhbm5lci9zcmMvbWFpbi9qYXZhL2NvbS9nb29nbGUvY2xvdWQvdGVsZXBvcnQvdjIvb3B0aW9ucy9PcHRpb25zVG9Db25maWdCdWlsZGVyLmphdmE=) | `94.62% <100.00%> (+0.11%)` | :arrow_up: |
| [...e/reader/auth/dbauth/LocalCredentialsProvider.java](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028?src=pr&el=tree&filepath=v2%2Fsourcedb-to-spanner%2Fsrc%2Fmain%2Fjava%2Fcom%2Fgoogle%2Fcloud%2Fteleport%2Fv2%2Fsource%2Freader%2Fauth%2Fdbauth%2FLocalCredentialsProvider.java&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform#diff-djIvc291cmNlZGItdG8tc3Bhbm5lci9zcmMvbWFpbi9qYXZhL2NvbS9nb29nbGUvY2xvdWQvdGVsZXBvcnQvdjIvc291cmNlL3JlYWRlci9hdXRoL2RiYXV0aC9Mb2NhbENyZWRlbnRpYWxzUHJvdmlkZXIuamF2YQ==) | `100.00% <100.00%> (ø)` | |
| [...source/reader/io/jdbc/iowrapper/JdbcIoWrapper.java](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028?src=pr&el=tree&filepath=v2%2Fsourcedb-to-spanner%2Fsrc%2Fmain%2Fjava%2Fcom%2Fgoogle%2Fcloud%2Fteleport%2Fv2%2Fsource%2Freader%2Fio%2Fjdbc%2Fiowrapper%2FJdbcIoWrapper.java&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform#diff-djIvc291cmNlZGItdG8tc3Bhbm5lci9zcmMvbWFpbi9qYXZhL2NvbS9nb29nbGUvY2xvdWQvdGVsZXBvcnQvdjIvc291cmNlL3JlYWRlci9pby9qZGJjL2lvd3JhcHBlci9KZGJjSW9XcmFwcGVyLmphdmE=) | `93.92% <100.00%> (+0.17%)` | :arrow_up: |
| [...splitter/transforms/ReadWithUniformPartitions.java](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028?src=pr&el=tree&filepath=v2%2Fsourcedb-to-spanner%2Fsrc%2Fmain%2Fjava%2Fcom%2Fgoogle%2Fcloud%2Fteleport%2Fv2%2Fsource%2Freader%2Fio%2Fjdbc%2Funiformsplitter%2Ftransforms%2FReadWithUniformPartitions.java&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform#diff-djIvc291cmNlZGItdG8tc3Bhbm5lci9zcmMvbWFpbi9qYXZhL2NvbS9nb29nbGUvY2xvdWQvdGVsZXBvcnQvdjIvc291cmNlL3JlYWRlci9pby9qZGJjL3VuaWZvcm1zcGxpdHRlci90cmFuc2Zvcm1zL1JlYWRXaXRoVW5pZm9ybVBhcnRpdGlvbnMuamF2YQ==) | `98.44% <100.00%> (+0.08%)` | :arrow_up: |
| [...loud/teleport/v2/templates/PipelineController.java](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028?src=pr&el=tree&filepath=v2%2Fsourcedb-to-spanner%2Fsrc%2Fmain%2Fjava%2Fcom%2Fgoogle%2Fcloud%2Fteleport%2Fv2%2Ftemplates%2FPipelineController.java&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform#diff-djIvc291cmNlZGItdG8tc3Bhbm5lci9zcmMvbWFpbi9qYXZhL2NvbS9nb29nbGUvY2xvdWQvdGVsZXBvcnQvdjIvdGVtcGxhdGVzL1BpcGVsaW5lQ29udHJvbGxlci5qYXZh) | `33.91% <100.00%> (+0.57%)` | :arrow_up: |

... and [481 files with indirect coverage changes](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2028/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform)
