pingcap / tiflow

This repo maintains DM (a data migration platform) and TiCDC (change data capture for TiDB)
Apache License 2.0
430 stars 286 forks source link

Changefeed cannot be updated after removing filter from configuration file #11218

Open snowballbear opened 6 months ago

snowballbear commented 6 months ago

What did you do?

config file: $cat cdctest.toml

case-sensitive = true
enable-old-value = true
[filter]
rules = ['yxt.ote_exam_test']
[sink]
dispatchers = [
{matcher = ['yxt.ote_exam_test'], dispatcher = "ts"}
]
protocol = ""
[mounter]
worker-num = 2
[[filter.event-filters]]
matcher = ['yxt.ote*']
ignore-delete-value-expr = "org_id in ('xx')"
ignore-insert-value-expr = "org_id in ('xx')"
ignore-update-old-value-expr = "org_id in ('xx')"
ignore-update-new-value-expr = "org_id in ('xx')"

STEP 1: pause changefeed tiup cdc cli changefeed pause --server=http://xxx:xx/ --changefeed-id cdctest

STEP 2: modify configuration file, remove filter.event-filters $cat cdctest.toml

case-sensitive = true
enable-old-value = true
[filter]
rules = ['yxt.ote_exam_test']
[sink]
dispatchers = [
{matcher = ['yxt.ote_exam_test'], dispatcher = "ts"}
]
protocol = ""
[mounter]
worker-num = 2

STEP 3: update changefeed tiup cdc cli changefeed update --server=http://xx.xx/ --sink-uri="mysql://xxx:3309/?max-txn-row=5000&transaction-atomicity=none&time-zone=" --changefeed-id="cdctest" --config=cdctest.toml

STEP 4: resume changefeed tiup cdc cli changefeed query --server=http://xx.xx/ --changefeed-id cdctest

What did you expect to see?

changefeed can be updated successfully.

What did you see instead?

STEP 3: tiup cdc cli changefeed update --server=http://xx.xx/ --sink-uri="mysql://xxx:3309/?max-txn-row=5000&transaction-atomicity=none&time-zone=" --changefeed-id="cdctest" --config=cdctest.toml changefeed config is the same with the old one, do nothing

the modification has not taken effect and the filter still exists

Versions of the cluster

Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

v6.5.4

``

TiCDC version (execute cdc version):

v6.5.4
hicqu commented 5 months ago

I have reproduced this problem. Seems it's a bug.

fubinzh commented 5 months ago

/severity major

asddongmen commented 5 months ago

I will investigate it.

asddongmen commented 5 months ago

TiCDC can only update configurations that are explicitly set in the new config file. If the configuration you want to update is not specified, it will not make any changes. Unfortunately, there's no straightforward way to improve this.

Here's an example of how to remove the event-filter:

case-sensitive = true
enable-old-value = true
[filter]
rules = ['yxt.ote_exam_test']
[sink]
dispatchers = [
{matcher = ['yxt.ote_exam_test'], dispatcher = "ts"}
]
[mounter]
worker-num = 2
[[filter.event-filters]]
matcher = []
ignore-delete-value-expr = ""
ignore-insert-value-expr = ""
ignore-update-old-value-expr = ""
ignore-update-new-value-expr = ""
asddongmen commented 5 months ago

I think it's not a bug.

The current problem is that CDC, when updating the changefeed's config based on the new config file, will only attempt to update those explicitly specified configuration items. So, it fails when the user tries to overwrite the old values with empty values.

Perhaps we can consider adding a Boolean type flag 'overwrite' to the cli update command. If the user specifies this flag, the old config will be replaced directly with the contents of the new config file.

asddongmen commented 5 months ago

@BenMeadowcroft what do you think?

benmeadowcroft commented 5 months ago

Perhaps we can consider adding a Boolean type flag 'overwrite' to the cli update command. If the user specifies this flag, the old config will be replaced directly with the contents of the new config file.

@benmeadowcroft what do you think?

Yes, I agree that approach makes sense, similar to the difference between PUT and PATCH methods in a RESTful API. Having the default be a patch, and the only option, can be concerning because the resulting configuration state will depend on multiple items including:

  1. The currently configured state
  2. The new configuration file (may be partially defined)
  3. The default configuration values for the current version of TiCDC (may be different from the defaults when the changefeed was first configured if subsequent updates have occurred)

Enabling the user to apply a new configuration for a changefeed with just the new config and the applied defaults (for that version) is one less item for them to consider.

Based on the comments above is it the case that not including the event filter lines in the config will not impact the currently configured event filters? And that if the event filter lines are included, but set to empty values that will successfully clear the event filter?

seiya-annie commented 2 months ago

/report community