Closed glorv closed 3 years ago
/run-all-tests
/run-all-tests
LGTM
what bug does this PR fix?
We met a case last week that the source file columns are more than the table schema, and this caused tidb backend panic at https://github.com/pingcap/tidb-lightning/blob/c2b6dde7200228f743bceec1724c3a0ad646992b/lightning/backend/tidb.go#L251.
And for local/importer backend, whether it can result in correct result depends on the implementation of tidb table encoder, thus it is undefined behavior.
To close this issue: #518
Not really. #531 should fix one cause of the tikv timeout
in checksum/analyze phase. There is no corresponding issue for this one. We found it in an oncall issue though the root cause of that issue is user misconfiged the table schema.
We met a case last week that the source file columns are more than the table schema, and this caused tidb backend panic at https://github.com/pingcap/tidb-lightning/blob/c2b6dde7200228f743bceec1724c3a0ad646992b/lightning/backend/tidb.go#L251.
maybe easier to check len(enc.columnIdx) >= len(row)
in this file and also enc.columnIdx[i] >= 0
.
or maybe do it inside initializeColumns
and throw the unknownCols
error.
And for local/importer backend, whether it can result in correct result depends on the implementation of tidb table encoder
pretty sure they will just be ignored.
we also need to support "ignore_extra_columns" in Fast Bulk Import i.e. it is legal to have a CSV in the form
name,info,ignored,address
Foo,Foo2,X,1234567
Bar,Bar2,Y,987654321
with the column "ignored" skipped.
ok, we could delay resolve this issue to the implementation of "ignore_extra_columns" since this is an ergent thing.
With a CSV without header, or it's an invalid csv with the fields in some line more than in header, maybe we still should check it in the tidb encoder
i mean we should still check it, but we should do it initializeColumns
or the TiDB Encoder, not in the encodeLoop
.
i mean we should still check it, but we should do it
initializeColumns
or the TiDB Encoder, not in theencodeLoop
.
got. I'll move it into the tidb encoder since tableKvEncoder can properly handle it.
@kennytm I have moved the check into tidb encoder, and fix an extra bug related to _tidb_rowid
, PTAL again
@lance6716 PTAL again, since it has changed a log.
/run-all-tests
[2020-12-25T03:56:45.752Z] [mysql] 2020/12/25 11:56:45 packets.go:36: unexpected EOF
[2020-12-25T03:56:45.752Z] [mysql] 2020/12/25 11:56:45 connection.go:135: write tcp 127.0.0.1:27016->127.0.0.1:4000: write: broken pipe
[2020-12-25T03:56:48.274Z] [mysql] 2020/12/25 11:56:48 packets.go:122: closing bad idle connection: EOF
[2020-12-25T03:56:48.275Z] [mysql] 2020/12/25 11:56:48 connection.go:135: write tcp 127.0.0.1:27018->127.0.0.1:4000: write: broken pipe
[2020-12-25T03:56:48.275Z] [mysql] 2020/12/25 11:56:48 packets.go:122: closing bad idle connection: EOF
[2020-12-25T03:56:48.275Z] [mysql] 2020/12/25 11:56:48 connection.go:135: write tcp 127.0.0.1:27036->127.0.0.1:4000: write: broken pipe
[2020-12-25T03:56:48.275Z] Error: read checkpoint failed: begin transaction failed: dial tcp 127.0.0.1:4000: connect: connection refused
Still need to resolve the integration test tidb_rowid
first.
(LGTM, bot has already count my lgtm)
@kennytm PTAL again
/lgtm
What problem does this PR solve?
~Check row field count before encodes, if row value count is bigger than table field count, directly return an error.~
_tidb_rowid
ingetColumnNames
and tidb encoderWhat is changed and how it works?
Check List
Tests
Side effects
Related changes
Release Note