nikepan / clickhouse-bulk

Collects many small inserts to ClickHouse and sends them as big inserts
Apache License 2.0
474 stars 86 forks

Eternal cycle of requests with errors #24

Open nizovtsevnv opened 4 years ago

nizovtsevnv commented 4 years ago

For example, if I send an INSERT request with a wrong date format, "2020-02-06 07:15:23.364727" (Ruby Time format), the request will never be executed successfully. It is retried forever, which wastes computing power.

Could you make a solution for this?

For example: if the exception is not about the ClickHouse server connection, remove the bad request from the sending cycle and write it to a separate dump file that lists bad requests.
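The proposal above could be sketched roughly as follows. This is only an illustration, not clickhouse-bulk's actual code: the names `shouldRetry`, `dumpBadRequest`, `errConnRefused` and the file name `bad_requests.dump` are hypothetical, and the rule "retry on transport errors and 5xx, dump everything else" is an assumption about which failures are transient.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errConnRefused stands in for a transport-level failure (hypothetical value).
var errConnRefused = errors.New("connection refused")

// shouldRetry reports whether a failed insert is worth resending.
// Assumption: a transport error (no HTTP response at all) or a 5xx status is
// transient, while other statuses such as 400 for an unparsable date will
// fail the same way on every retry.
func shouldRetry(status int, err error) bool {
	if err != nil {
		return true // the server was never reached, so the request may still be valid
	}
	return status >= 500
}

// dumpBadRequest appends a rejected query to a separate file of bad requests
// so it leaves the sending cycle but is not lost.
func dumpBadRequest(path, query string, status int) error {
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = fmt.Fprintf(f, "status=%d\t%q\n", status, query)
	return err
}

func main() {
	// Hypothetical outcomes of three insert attempts.
	fmt.Println("connection error, retry:", shouldRetry(0, errConnRefused))
	fmt.Println("503 response, retry:", shouldRetry(503, nil))

	query := "INSERT INTO t VALUES ('2020-02-06 07:15:23.364727')"
	if !shouldRetry(400, nil) {
		if err := dumpBadRequest("bad_requests.dump", query, 400); err != nil {
			fmt.Fprintln(os.Stderr, "dump failed:", err)
		}
	}
}
```

Keeping the rejected queries in a file rather than dropping them means they can be inspected and replayed by hand once the client is fixed.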

Des1re7 commented 4 years ago

Hello! I faced a similar issue and wrote a workaround for the 400 error. Which error codes do you get for which it makes sense to move the request to a separate folder?

nikepan commented 4 years ago

Good idea! I'll keep it in mind and try to implement it in the future. Thanks!

splichy commented 3 years ago

If it is only one row out of many, you can work around it with the input_format_allow_errors_num setting (https://clickhouse.tech/docs/en/operations/settings/settings/#settings-input_format_allow_errors_num). But it will hide basically all insert errors. In our case, we had multiline inserts where one or two out of ~200 rows had, for example, a negative integer (integer overflow) while the CH schema was UInt32. While we were waiting for a fix in the app, we temporarily enabled input_format_allow_errors to avoid dumping inserts to disk and trying to insert them over and over.
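As a rough illustration of this workaround, ClickHouse's HTTP interface accepts settings as URL parameters, so the setting can be attached to a single insert. The helper name `insertURL`, the address `http://localhost:8123/` and the table `t` below are hypothetical, and whether clickhouse-bulk forwards this parameter unchanged is an assumption that the thread does not confirm. The setting applies to text input formats such as CSV and TSV.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// insertURL builds an HTTP insert URL that tolerates up to allowedErrors
// malformed rows by setting input_format_allow_errors_num for this query only.
func insertURL(base, query string, allowedErrors int) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("query", query)
	q.Set("input_format_allow_errors_num", strconv.Itoa(allowedErrors))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// Hypothetical server address and table name.
	u, err := insertURL("http://localhost:8123/", "INSERT INTO t FORMAT CSV", 2)
	if err != nil {
		fmt.Println("bad base URL:", err)
		return
	}
	fmt.Println(u)
}
```

Setting the limit per request instead of server-wide keeps the error hiding confined to the inserts that are known to contain bad rows.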