Tencent / TBase

TBase is an enterprise-level distributed HTAP database. It provides highly consistent distributed database services and high-performance data warehouse services from a single database cluster, forming an integrated enterprise-level solution.

MVCC not working? - timestamp `A` is too old to execute, recentCommitTs `B` #39

Open yazun opened 4 years ago

yazun commented 4 years ago

We are hitting a basic MVCC problem that never appeared on PG-XL:

XX000: node:datanode9, backend_pid:7215, nodename:datanode7,backend_pid:38951,message:start timestamp 432575393455 is too old to execute, recentCommitTs 432678004078, 

The query is read-only, but there is some insert/update activity from a massive number of clients (500+) executing the same query with different parameters on other tables.

select t.* from ts t join catalog_source cs on (t.sourceid = cs.fdatalight_sourceid and t.catalogid = cs.fdatalight_catalogid) where cs.catalog_catalogid = $1 and cs.fdatalight_sourceid between $2 and $3 and catalogid = getowningcatalogid($4) and sourceid between $5 and $6

This is a blocker for us, as it touches the fundamentals of the DB. We hope somebody can support us here. We use no GTM proxies; the GTM is the source of transaction IDs.

yazun commented 4 years ago

This error also happens for DML operations. Could anybody comment? Does TBase depend on clock synchronisation?

yazun commented 4 years ago

Poking into the code, we decided to increase vacuum_delta from the default of 100 to 10K, which solved the problem for now (we are blocked by #40). The logic seems scary, though, as it relies on an arbitrary time range. See https://github.com/Tencent/TBase/blob/3295393cbabd6f17676c53078cf4cc03aa9dc1fd/src/backend/storage/ipc/procarray.c#L2514 and below.

JennyJennyChen commented 3 years ago

When a SQL statement runs too long and exceeds vacuum_delta, it is killed when the vacuum process starts, and the above error is reported. This problem can be solved by increasing vacuum_delta.
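
To make the threshold concrete, here is a minimal sketch that applies this rule to the two timestamps from the error message at the top of the thread. It rests on assumptions the thread does not confirm: that the timestamps are in microseconds, that vacuum_delta is in seconds, and that the check is a plain "gap greater than vacuum_delta" comparison. The helper `is_too_old` is hypothetical and is not TBase code; the real logic lives in procarray.c and may differ.

```python
# Assumption: GTM timestamps are in microseconds and vacuum_delta is in seconds.
USECS_PER_SEC = 1_000_000


def is_too_old(start_ts: int, recent_commit_ts: int, vacuum_delta_s: int) -> bool:
    """Hypothetical model of the check, not TBase's actual implementation.

    A snapshot counts as too old when it lags the most recent commit
    timestamp by more than vacuum_delta.
    """
    return recent_commit_ts - start_ts > vacuum_delta_s * USECS_PER_SEC


# Values copied from the error message in the original report.
start_ts = 432575393455
recent_commit_ts = 432678004078

gap_s = (recent_commit_ts - start_ts) / USECS_PER_SEC
print(f"gap between start timestamp and recentCommitTs: {gap_s:.1f} s")

print("rejected with vacuum_delta = 100:  ", is_too_old(start_ts, recent_commit_ts, 100))
print("rejected with vacuum_delta = 10000:", is_too_old(start_ts, recent_commit_ts, 10_000))
```

Under these assumptions the reported gap is about 102.6 seconds. That is just over the default of 100, which would explain the error, and well under 10K, which fits the report that raising the setting made it go away.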

ause15 commented 3 years ago

In a PG cluster, when you roll back one machine of the cluster, you should roll back the remaining ones as well.

yazun commented 3 years ago

> In a PG cluster, when you roll back one machine of the cluster, you should roll back the remaining ones as well.

@ause15 Isn't rollback automatic on all the nodes?