Tencent / TBase

TBase is an enterprise-level distributed HTAP database. It provides highly consistent distributed database services and high-performance data warehouse services from a single database cluster, forming an integrated enterprise-level solution.

Tuple cannot be frozen now, please try later - but it should be possible. #82

Closed: yazun closed this issue 3 years ago

yazun commented 3 years ago

We had to set vacuum_delta higher (60000) to avoid #39.

Then, when we try to run vacuum freeze, we get:

ERROR: XX000 - node:datanode11, backend_pid:28825, nodename:datanode9,backend_pid:42554,message:tuple cannot be frozen now, please try later xid 1172990 cutoff xid 1186959 committs 14509364588940 RecentDataTs 14496642600457 RecentGlobalXmin 1196959 RecentGlobalDataXmin 1196959

Are these two related? This is a blocker for us, as it also seems to cause other bugs to appear, e.g. errors like:

ERROR: XX000 - node:datanode1, backend_pid:39522, nodename:datanode1,backend_pid:39522,message:relation 523409537 deleted while still in use

This happens when we use a function in the qual of a create tbl as ... fn_immutable_that_does_lookup(..) statement, and it is a total blocker for us.

Any hints?

yazun commented 3 years ago

There are a few occurrences of this error message in the code. This is the datanode message:

heap_prepare_freeze_tuple, heapam.c:7325 XX000: tuple cannot be frozen now, please try later xid 1165357 cutoff xid 1176649 committs 14511534806624 RecentDataTs 14496176693179 RecentGlobalXmin 1186649 RecentGlobalDataXmin 1186649
2020-11-26 12:30:01 CET [10465,coord(0,0)]:xid[0-97/1178] [2-1] user=dr3_ops_cs36,db=surveys,client=192.168.168.155,query=psql

The "later" moment never arrives.

yazun commented 3 years ago

Further info

select min(xmin::text::bigint),max(xmax::text::bigint) from tbl; -- the tbl that cannot be frozen.

#   min max
1   363986  1163851

select txid_current();
txid_current
1   867659

yazun commented 3 years ago

Update: lowering vacuum_delta to 10000 helps with the vacuum freeze problem (as we can see, committs was in the future compared with RecentDataTs). It does not help with the other error, which appears on create table .. as select... statements:

ERROR: XX000 - node:datanode1, backend_pid:39522, nodename:datanode1,backend_pid:39522,message:relation 523409537 deleted while still in use
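
For anyone who wants to check the same thing in their own logs, here is a minimal Python sketch that pulls the numeric fields out of the freeze error quoted at the top of this thread and compares committs with RecentDataTs. The helper names (`parse_freeze_error`, `committs_lead`) are hypothetical and not part of TBase, and the field order is assumed from the message text above; other versions may format it differently. It only does the arithmetic and says nothing about how TBase decides the cutoff internally.

```python
import re

# Field order is copied from the error text quoted in this thread.
# Assumption: other TBase versions may format the message differently.
FREEZE_ERROR = re.compile(
    r"xid (?P<xid>\d+) cutoff xid (?P<cutoff_xid>\d+) "
    r"committs (?P<committs>\d+) RecentDataTs (?P<recent_data_ts>\d+) "
    r"RecentGlobalXmin (?P<recent_global_xmin>\d+) "
    r"RecentGlobalDataXmin (?P<recent_global_data_xmin>\d+)"
)


def parse_freeze_error(message):
    """Return the numeric fields of a 'tuple cannot be frozen' message as ints."""
    match = FREEZE_ERROR.search(message)
    if match is None:
        raise ValueError("not a 'tuple cannot be frozen' message")
    return {name: int(value) for name, value in match.groupdict().items()}


def committs_lead(fields):
    """How far committs is ahead of RecentDataTs (positive means 'in the future')."""
    return fields["committs"] - fields["recent_data_ts"]


# The message from the first comment in this thread.
SAMPLE = (
    "tuple cannot be frozen now, please try later xid 1172990 "
    "cutoff xid 1186959 committs 14509364588940 RecentDataTs 14496642600457 "
    "RecentGlobalXmin 1196959 RecentGlobalDataXmin 1196959"
)

if __name__ == "__main__":
    fields = parse_freeze_error(SAMPLE)
    # The tuple's xid is already below the cutoff xid ...
    print("xid below cutoff:", fields["xid"] < fields["cutoff_xid"])
    # ... but its commit timestamp is still ahead of RecentDataTs.
    print("committs ahead of RecentDataTs by:", committs_lead(fields))
```

For the sample message, the xid is below the cutoff xid while committs is still larger than RecentDataTs, which matches the observation in the update above.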

yazun commented 3 years ago

Will start a new issue with the above problem.