databendlabs / databend

๐——๐—ฎ๐˜๐—ฎ, ๐—”๐—ป๐—ฎ๐—น๐˜†๐˜๐—ถ๐—ฐ๐˜€ & ๐—”๐—œ. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
https://docs.databend.com
Other
7.85k stars 750 forks

fix: deadlock in table lock #16632

Closed zhyass closed 3 weeks ago

zhyass commented 4 weeks ago

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Problem Overview

This PR addresses a deadlock that occurs when max_running_queries is set to 1 in Databend and surfaces as errors such as "table locked by other session." The root cause is that queries acquire table locks at inconsistent points during query execution, so they can end up blocking one another.
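To make the failure mode concrete, here is a minimal sketch of how such a cycle can form. It is an illustration under assumptions, not Databend code: `queue_slot`, `table_lock`, and `try_acquire` are hypothetical names, and the split into a query that locks the table before it is queued and one that locks it afterwards is assumed for the example. Timeouts are used so the sketch reports the stall instead of hanging.

```python
import threading

# Hypothetical stand-ins; neither name comes from the Databend codebase.
queue_slot = threading.Semaphore(1)   # models max_running_queries = 1
table_lock = threading.Lock()         # models the lock on a single table


def try_acquire(resource, timeout=0.05):
    """Return True if the resource was obtained before the timeout expired."""
    return resource.acquire(timeout=timeout)


# Query A (assumed to lock early) takes the table lock before it is queued.
assert try_acquire(table_lock)

# Query B (assumed to lock late) takes the only queue slot first.
assert try_acquire(queue_slot)

# Each query now needs the resource the other one holds.
a_got_slot = try_acquire(queue_slot)   # A waits for the slot held by B
b_got_lock = try_acquire(table_lock)   # B waits for the lock held by A

print(a_got_slot, b_got_lock)  # False False
```

Without the timeouts both waits would block forever. With a lock timeout in place, the waiting side gives up instead, which is consistent with the stall being reported as an error rather than a hang.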

Root Cause

The deadlock happens because queries acquire table locks at different stages of query processing:

This inconsistency creates a situation where:

Conditions That Trigger the Issue

Solution

To resolve the deadlock, this PR introduces a new lock acquisition strategy for queries that require a table lock (merge into, recluster, update, delete, compact, truncate, replace, ...):

This change ensures a more consistent and deadlock-free process of table lock acquisition across different types of queries.
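The general idea of consistent acquisition can be sketched as follows. This is a generic lock-ordering illustration, not the strategy implemented in this PR: the names `queue_slot`, `table_lock`, and `run_query` are hypothetical, and the order shown (queue slot first, table lock second) is an assumption chosen for the example.

```python
import threading

# Hypothetical stand-ins; neither name comes from the Databend codebase.
queue_slot = threading.Semaphore(1)   # models max_running_queries = 1
table_lock = threading.Lock()         # models the lock on a single table
finished = []


def run_query(name):
    # Every query uses the same order (assumed here: slot, then table lock),
    # so no query can hold one resource while waiting on a holder of the other.
    with queue_slot:
        with table_lock:
            finished.append(name)


threads = [
    threading.Thread(target=run_query, args=(name,))
    for name in ("merge_into", "delete")
]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)

print(sorted(finished))  # ['delete', 'merge_into']
```

Because both queries request the two resources in the same order, a wait-for cycle cannot form and both complete regardless of which one starts first.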

Impact of the Fix

Tests

Type of change



zhang2014 commented 3 weeks ago

Can we add a new check for sampling queries in the is_heavy_action function to solve this?

github-actions[bot] commented 3 weeks ago

Docker Image for PR

note: this image tag is only available for internal use, please check the internal doc for more details.

github-actions[bot] commented 3 weeks ago

ClickBench Report
