Describe the bug
We are using the Alluxio catalog service to cache HDFS files, and it did improve query speed in Presto. But sometimes Presto is unable to fetch data normally: Bytes/s and Rows/s keep changing, but Bytes and Rows no longer increase, and the query hangs until it times out.
The gif is as follows:
We suspect that the query gets stuck while reading certain Alluxio files, because the problem does not occur if we query Hive directly instead of going through the Alluxio catalog service.
We don't know which files are being read when the problem occurs, so we can't determine which Alluxio worker is hitting the exception.
We tried the sync and free -f commands, but they didn't solve the problem. Some queries on this table were still running, so the file may still be locked.
Only restarting the cluster unlocks and reloads the file and restores normal behavior. But the problem keeps recurring.
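To make the symptom easier to pin down, a minimal sketch like the following could flag the moment a query stops making progress, given periodically polled query statistics. It assumes the cumulative Bytes and Rows counters are sampled at known elapsed times. The `Sample` type, the `stalled_since` function, and the 60-second threshold are hypothetical and not part of Presto or Alluxio, and collecting the samples is left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Sample:
    """One poll of a running query's cumulative counters (hypothetical shape)."""
    elapsed_s: float  # seconds since the query started
    bytes_read: int   # cumulative bytes processed so far
    rows_read: int    # cumulative rows processed so far


def stalled_since(samples: Sequence[Sample],
                  min_stall_s: float = 60.0) -> Optional[float]:
    """Return the elapsed time at which progress stopped, or None.

    A query counts as stalled when neither cumulative counter has grown
    for at least `min_stall_s` seconds up to the latest sample.
    """
    if not samples:
        return None
    last = samples[-1]
    start = last.elapsed_s
    # Walk backwards while the counters still match the latest sample.
    for s in reversed(samples):
        if s.bytes_read != last.bytes_read or s.rows_read != last.rows_read:
            break
        start = s.elapsed_s
    return start if last.elapsed_s - start >= min_stall_s else None


if __name__ == "__main__":
    polled = [
        Sample(0, 0, 0),
        Sample(30, 1_000, 10),
        Sample(60, 5_000, 50),
        Sample(90, 5_000, 50),
        Sample(120, 5_000, 50),
        Sample(150, 5_000, 50),
    ]
    print(stalled_since(polled))
```

Knowing the approximate time at which the counters froze may help correlate the stall with entries in the Alluxio worker logs from the same period.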
To Reproduce
Steps to reproduce the behavior (as minimally and precisely as possible)
Expected behavior
A clear and concise description of what you expected to happen.
Urgency
Describe the impact and urgency of the bug.
Are you planning to fix it
Please indicate if you are already working on a PR.
Additional context
Add any other context about the problem here.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.
Alluxio Version: 2.7.3