NVIDIA / spark-rapids

Spark RAPIDS plugin - accelerate Apache Spark with GPUs
https://nvidia.github.io/spark-rapids
Apache License 2.0

[BUG] Qualification Tool is trying to give a better result than fact #6800

Open nvliyuan opened 1 year ago

nvliyuan commented 1 year ago

I tried analyzing Spark application event logs with the Qualification Tool, but its estimates tend to be better than the actual results. Do we know the accuracy of the inferred results? [image]

mattahrens commented 1 year ago

What specific types of applications were benchmarked? The estimates we have validated are for the NDS benchmark at SF3K. Accuracy varies by query and application, but it is generally within 20-30% error.
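To make the 20-30% figure concrete, here is a minimal sketch of one way to compare an estimate against a measured result. It assumes the error is a relative error on the speedup, which the thread does not spell out. The function names and the sample numbers are hypothetical and are not part of the Qualification Tool.

```python
def relative_error(estimated: float, actual: float) -> float:
    """Return |estimated - actual| / actual as a fraction (assumed error definition)."""
    if actual <= 0:
        raise ValueError("actual must be positive")
    return abs(estimated - actual) / actual


def is_optimistic(estimated_speedup: float, actual_speedup: float) -> bool:
    """True when the estimate promises more speedup than was measured."""
    return estimated_speedup > actual_speedup


# Hypothetical numbers for illustration only, not taken from any real run.
estimated_speedup = 3.0  # speedup predicted from the CPU event log
actual_speedup = 2.4     # speedup measured after running on GPU

error = relative_error(estimated_speedup, actual_speedup)
print(f"relative error: {error:.0%}, optimistic: {is_optimistic(estimated_speedup, actual_speedup)}")
```

Running a check like this per application, rather than only in aggregate, would show whether a particular workload falls outside the stated range.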

nvliyuan commented 1 year ago

The apps are customer pipelines with dozens of queries each. Most of the queries are as simple as "create table TableA as select c1,c2.. from TableB where cn=XXX", and the total time spent in these queries is about 20%-30% of the app.

nvliyuan commented 1 year ago

It is hard to reproduce because we cannot get the event logs. It also makes sense that the estimated values may not be that accurate given the current implementation, but I would like to keep this issue open so that we can keep tracking it if we hit the same issue with other customers who can share logs with us.

tgravescs commented 1 year ago

It's documented that the tool's output is an estimate. If there is nothing actionable here, then we should close this and reopen it when you have logs that we can look at and possibly take action on.