NVIDIA / spark-rapids-tools

User tools for Spark RAPIDS
Apache License 2.0

[BUG] unsupportedoperators.csv shows stageID=-1 for certain unsupported operator #1156

Open viadea opened 1 week ago

viadea commented 1 week ago

Describe the bug: unsupportedoperators.csv shows stageID=-1 for certain unsupported operators.

Does this mean the Qual tool could not figure out which stage is associated with certain unsupported operators? If so, the Qual tool may report a very low % of unsupported duration, which could be wrong.
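One quick way to gauge how much of the report is affected is to count the rows whose stage ID is -1. The sketch below is only illustrative: the column names (`Unsupported Operator`, `Stage ID`) and the sample rows are assumptions, not the tool's confirmed output schema.

```python
import csv
import io

# Hypothetical sample; the real unsupportedoperators.csv may use different column names.
SAMPLE_CSV = """App ID,Unsupported Operator,Stage ID
app-1,get_json_object,-1
app-1,Scan,3
app-1,get_json_object,-1
"""

def count_unlinked(csv_text, stage_col="Stage ID"):
    """Return (rows whose stage ID is -1, total rows)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    unlinked = sum(1 for r in rows if int(r[stage_col]) == -1)
    return unlinked, len(rows)

unlinked, total = count_unlinked(SAMPLE_CSV)
print(f"{unlinked} of {total} rows have no stage")
```

Rows counted as unlinked here are the ones whose duration cannot be attributed to any stage.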

amahussein commented 4 days ago

After taking a look at the eventlog: get_json_object appears as an expression of a project. Currently, we can link a project to a stageID iff it is contained inside WholestageCodeGen, because the latter has metrics that can be linked to a stageID.
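The linkage described above can be pictured roughly as follows. This is a simplified sketch, not the tool's code: it assumes each plan node carries a set of metric accumulator IDs and each stage reports the accumulator IDs it updated. The function name `link_node_to_stage` and all sample IDs are hypothetical.

```python
def link_node_to_stage(node_metric_ids, stage_accumulators):
    """Return the first stage whose accumulators overlap the node's metrics, else -1."""
    for stage_id, acc_ids in stage_accumulators.items():
        if node_metric_ids & acc_ids:
            return stage_id
    return -1

# Hypothetical data: stage 4 updated accumulators 100 and 101.
stage_accumulators = {4: {100, 101}}

wholestage_metrics = {100}  # a WholestageCodeGen node exposes at least one metric
standalone_project = set()  # a project outside it has no metrics to match on

print(link_node_to_stage(wholestage_metrics, stage_accumulators))
print(link_node_to_stage(standalone_project, stage_accumulators))
```

A node with no metrics has nothing to intersect with, so under these assumptions it falls through to -1, which matches the value seen in the report.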

There is no clear path to work around this. We can try adding some heuristics that link an exec to a stage based on the neighboring expressions, but then we need to come up with a well-defined strategy for that. Otherwise, it will become a big mess of heuristics that's hard to understand.
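To make the idea of a well-defined strategy concrete, here is one possible rule, sketched under assumptions: an unlinked node borrows a stage ID only when all of its neighbors with a known stage agree. This is not an agreed design. The names `infer_stage_from_neighbors`, `stage_of`, and `neighbors` are hypothetical.

```python
def infer_stage_from_neighbors(node, stage_of, neighbors):
    """Borrow a stage ID from adjacent nodes only when they all agree; else -1."""
    if stage_of.get(node, -1) != -1:
        return stage_of[node]  # already linked, nothing to infer
    known = {stage_of.get(n, -1) for n in neighbors.get(node, [])} - {-1}
    return known.pop() if len(known) == 1 else -1

# Hypothetical plan fragment: Project is unlinked, its neighbors are not.
stage_of = {"Scan": 2, "Filter": 2, "Exchange": 3, "Project": -1}

agreeing = {"Project": ["Scan", "Filter"]}
conflicting = {"Project": ["Filter", "Exchange"]}

print(infer_stage_from_neighbors("Project", stage_of, agreeing))
print(infer_stage_from_neighbors("Project", stage_of, conflicting))
```

Refusing to guess when neighbors disagree keeps the rule easy to reason about, at the cost of leaving some operators unlinked.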

amahussein commented 3 days ago

We need to investigate further by checking the SHS (Spark History Server) code that parses the RDD information inside a stage. It might expose more information about the linkage between the execs and their stages. This concern has been raised before in https://github.com/NVIDIA/spark-rapids-tools/issues/794