Closed mattahrens closed 4 days ago
@mattahrens do we still need this issue? Currently we skip Photon jobs in the Qualification tool.
This might still be prioritized in the future, so we can keep it open.
Discussed the next steps for Photon integration into QualX with @leewyang and @eordentlich.
Assumptions:
Solution:
QualX: Parse spark_properties.csv and, based on the type of the FIRST app, load the relevant model (photon or spark).
Python Tools CLI: Parse spark_properties.csv and, based on the type of the FIRST app, select the strategy to use for speed-up categorization for all event logs:
Alternatives:
Parse spark_properties.csv and pass the Photon/Spark type as an additional column to QualX.
cc: @amahussein @tgravescs
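For illustration, detecting the app type from spark_properties.csv and picking a model could look roughly like the sketch below. The column names (propertyName, propertyValue), the property used as the Photon hint, and the model names are all assumptions made for this example; they are not taken from the tool's code.

```python
import csv
import io
import re

# ASSUMPTION: the CSV has "propertyName" and "propertyValue" columns, and a
# Photon run can be recognized from the Databricks runtime version string.
# Both the column names and the property/regex below are illustrative only.
PHOTON_HINTS = {
    "spark.databricks.clusterUsageTags.sparkVersion": re.compile(
        r"photon", re.IGNORECASE
    ),
}


def detect_app_type(csv_text: str) -> str:
    """Return "photon" if any hint property matches, otherwise "spark"."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        pattern = PHOTON_HINTS.get(row["propertyName"])
        if pattern and pattern.search(row["propertyValue"]):
            return "photon"
    return "spark"


def select_model_name(app_type: str) -> str:
    """Map an app type to a model name (hypothetical names)."""
    return {"photon": "photon_model", "spark": "spark_model"}[app_type]


# Usage with two tiny in-memory CSVs.
photon_csv = (
    "propertyName,propertyValue\n"
    "spark.databricks.clusterUsageTags.sparkVersion,13.3.x-photon-scala2.12\n"
)
spark_csv = "propertyName,propertyValue\nspark.master,yarn\n"

first_app_type = detect_app_type(photon_csv)
model_name = select_model_name(first_app_type)
```

With this shape, the "FIRST app" rule simply means calling detect_app_type once on the first app's properties and reusing the result for every event log in the run.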
Agreed that heterogeneous support makes sense, but can that be done in a follow-up PR? I don't think it's needed in this first iteration.
Sure Matt. This would make QualX simpler. Updated the approach. We can add heterogeneous support later if needed.
Users do not provide heterogeneous event logs
Are we going to fail or warn if we recognize this happening? I think a lot of companies will have mixed event logs.
Eventually we would want to add support for a mixed set. This approach is mainly to simplify the development process and proceed iteratively.
Both approaches have pros and cons.
Approach 1 (fail). Pros: Users do not get incorrect recommendations. Cons: User experience may be compromised.
Approach 2 (warn). Pros: User experience is better; there are no failures. Cons: Users may get unexpected recommendations, and errors can pass silently or only as warnings.
IMO, Approach 1 makes more sense. Although the user experience is compromised, unexpected or silent errors will be avoided.
What is the expected time frame for adding heterogeneous support? If we are going to add it soon, then it might not matter too much.
We could always choose whatever type the first event log has and log it. Then, if we come to one of the opposite type, we skip running on that event log but mark it as skipped because of this condition, so that it is obvious to the user. The question is whether skipping makes it obvious enough.
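A minimal sketch of this skip-and-mark idea follows. The LogStatus record, the status strings, and the triage_event_logs helper are hypothetical names invented for the example, not part of the tools.

```python
import logging
from dataclasses import dataclass
from typing import List, Optional, Tuple

logger = logging.getLogger("qual_sketch")


@dataclass
class LogStatus:
    path: str
    app_type: str
    status: str  # "processed" or "skipped"
    reason: Optional[str] = None


def triage_event_logs(logs: List[Tuple[str, str]]) -> List[LogStatus]:
    """Take (path, app_type) pairs in input order and mark mismatches as skipped."""
    if not logs:
        return []
    # The first event log decides which type the whole run handles.
    expected = logs[0][1]
    logger.info("Using app type %r from first event log %s", expected, logs[0][0])
    results = []
    for path, app_type in logs:
        if app_type == expected:
            results.append(LogStatus(path, app_type, "processed"))
        else:
            reason = (
                f"app type {app_type!r} differs from "
                f"first event log type {expected!r}"
            )
            logger.warning("Skipping %s: %s", path, reason)
            results.append(LogStatus(path, app_type, "skipped", reason))
    return results


statuses = triage_event_logs(
    [("log1", "photon"), ("log2", "spark"), ("log3", "photon")]
)
```

Carrying the reason in the per-log result, rather than only in the console log, is one way to make the skip visible in the final report.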
From a development perspective, adding heterogeneous support would be a small change on the Python tools side.
@leewyang Would it be feasible for QualX to support heterogeneous event logs (photon + spark) easily? If yes, then we can directly add heterogeneous support.
@parthosa We'd just need something that we could parse that identifies each uniquely. As you mentioned earlier, I think we could just parse the spark_properties.csv and add an indicator to our profile/features dataframe, then we'd group/filter by that indicator before loading the associated qualx model and running prediction. The trickiest part would be reconstructing the correct order (if required) by stitching the two results (for photon and spark) back together, but I think it's doable.
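The group, predict, and stitch flow described here could be sketched with pandas as follows. The appType column name and the constant-valued stand-in "models" are assumptions for illustration; real QualX models and feature columns will differ.

```python
import pandas as pd

# Hypothetical stand-ins for the per-type QualX models. Each takes a features
# frame and returns a Series of predicted speedups aligned to its index.
MODELS = {
    "spark": lambda df: pd.Series(2.0, index=df.index),
    "photon": lambda df: pd.Series(1.5, index=df.index),
}


def predict_by_app_type(features: pd.DataFrame) -> pd.DataFrame:
    """Predict per app type, then restore the original row order.

    Assumes a non-empty frame with a unique index and an "appType" column.
    """
    parts = []
    for app_type, group in features.groupby("appType", sort=False):
        model = MODELS[app_type]  # load the model associated with this type
        part = group.copy()
        part["speedup"] = model(group)
        parts.append(part)
    # Concatenation leaves rows grouped by type; reindex to the input order.
    return pd.concat(parts).loc[features.index]


features = pd.DataFrame(
    {"appId": ["a1", "a2", "a3"], "appType": ["spark", "photon", "spark"]}
)
predictions = predict_by_app_type(features)
```

Because each group keeps its original index labels, selecting with the input index is enough to put the stitched result back in order.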
That's great then.
trickiest part would be reconstructing the correct order
Ordering should not be a problem since we do a left join between output DF from JAR and resulting DF from QualX based on App Id
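As a rough illustration of why the join takes care of ordering, a pandas left merge keeps the row order of the left frame. The frames and every column other than "App Id" are made up for this sketch.

```python
import pandas as pd


def attach_predictions(jar_df: pd.DataFrame, qualx_df: pd.DataFrame) -> pd.DataFrame:
    """Left-join QualX predictions onto the JAR output by "App Id".

    validate="m:1" raises if "App Id" is duplicated on the QualX side,
    which would otherwise multiply rows in the result.
    """
    return jar_df.merge(qualx_df, on="App Id", how="left", validate="m:1")


# JAR output in its own order; QualX results arrive in a different order
# and are missing one app.
jar_df = pd.DataFrame({"App Id": ["a3", "a1", "a2"], "App Name": ["x", "y", "z"]})
qualx_df = pd.DataFrame({"App Id": ["a1", "a2"], "speedup": [2.0, 1.5]})

joined = attach_predictions(jar_df, qualx_df)
```

Apps with no QualX prediction keep their row and get a missing value in the speedup column, so the output row count always matches the JAR output.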
@mattahrens: Since adding heterogeneous support is quite feasible on both the QualX and Python Tools sides, we should proceed to it directly instead of building an intermediate stage that will eventually be modified.
Closing this issue as all subtasks for adding support for Photon event logs have been completed.
To run the tool with Photon event logs, use the following command:
spark_rapids qualification --platform databricks-aws --eventlogs <photon-event-log>
I would like to see estimated speedup on GPU compared against Databricks Photon. This work will include parsing Databricks Photon event logs and then generating speedup factors for Photon operators to Spark RAPIDS operators.