After adding a new folder raw_metrics/app_id under rapids_4_spark_qualification_output, it looks like the following code is broken:
# search profile sub directories for appIds
app_ids = find_paths(
    prof, RegexPattern.app_id.match, return_directories=True
)
The search now returns the same app ID directory twice: once under Qualification and once under Profiler.
Another problem is that the prediction module does not find anything to load even though the core tools generated output. The code path and the resulting log line are:
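A minimal sketch of how the duplicate can arise, and one possible way to guard against it. This is not the project's code: `APP_ID_RE`, `find_matching_dirs`, and `dedupe_by_app_id` are hypothetical stand-ins for `RegexPattern.app_id` and `find_paths`. The `rapids_4_spark_profile` folder name and the assumption that both output trees sit under the searched root are also made up for illustration.

```python
import os
import re
import tempfile

# Hypothetical app ID pattern; the real RegexPattern.app_id may differ.
APP_ID_RE = re.compile(r'^app-\d+-\d+$')


def find_matching_dirs(root):
    """Return every directory under root whose name matches APP_ID_RE."""
    matches = []
    for dirpath, dirnames, _ in os.walk(root):
        for name in dirnames:
            if APP_ID_RE.match(name):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def dedupe_by_app_id(paths):
    """Keep the first path seen for each app ID (the directory base name)."""
    seen = {}
    for path in paths:
        seen.setdefault(os.path.basename(path), path)
    return list(seen.values())


def demo():
    """Build a throwaway layout and count raw vs. de-duplicated matches."""
    with tempfile.TemporaryDirectory() as root:
        app_id = 'app-20240604222507-0000'  # made-up app ID
        # Assumed layout: one app directory per tool output tree.
        os.makedirs(os.path.join(root, 'rapids_4_spark_profile', app_id))
        os.makedirs(os.path.join(
            root, 'rapids_4_spark_qualification_output', 'raw_metrics', app_id))
        found = find_matching_dirs(root)
        return len(found), len(dedupe_by_app_id(found))
```

De-duplicating by base name is only one option. Restricting the search to the profiler subtree would avoid the second match entirely, if that fits how the caller uses the result.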
for dataset, input_df in processed_dfs.items():
    if not input_df.empty:
        # ...
    else:
        logger.warning('Nothing to predict for dataset %s', dataset)
2024-06-04 22:25:07,738 WARNING spark_rapids_tools.tools.model_xgboost: Nothing to predict for dataset qual_20240605032442_CD14eAde