NVIDIA / spark-rapids-benchmarks

Spark RAPIDS Benchmarks – benchmark sets and utilities for the RAPIDS Accelerator for Apache Spark
Apache License 2.0
33 stars 26 forks source link

unmatched in NDS query18 #146

Open johnnyzhon opened 1 year ago

johnnyzhon commented 1 year ago

Parameters: sf3k s3a://spark-data/nds2/parquet_sf3k_decimal

plugin: revision=faf94e423c55f020d68da77745e2748c93eb89f3 branch=HEAD date=2023-04-16T11:50:34Z url=https://github.com/NVIDIA/spark-rapids.git

Detail message: Collected 100 rows in 0.18825602531433105 seconds Row 99: ['AAAAAAAAAABBCAAA', None, None, None, Decimal('38.500000'), Decimal('195.280000'), Decimal('2456.060000'), Decimal('120.755000'), Decimal('-987.980000'), Decimal('1939.500000'), Decimal('1.000000')] ['AAAAAAAAAABBCAAA', None, None, None, Decimal('45.190476'), Decimal('75.922381'), Decimal('520.519048'), Decimal('45.438571'), Decimal('169.410952'), Decimal('1964.857143'), Decimal('2.952381')]

=== Unmatch Queries: ['query18'] ===

Query 18 in stream: -- start query 18 in stream 2 using template query75.tpl WITH all_sales AS ( SELECT d_year ,i_brand_id ,i_class_id ,i_category_id ,i_manufact_id ,SUM(sales_cnt) AS sales_cnt ,SUM(sales_amt) AS sales_amt FROM (SELECT d_year ,i_brand_id ,i_class_id ,i_category_id ,i_manufact_id ,cs_quantity - COALESCE(cr_return_quantity,0) AS sales_cnt ,cs_ext_sales_price - COALESCE(cr_return_amount,0.0) AS sales_amt FROM catalog_sales JOIN item ON i_item_sk=cs_item_sk JOIN date_dim ON d_date_sk=cs_sold_date_sk LEFT JOIN catalog_returns ON (cs_order_number=cr_order_number AND cs_item_sk=cr_item_sk) WHERE i_category='Electronics' UNION SELECT d_year ,i_brand_id ,i_class_id ,i_category_id ,i_manufact_id ,ss_quantity - COALESCE(sr_return_quantity,0) AS sales_cnt ,ss_ext_sales_price - COALESCE(sr_return_amt,0.0) AS sales_amt FROM store_sales JOIN item ON i_item_sk=ss_item_sk JOIN date_dim ON d_date_sk=ss_sold_date_sk LEFT JOIN store_returns ON (ss_ticket_number=sr_ticket_number AND ss_item_sk=sr_item_sk) WHERE i_category='Electronics' UNION SELECT d_year ,i_brand_id ,i_class_id ,i_category_id ,i_manufact_id ,ws_quantity - COALESCE(wr_return_quantity,0) AS sales_cnt ,ws_ext_sales_price - COALESCE(wr_return_amt,0.0) AS sales_amt FROM web_sales JOIN item ON i_item_sk=ws_item_sk JOIN date_dim ON d_date_sk=ws_sold_date_sk LEFT JOIN web_returns ON (ws_order_number=wr_order_number AND ws_item_sk=wr_item_sk) WHERE i_category='Electronics') sales_detail GROUP BY d_year, i_brand_id, i_class_id, i_category_id, i_manufact_id) SELECT prev_yr.d_year AS prev_year ,curr_yr.d_year AS year ,curr_yr.i_brand_id ,curr_yr.i_class_id ,curr_yr.i_category_id ,curr_yr.i_manufact_id ,prev_yr.sales_cnt AS prev_yr_cnt ,curr_yr.sales_cnt AS curr_yr_cnt ,curr_yr.sales_cnt-prev_yr.sales_cnt AS sales_cnt_diff ,curr_yr.sales_amt-prev_yr.sales_amt AS sales_amt_diff FROM all_sales curr_yr, all_sales prev_yr WHERE curr_yr.i_brand_id=prev_yr.i_brand_id AND curr_yr.i_class_id=prev_yr.i_class_id AND curr_yr.i_category_id=prev_yr.i_category_id AND curr_yr.i_manufact_id=prev_yr.i_manufact_id AND curr_yr.d_year=2002 AND prev_yr.d_year=2002-1 AND CAST(curr_yr.sales_cnt AS DECIMAL(17,2))/CAST(prev_yr.sales_cnt AS DECIMAL(17,2))<0.9 ORDER BY sales_cnt_diff,sales_amt_diff LIMIT 100;

-- end query 18 in stream 2 using template query75.tpl

gerashegalov commented 1 year ago

@johnnyzhon Can you add a repro command and some high level description. Was it a single query run?

johnnyzhon commented 1 year ago

Got this failed when run NDS2 power run testing on NGC. Paste command and configuration in following.

johnnyzhon commented 1 year ago

NGC instance : dgx1v.16g.8.norm jdk8| cuda11.5.1| ubuntu20.04

I am not sure this issue could be reproduced each time.