rapidsai / gpu-bdb

RAPIDS GPU-BDB
Apache License 2.0

Multiple queries failing in 2021-03-16 nightlies #195

Closed by beckernick 3 years ago

beckernick commented 3 years ago

Failing queries:

Tracebacks

Q02

Encountered Exception while running query
Traceback (most recent call last):
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/utils.py", line 180, in raise_on_meta_error
    yield
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5332, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "queries/q02/gpu_bdb_query_02.py", line 55, in reduction_function
    df, keep_cols=["wcs_user_sk", "wcs_item_sk"], time_out=q02_session_timeout_inSec
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/sessionization.py", line 92, in get_distinct_sessions
    df = get_sessions(df, keep_cols, time_out=3600)
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/sessionization.py", line 79, in get_sessions
    df["session_id"] = get_session_id(df, keep_cols, time_out)
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/sessionization.py", line 73, in get_session_id
    assert len(session_ids) == len(df)
AssertionError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 280, in run_dask_cudf_query
    config=config,
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 61, in benchmark
    result = func(*args, **kwargs)
  File "queries/q02/gpu_bdb_query_02.py", line 133, in main
    grouped_df = f_wcs_df.map_partitions(reduction_function, q02_session_timeout_inSec)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 684, in map_partitions
    return map_partitions(func, self, *args, **kwargs)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5385, in map_partitions
    meta = _emulate(func, *args, udf=True, **kwargs)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5332, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/utils.py", line 201, in raise_on_meta_error
    raise ValueError(msg) from e
ValueError: Metadata inference failed in `reduction_function`

Q04

Encountered Exception while running query
Traceback (most recent call last):
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/utils.py", line 180, in raise_on_meta_error
    yield
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5332, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "queries/q04/gpu_bdb_query_04.py", line 102, in reduction_function
    df = get_sessions(df, keep_cols=keep_cols)
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/sessionization.py", line 79, in get_sessions
    df["session_id"] = get_session_id(df, keep_cols, time_out)
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/sessionization.py", line 73, in get_session_id
    assert len(session_ids) == len(df)
AssertionError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 280, in run_dask_cudf_query
    config=config,
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 61, in benchmark
    result = func(*args, **kwargs)
  File "queries/q04/gpu_bdb_query_04.py", line 155, in main
    reduction_function, keep_cols, DYNAMIC_CAT_CODE, ORDER_CAT_CODE
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 684, in map_partitions
    return map_partitions(func, self, *args, **kwargs)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5385, in map_partitions
    meta = _emulate(func, *args, udf=True, **kwargs)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/core.py", line 5332, in _emulate
    return func(*_extract_meta(args, True), **_extract_meta(kwargs, True))
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/contextlib.py", line 130, in __exit__
    self.gen.throw(type, value, traceback)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpu-bdb-automated-tests/lib/python3.7/site-packages/dask/dataframe/utils.py", line 201, in raise_on_meta_error
    raise ValueError(msg) from e
ValueError: Metadata inference failed in `reduction_function`.

Q08

Encountered Exception while running query
Traceback (most recent call last):
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 280, in run_dask_cudf_query
    config=config,
  File "/raid/nicholasb/prod/gpu-bdb/gpu_bdb/bdb_tools/utils.py", line 61, in benchmark
    result = func(*args, **kwargs)
  File "queries/q08/gpu_bdb_query_08.py", line 308, in main
    q08_reviewed_sales_sum.result(),
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/distributed/client.py", line 222, in result
    raise exc.with_traceback(tb)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/dask/optimization.py", line 963, in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/dask/core.py", line 151, in get
    result = _execute_task(task, cache)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/dask/utils.py", line 35, in apply
    return func(*args, **kwargs)
  File "/raid/nicholasb/miniconda3/envs/rapids-gpubdb-20210316/lib/python3.7/site-packages/dask/dataframe/core.py", line 5487, in apply_and_enforce
    df = func(*args, **kwargs)
  File "queries/q08/gpu_bdb_query_08.py", line 209, in reduction_function
    df = get_sessions(df)
  File "queries/q08/gpu_bdb_query_08.py", line 138, in get_sessions
    df["session_id"] = get_session_id(df)
  File "queries/q08/gpu_bdb_query_08.py", line 130, in get_session_id
    assert len(session_ids) == len(df)
AssertionError

beckernick commented 3 years ago

diff 20210315-env.txt 20210316-env.txt
46c46
< cudf=0.19.0a210315=cuda_10.2_py37_g325d5b800b_212
---
> cudf=0.19.0a210316=cuda_10.2_py37_g2f5901ffb4_216
48c48
< cuml=0.19.0a210315=cuda10.2_py37_gf5d86b957_106
---
> cuml=0.19.0a210316=cuda10.2_py37_g96eaf623e_109
57,58c57,58
< dask-cuda=0.19.0a210315=py37_41
< dask-cudf=0.19.0a210315=py37_g325d5b800b_212
---
> dask-cuda=0.19.0a210316=py37_42
> dask-cudf=0.19.0a210316=py37_g2f5901ffb4_216
102c102
< jupyter-server-proxy=3.0.0=pypi_0
---
> jupyter-server-proxy=3.0.1=pypi_0
116,117c116,117
< libcudf=0.19.0a210315=cuda10.2_g325d5b800b_212
< libcuml=0.19.0a210315=cuda10.2_gf5d86b957_106
---
> libcudf=0.19.0a210316=cuda10.2_g2f5901ffb4_216
> libcuml=0.19.0a210316=cuda10.2_g96eaf623e_109
139c139
< librmm=0.19.0a210315=cuda10.2_gcb81c80_40
---
> librmm=0.19.0a210316=cuda10.2_gdd718e2_41
219c219
< rmm=0.19.0a210315=cuda_10.2_py37_gcb81c80_40
---
> rmm=0.19.0a210316=cuda_10.2_py37_gdd718e2_41
254c254
< ucx-py=0.19.0a210315=py37_gcd9efd3_19
---
> ucx-py=0.19.0a210316=py37_gcd9efd3_20

beckernick commented 3 years ago

Likely due to https://github.com/rapidsai/cudf/pull/7490

We will likely need to refactor the queries to account for the updated null handling.

beckernick commented 3 years ago

In the failing commit, after the lines linked below, the metadata dataframe inside get_session_id is:

https://github.com/rapidsai/gpu-bdb/blob/8541f2841cc09d44579f1935d8e17ab956fc218a/gpu_bdb/bdb_tools/sessionization.py#L52-L56

   wcs_user_sk  wcs_item_sk  tstamp_inSec  ... time_delta session_timeout_flag session_change_flag
0            0            0             0  ...       <NA>                 <NA>                <NA>
1            1            1             1  ...          1                False                True

In the prior succeeding commit:

   wcs_user_sk  wcs_item_sk  tstamp_inSec  ...  time_delta session_timeout_flag  session_change_flag
0            0            0             0  ...        <NA>                False                 True
1            1            1             1  ...           1                False                 True
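
The difference between the two frames can be reproduced in a minimal pure-Python sketch, where None stands in for <NA>. This is not the code from sessionization.py: the helpers `diff`, `compare`, `null_or`, `fill_null`, and `assign_session_ids` are hypothetical stand-ins, and filling the nulls is only one possible way to get the earlier flags back (the fix in the eventual PR is not shown in this thread). It assumes that comparisons on a null `time_delta` now return null, and that a row with a null `session_change_flag` is not treated as a session boundary.

```python
TIME_OUT = 3600  # seconds; illustrative value


def diff(values):
    """Difference to the previous row; the first row has no predecessor, so it is null."""
    return [None] + [cur - prev for prev, cur in zip(values, values[1:])]


def compare(values, predicate):
    """Apply a comparison that propagates nulls instead of returning a boolean."""
    return [None if v is None else predicate(v) for v in values]


def null_or(left, right):
    """Element-wise OR, simplified so that any null operand gives a null result."""
    return [None if a is None or b is None else (a or b) for a, b in zip(left, right)]


def fill_null(values, default):
    """Replace nulls with an explicit default."""
    return [default if v is None else v for v in values]


def assign_session_ids(change_flags):
    """Start a new session at every True flag; rows before the first boundary get no ID."""
    ids, current = [], -1
    for flag in change_flags:
        if flag is True:
            current += 1
        if current >= 0:
            ids.append(current)
    return ids


# Two-row frame shaped like the metadata dataframe above.
user_sk = [0, 1]
tstamp_in_sec = [0, 1]

user_change = compare(diff(user_sk), lambda d: d != 0)
session_timeout = compare(diff(tstamp_in_sec), lambda d: d > TIME_OUT)

# Null-propagating behaviour: the first row's flag is null, so it is never a boundary.
broken_flags = null_or(session_timeout, user_change)
broken_ids = assign_session_ids(broken_flags)

# Filling the nulls explicitly reproduces the flags from the prior succeeding commit.
fixed_flags = null_or(fill_null(session_timeout, False), fill_null(user_change, True))
fixed_ids = assign_session_ids(fixed_flags)

print(len(broken_ids) == len(user_sk))  # the length check fails on the null flags
print(len(fixed_ids) == len(user_sk))   # and passes once the nulls are filled
```

In this sketch the first row gets no session ID when its flag is null, so the number of IDs no longer matches the number of rows, which is the kind of mismatch that `assert len(session_ids) == len(df)` guards against.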

beckernick commented 3 years ago

A local fix passes the correctness checks. I will push a PR soon.