pyg-team / pytorch-frame

Tabular Deep Learning Library for PyTorch
https://pytorch-frame.readthedocs.io
MIT License

TypeError in `Dataset.materialize` when ser.dtype is bool #430

Closed. Kh4L closed this issue 1 month ago

Kh4L commented 1 month ago

Cross-posting from https://github.com/snap-stanford/relbench/issues/255:

python gnn_node.py --dataset=rel-stack --task=user-engagement --epochs 20

The command above fails in `np.quantile`:

Traceback (most recent call last):
  File "/workspace/relbench/examples/gnn_node.py", line 70, in <module>
    data, col_stats_dict = make_pkey_fkey_graph(
  File "/usr/local/lib/python3.10/dist-packages/relbench/modeling/graph.py", line 71, in make_pkey_fkey_graph
    dataset = Dataset(
  File "/usr/local/lib/python3.10/dist-packages/torch_frame/data/dataset.py", line 594, in materialize
    self._col_stats[col] = compute_col_stats(
  File "/usr/local/lib/python3.10/dist-packages/torch_frame/data/stats.py", line 179, in compute_col_stats
    stats = {
  File "/usr/local/lib/python3.10/dist-packages/torch_frame/data/stats.py", line 180, in <dictcomp>
    stat_type: stat_type.compute(ser.dropna(), sep)
  File "/usr/local/lib/python3.10/dist-packages/torch_frame/data/stats.py", line 107, in compute
    return np.quantile(
  File "<__array_function__ internals>", line 200, in quantile
  File "/usr/local/lib/python3.10/dist-packages/numpy/lib/function_base.py", line 4461, in quantile
    return _quantile_unchecked(
  File "/usr/local/lib/python3.10/dist-packages/numpy/lib/function_base.py", line 4473, in _quantile_unchecked
    return _ureduce(a,
  File "/usr/local/lib/python3.10/dist-packages/numpy/lib/function_base.py", line 3752, in _ureduce
    r = func(a, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/numpy/lib/function_base.py", line 4639, in _quantile_ureduce_func
    result = _quantile(arr,
  File "/usr/local/lib/python3.10/dist-packages/numpy/lib/function_base.py", line 4756, in _quantile
    result = _lerp(previous,
  File "/usr/local/lib/python3.10/dist-packages/numpy/lib/function_base.py", line 4573, in _lerp
    diff_b_a = subtract(b, a)
TypeError: numpy boolean subtract, the `-` operator, is not supported, use the bitwise_xor, the `^` operator, or the logical_xor function instead.

NumPy version: 1.24.4

yiweny commented 1 month ago

This is already fixed on master. In the relbench issue, I suggested that @Kh4L clear the cache.
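
For readers on an installed version that still hits the traceback above: the error comes from `np.quantile` interpolating between neighboring values with a subtraction, which NumPy does not define for boolean arrays. The snippet below is a minimal sketch of a workaround that casts the column to a float dtype before computing quantiles. The helper name `quantiles_for_series` and the chosen quantile levels are assumptions made for illustration; this is not the torch_frame code, and it is not necessarily how the fix on master works.

```python
import numpy as np
import pandas as pd


def quantiles_for_series(ser: pd.Series) -> list[float]:
    """Hypothetical helper: quantiles of a column that may hold booleans."""
    # Drop missing values, then convert to float64 so that True/False
    # become 1.0/0.0 and NumPy can subtract neighboring values.
    values = ser.dropna().to_numpy(dtype=np.float64)
    # Quantile levels picked for illustration only.
    return np.quantile(values, q=[0, 0.25, 0.5, 0.75, 1]).tolist()


bool_ser = pd.Series([True, False, True, True])
print(quantiles_for_series(bool_ser))
```

Casting inside the helper keeps the original column untouched, so the stored data stays boolean and only the statistics computation sees floats.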