willow-ahrens / Finch.jl

Sparse tensors in Julia and more! Data-structure-driven array programming language.
http://willowahrens.io/Finch.jl/
MIT License

`RuleRewriteError` in `.//` and `rem`. `MethodError` in `prod` #474

Closed · mtsokol closed this 3 months ago

mtsokol commented 5 months ago

Hi @willow-ahrens!

Here are a few errors that I get for the .// and rem element-wise ops, and for prod/sum on SwizzleArray only:

using Finch

arr = [1 2 1 2; 2 1 2 1]
arr2 = arr .+ 2

tns = Tensor(Dense(Dense(Element(0))), arr)
tns2 = Tensor(Dense(Dense(Element(0))), arr2)

sw = swizzle(tns, 1, 2)
sw2 = swizzle(tns2, 1, 2)

broadcast(./, sw, sw2)  # passes
broadcast(.//, sw, sw2)  # fails with RewriteTools.RuleRewriteError
broadcast(rem, sw, sw2)  # fails with RewriteTools.RuleRewriteError

# only fails for SwizzleArray; works for Tensor
# ERROR: MethodError: no method matching iterate(::Finch.SwizzleArray{(1, 2), Tensor{DenseLevel{Int64, DenseLevel{Int64, ElementLevel{0, Int64, Int64, Vector{Int64}}}}}})
prod(sw)
sum(sw)
willow-ahrens commented 5 months ago

For .// and rem, could you post the relevant error?

For prod and sum, I think we're just missing an eager overload for SwizzleArray.
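A minimal sketch of what such an overload might look like, assuming Finch's lazy/compute API (the method bodies here are illustrative, not the actual fix):

using Finch

# Hypothetical sketch: forward eager reductions on SwizzleArray through
# the lazy interface, which already understands swizzled dimensions.
Base.sum(sw::Finch.SwizzleArray) = compute(sum(lazy(sw)))
Base.prod(sw::Finch.SwizzleArray) = compute(prod(lazy(sw)))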

mtsokol commented 5 months ago

Sure!

Here's the error log: https://gist.github.com/mtsokol/e8deaa3e44dc648fa12e4858cd97d2d8 (caused by: ArgumentError: invalid rational: zero(Int64)//zero(Int64))
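The root cause reproduces with plain scalars, since constructing the rational 0//0 is itself an error in Julia:

julia> 0 // 0
ERROR: ArgumentError: invalid rational: zero(Int64)//zero(Int64)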

willow-ahrens commented 5 months ago

Okay, so we see the issue on line 34 of the gist. What should be the result of 0 // 0?

willow-ahrens commented 5 months ago

The challenge is that you are calling // on two sparse arrays, and Finch doesn't know what the default is. I'll also point out that in Julia, // produces a rational number. If you wanted floor division, you could use fld. That will still give a zero-division error, but I think it's closer to what you want.
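For reference, the distinction between the two in plain Julia:

julia> 5 // 2
5//2

julia> fld(5, 2)
2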

willow-ahrens commented 5 months ago

I think the best answer is either to define our own function myfld with the semantics we want, e.g. myfld(0, 0) = NaN, or to decide that we actually do want to enforce some sort of nonzero default when we do fld.
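A minimal sketch of the first option, with the name and NaN semantics taken from this comment (illustrative only):

# Hypothetical sketch: like fld, but total on (0, 0). Returning NaN
# forces a floating-point result for integer inputs; a nonzero integer
# divided by zero still throws DivideError, as in Base.
myfld(x, y) = (iszero(x) && iszero(y)) ? oftype(float(x), NaN) : fld(x, y)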

willow-ahrens commented 5 months ago

@mtsokol, do you think this is a bug or correct behavior?

mtsokol commented 5 months ago

Hi! I haven't had time to look at it yet, sorry! I will try to take care of it this week.

willow-ahrens commented 3 months ago

@hameerabbasi what is the fill value of the result of SparseArray // SparseArray in pydata/sparse? Should it be zero?

hameerabbasi commented 3 months ago

I guess it would be NaN, but // is integer division, so probably something like INT_MAX.

willow-ahrens commented 3 months ago

It's tricky to fight against the tide of whatever Julia or Python has decided for these operators. In Python, we have:

Python 3.12.3 (main, Apr  9 2024, 08:09:14) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 0//0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero
>>> 0.0//0.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float floor division by zero
>>> 0.0/0.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float division by zero
>>> 0%0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: integer modulo by zero

In Julia, we have:

julia> 0/0
NaN

julia> 0.0/0.0
NaN

julia> fld(0, 0)
ERROR: DivideError: integer division error
Stacktrace:
 [1] div
   @ ./int.jl:295 [inlined]
 [2] div
   @ ./div.jl:308 [inlined]
 [3] div
   @ ./div.jl:353 [inlined]
 [4] fld(a::Int64, b::Int64)
   @ Base ./div.jl:319
 [5] top-level scope
   @ REPL[3]:1

julia> fld(0.0, 0.0)
NaN

julia> mod(0.0, 0.0)
NaN

julia> mod(0, 0)
ERROR: DivideError: integer division error
Stacktrace:
 [1] div
   @ ./int.jl:295 [inlined]
 [2] div
   @ ./div.jl:308 [inlined]
 [3] div
   @ ./div.jl:353 [inlined]
 [4] fld
   @ ./div.jl:319 [inlined]
 [5] mod(x::Int64, y::Int64)
   @ Base ./int.jl:287
 [6] top-level scope
   @ REPL[6]:1

I think it should be an error to divide a sparse array by a sparse array when the fill value is zero, but we should support it when the fill value is nonzero. This already works, for example:

julia> A = redefault!(fsprand(5, 5, 0.5), 1.0)
SparseCOO{2} (1.0) [:,1:5]
├─ [2, 1]: 0.491575
├─ [3, 3]: 0.129094
├─ [5, 3]: 0.975127
├─ [3, 4]: 0.942453
├─ [2, 5]: 0.0867974
├─ [3, 5]: 0.292831
├─ [4, 5]: 0.331048
└─ [5, 5]: 0.364415

julia> 1 ./ A
Dense [:,1:5]
├─ [:, 1]: Dense [1:5]
│  ├─ [1]: 1.0
│  ├─ [2]: 2.03428
│  ├─ [3]: 1.0
│  ├─ [4]: 1.0
│  └─ [5]: 1.0
├─ [:, 2]: Dense [1:5]
│  ├─ [1]: 1.0
│  ├─ [2]: 1.0
│  ├─ [3]: 1.0
│  ├─ [4]: 1.0
│  └─ [5]: 1.0
├─ [:, 3]: Dense [1:5]
│  ├─ [1]: 1.0
│  ├─ [2]: 1.0
│  ├─ [3]: 7.74628
│  ├─ [4]: 1.0
│  └─ [5]: 1.02551
├─ [:, 4]: Dense [1:5]
│  ├─ [1]: 1.0
│  ├─ [2]: 1.0
│  ├─ [3]: 1.06106
│  ├─ [4]: 1.0
│  └─ [5]: 1.0
└─ [:, 5]: Dense [1:5]
   ├─ [1]: 1.0
   ├─ [2]: 11.5211
   ├─ [3]: 3.41494
   ├─ [4]: 3.02071
   └─ [5]: 2.74413

julia> default(1 ./ A)
1.0
willow-ahrens commented 3 months ago

@mtsokol I've closed this as not planned, since most of these cases involve 0//0, which is an error in Python. But let me know if you feel strongly that any of these cases should work.

hameerabbasi commented 3 months ago

Could you try with NumPy? A dense array (stored in a sparse format) would always fail for this op, which doesn't seem right, as the fill value would never actually be used.

willow-ahrens commented 3 months ago

That's true. I think we should say that a sparse array is allowed to evaluate expressions on the fill value even if the result isn't used. This comes down to an assumption of function purity: we assume we're allowed to evaluate the functions as many times as we want, with no side effects.
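A minimal sketch of that assumption, with hypothetical names (not Finch internals): an element-wise kernel computes the output fill by applying the function to the input fills up front, whether or not any position of the result ever reads it.

# Hypothetical sketch: element-wise f over the stored entries of two
# operands, each with a fill value for its unstored positions.
function elementwise(f, a, fill_a, b, fill_b)
    # Purity assumed: safe to evaluate f on the fills eagerly, even if
    # the result fill is never read (this is where fld(0, 0) can throw).
    out_fill = f(fill_a, fill_b)
    return map(f, a, b), out_fill
end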

willow-ahrens commented 3 months ago

I do wonder if returning an IntegerOverflowExceptionValue would be a possible approach.

hameerabbasi commented 3 months ago

np.asarray(0) // np.asarray(0) gives 0 with a RuntimeWarning, which is the behaviour that we should expect.

willow-ahrens commented 3 months ago

Is it? If 0//0 errors, shouldn't the same operation inside an array error too?

willow-ahrens commented 3 months ago

I guess pydata/sparse does the same thing:

>>> sparse.asarray(0) // sparse.asarray(0)
/Users/willow/Library/Caches/pypoetry/virtualenvs/bar-esaLyucf-py3.11/lib/python3.11/site-packages/sparse/numba_backend/_umath.py:522: RuntimeWarning: divide by zero encountered in floor_divide
  fill_value_array = self.func(*np.broadcast_arrays(*zero_args), dtype=self.dtype, **self.kwargs)
<COO: shape=(), dtype=int64, nnz=0, fill_value=0>
hameerabbasi commented 3 months ago

I do agree it's not what one would usually expect for scalars, but in large arrays, failing due to one value would make little sense (as the whole result would be discarded). For this reason, NumPy raises a warning rather than an error.

willow-ahrens commented 3 months ago

I suppose in these situations it helps to enumerate approaches:

  1. fld(0, 0) returns 0 inside Finch, and we warn (same as Python, but somewhat tricky because it disagrees with Julia semantics).
  2. fld(0, 0) errors inside Finch; it's an error to fld two sparse matrices with zero fill. There are a few approaches to handling the error (keep in mind the result does need some fill value):
     a. Finch catches that error and recommends the use of an alternative function which does not throw.
     b. Finch catches that error and recommends the use of a different input fill value (like 1).
     c. Finch doesn't catch the error, and finch-tensor uses an alternative function for fld under the hood that does not throw.

Regarding alternative fld functions, my_fld(0, 0) could return either 0 with a warning or a special ExceptionValue that we define.

willow-ahrens commented 3 months ago

I'm really liking the idea that we would just define a new function my_fld inside Finch.jl, where my_fld(0, 0) returns 0 with a warning, and recommend its use whenever possible.

finch-tensor would use my_fld instead of fld, my_rem instead of rem, etc., to get the desired semantics.
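A minimal sketch of that proposal, assuming NumPy-style semantics (my_fld is the name from this thread; the implementation is illustrative, not Finch's actual code):

# Hypothetical sketch: warn and return zero on integer division by
# zero, rather than throwing a DivideError like Base.fld does.
function my_fld(x::Integer, y::Integer)
    if iszero(y)
        @warn "divide by zero encountered in my_fld" maxlog = 1
        return zero(promote_type(typeof(x), typeof(y)))
    end
    return fld(x, y)
end
my_fld(x, y) = fld(x, y)  # floats already give NaN/Inf instead of throwing

With this, my_fld(0, 0) warns and returns 0, matching np.asarray(0) // np.asarray(0).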

willow-ahrens commented 3 months ago

In discussions we realized that this is actually undefined behavior for array_api.

willow-ahrens commented 3 months ago

But since I think it would be annoying to deal with an error in this case, I'll match the NumPy behavior using the approach described above.