pydata / sparse

Sparse multi-dimensional arrays for the PyData ecosystem
https://sparse.pydata.org
BSD 3-Clause "New" or "Revised" License
604 stars 127 forks source link

ENH: Implement `broadcast_to` function #782

Closed mtsokol closed 1 month ago

mtsokol commented 1 month ago

Hi @hameerabbasi,

This PR adds broadcast_to Array API function.

It it only broadcasts missing dimensions, for 1-size dimension it can't broadcast as it's a limitation of linalg.broadcast in MLIR.

codspeed-hq[bot] commented 1 month ago

CodSpeed Performance Report

Merging #782 will degrade performances by 12.39%

Comparing broadcast_to-func (8448d56) with main (1593a0f)

Summary

⚡ 10 improvements ❌ 1 regressions ✅ 329 untouched benchmarks

:warning: _Please fix the performance issues or acknowledge them on CodSpeed._

Benchmarks breakdown

Benchmark main broadcast_to-func Change
test_index_fancy[side=100-rank=1-format='coo'] 1,417 µs 639.6 µs ×2.2
test_index_slice[side=100-rank=2-format='gcxs'] 2.4 ms 2.7 ms -12.39%
test_elemwise[f=<built-in function add>-backend='Finch'-side=1000] 1.9 ms 1.5 ms +28.29%
test_elemwise[f=<built-in function add>-backend='Finch'-side=100] 1,148.7 µs 714.3 µs +60.81%
test_elemwise[f=<built-in function add>-backend='Finch'-side=500] 1,298.4 µs 881.9 µs +47.23%
test_elemwise[f=<built-in function gt>-backend='Finch'-side=1000] 1.7 ms 1.4 ms +21.24%
test_elemwise[f=<built-in function gt>-backend='Finch'-side=100] 982.4 µs 705.8 µs +39.2%
test_elemwise[f=<built-in function gt>-backend='Finch'-side=500] 1,187.4 µs 877.5 µs +35.31%
test_elemwise[f=<built-in function mul>-backend='Finch'-side=1000] 1,149.7 µs 735.3 µs +56.35%
test_elemwise[f=<built-in function mul>-backend='Finch'-side=100] 1,110.3 µs 691.2 µs +60.64%
test_elemwise[f=<built-in function mul>-backend='Finch'-side=500] 1,127 µs 712.2 µs +58.24%
hameerabbasi commented 1 month ago

I'm hesitant about merging this as it contains lots of heuristics to figure out the output format -- not to mention there should be a "broadcast level" that's the equivalent of a "zero stride", or perhaps we can ignore the broadcast indices on the Python side.

mtsokol commented 1 month ago

I'm hesitant about merging this as it contains lots of heuristics to figure out the output format -- not to mention there should be a "broadcast level" that's the equivalent of a "zero stride", or perhaps we can ignore the broadcast indices on the Python side.

The heuristics are gone (they're already added by a different PR). What do you mean by a "broadcast level"?

hameerabbasi commented 1 month ago

What do you mean by a "broadcast level"?

Something that represents a zero-stride, or an ignored index.

mtsokol commented 1 month ago

Can you give an Array API broadcast_to(...) example call that uses ignored index? I'm not sure which feature you mean.

mtsokol commented 1 month ago

Decided that for now broadcast_to is a copying operation and Tensor views should be included on the MLIR or Finch Dialect side.

hameerabbasi commented 1 month ago

Force-merged due to the CI hiccup. Thanks, @mtsokol!