crystal-data / num.cr

Scientific computing in pure Crystal
MIT License

Feature request: range-based tensor creation #28

Closed: jtanderson closed this issue 4 years ago

jtanderson commented 4 years ago

Add a range-based constructor similar to numpy.arange or numpy.linspace that takes a start (inclusive), an end (exclusive), and a delta:

puts Tensor.arange(3)
# [0, 1, 2]

puts Tensor.arange(3.0)
# [ 0.0,  1.0,  2.0]

puts Tensor.arange(3, 7)
# [3, 4, 5, 6]

puts Tensor.arange(3, 7, 2)
# [3, 5]

puts Tensor.arange(10, 5, -2)
# [10, 8, 6]

puts Tensor.arange(15, 10, 1)
# []

puts Tensor.arange(3, 6, 0.5)
# [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]

Edit: added an empty-range edge case example. Edit 2: added a non-integer delta example.
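The requested semantics (inclusive start, exclusive end, signed delta) can be sketched in plain Crystal without any library. The helper name `arange_sketch` is hypothetical and returns an ordinary Array rather than a Tensor; it is meant only to illustrate the expected values, not how num.cr or numpy implement their range functions.

```crystal
# Hypothetical helper, not part of num.cr: returns a plain Array to show
# the half-open [start, stop) behaviour with a positive or negative step.
def arange_sketch(start : T, stop : T, step : T) : Array(T) forall T
  raise ArgumentError.new("step must be non-zero") if step == 0

  result = [] of T
  current = start
  # Walk towards stop in the direction of step; stop itself is excluded.
  while (step > 0 && current < stop) || (step < 0 && current > stop)
    result << current
    current += step
  end
  result
end

p arange_sketch(3, 7, 2)       # [3, 5]
p arange_sketch(10, 5, -2)     # [10, 8, 6]
p arange_sketch(15, 10, 1)     # []
p arange_sketch(3.0, 6.0, 0.5) # [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
```

Because all three arguments share the free type variable `T`, the element type follows the inputs: integers in give integers out, floats in give floats out. Mixed calls such as `arange_sketch(3, 6, 0.5)` do not compile in this sketch.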

christopherzimmerman commented 4 years ago

Similar to the other issue, this one should be (mostly) implemented. I'll put my comments below, since I definitely think there is room for improvement here! The methods are not implemented on the Tensor; they are implemented as part of the Num namespace.

puts Num.arange(3)
# Already works

puts Num.arange(3.0)
# Workaround would be Num.arange(3, dtype: Float32), but would welcome a PR to allow this

puts Num.arange(3, 7)
# Already works

puts Num.arange(3, 7, 2)
# Already works

puts Num.arange(10, 5, -2)
# No negative step supported yet, but would welcome a PR to allow it

puts Num.arange(15, 10, 1)
# This will not be allowed; I raise an error if no values would be returned.

puts Num.arange(3, 6, 0.5)
# The current workaround is Num.arange(3, 6, 0.5, dtype: Float32), but would welcome a PR to infer dtype from step

In general, as with numpy, when dealing with floats you want to use linspace, which I do implement (Num.linspace). That said, I would again welcome a PR to make the current arange more consistent.
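The reason linspace is usually preferred for floats is that it fixes the number of points up front and derives the spacing from it, so rounding in the step cannot change the length of the result. Below is a minimal plain-Crystal sketch of that idea. The name `linspace_sketch` is hypothetical, and including the endpoint is an assumption borrowed from numpy's default; Num.linspace may differ.

```crystal
# Hypothetical helper, not num.cr's Num.linspace: returns `num` evenly
# spaced values from start to stop, endpoint included (assumed, as in
# numpy's default behaviour).
def linspace_sketch(start : Float64, stop : Float64, num : Int32) : Array(Float64)
  raise ArgumentError.new("num must be at least 2") if num < 2

  step = (stop - start) / (num - 1)
  # Each value is computed from its index rather than by repeated addition,
  # so rounding error does not accumulate along the sequence.
  Array(Float64).new(num) { |i| start + i * step }
end

p linspace_sketch(3.0, 5.5, 6) # [3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
```

The result always has exactly `num` elements, whereas a float-stepped range has to decide its length by comparing rounded sums against the end value.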

jtanderson commented 4 years ago

Awesome! I can definitely look into getting a PR in to take care of the couple of extra things in there. On a side note, I haven't really seen a statistics library for Crystal. Is that something that might be of interest under the crystal-data organization? If so, I'd be interested in starting to lay some groundwork for the basic functionality (basically using the MATLAB or numpy API as a model).

christopherzimmerman commented 4 years ago

I implement basic statistical methods such as min, max, mean, std (for entire Tensors and along axes), but not much beyond that. I think there is a definite need for that, and I would be very interested in having it included in Num.cr, maybe Num::Stats?

If it starts to become a massive module we could eventually split it out into its own library, but to make sure it stays up to date with changes being made to Num, it is probably better to keep them together for now.

If you wouldn't mind opening a new issue about the statistics library, with a brief roadmap of what you would like to see implemented, we can discuss it further there.

christopherzimmerman commented 4 years ago

To close out this issue, I ended up moving the range/n-space methods onto the Tensor itself, to take advantage of generic type inference. The methods have been removed from Num.

Here is the code snippet you originally provided and its output with the current master branch:

puts Tensor.range(3)
puts Tensor.range(3.0)
puts Tensor.range(3, 7)
puts Tensor.range(3, 7, 2)
puts Tensor.range(10, 5, -2)
puts Tensor.range(3.0, 6.0, 0.5) # Minor change: all inputs must be the same type here

# [0, 1, 2]
# [0, 1, 2] (this is a float dtype; it prints without decimals because no value has a fractional part)
# [3, 4, 5, 6]
# [3, 5]
# [10,  8,  6]
# [3  , 3.5, 4  , 4.5, 5  , 5.5]