Rather than continue implementing new types of Tensors, it makes more sense to bite the bullet and implement a storage-agnostic class that implementations can override in whole or in part. This will allow all of the neural network components to work seamlessly with GPU/CPU storage, which is not currently supported, and also makes it easy to add new backends (Apache Arrow will be added in this PR using GObject introspection).
Before, `Tensor` and `ClTensor` were separate classes; now it's as easy as specifying a device:

```crystal
a = Tensor.new([3, 3, 2], device: CPU) { |i| i }
b = Tensor.new([2, 3, 4], device: OCL) { |i| i }
c = Tensor.new([3, 4, 5], device: ARROW) { |i| i }
```
Implementations then override certain core methods. For example, here is how the OpenCL backend overrides `tensor_to_crystal_array`:

```crystal
def tensor_to_crystal_array(device : OCL(U)) : Array(U) forall U
  a = Array(U).new(device.size, 0)
  LibCL.cl_enqueue_read_buffer(
    Num::ClContext.instance.queue, device.data,
    LibCL::CL_TRUE, 0_u64, (device.size * sizeof(U)).to_u64, a.to_unsafe, 0_u32,
    nil, nil
  )
  a
end
```
If a device does not support an operation (for example, OpenCL does not support `map`, `each`, etc.), the error is caught at compile time rather than at runtime.
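A rough sketch of how this falls out of Crystal's overload resolution: operations are defined per device type, so a call on a device with no matching overload simply fails to compile (the method names here mirror the examples above, but the exact dispatch mechanism is an assumption on my part):

```crystal
# Sketch only: `map` is defined for the CPU storage type but no
# overload exists for OCL, so the bad call below never compiles.
def map(device : CPU(U), &block : U -> U) forall U
  # CPU implementation iterates the buffer directly...
end

# b = Tensor.new([2, 3, 4], device: OCL) { |i| i }
# b.map { |x| x * 2 }
# => compile-time error: no overload of `map` matches OCL(Int32)
```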
Hopefully this will make the library much more appealing to users who have their own data structures but want to take advantage of Num.cr's capabilities, by allowing them to create their own storage backends.
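As a sketch of what a third-party backend might look like under this design (the base-class name `Num::Backend` and the set of required methods are hypothetical, since the PR is still WIP):

```crystal
# Hypothetical user-defined backend: inherit from the storage-agnostic
# base class and override only the core methods that differ.
class MyMappedStorage(U) < Num::Backend(U)
  def initialize(@path : String, @size : Int32)
    # e.g. memory-map a file instead of allocating on the heap
  end

  # Convert this backend's storage into a plain Crystal array,
  # mirroring the OpenCL override shown above.
  def tensor_to_crystal_array : Array(U)
    # read @size elements of U from the mapped file...
    Array(U).new(@size, 0)
  end
end
```

Any operation the backend does not override would either fall back to a default implementation or fail at compile time, consistent with the OpenCL behavior described above.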
This is very much a WIP and probably will not be released for another month or so. All of the existing functionality will be carried over; it is just time-consuming to port all three current backends at once and add tests for everything.