Open antimora opened 2 months ago
CC @nathanielsimard , @laggui , @louisfd
@LaurentMazare has confirmed Candle supports 0D tensors:

> Scalar values (tensors with 0 dimensions) should be supported. Empty tensors (multiple dimensions but with one of them being zero) should also be supported, but only to some extent. Certainly interested if you find places where this doesn't work properly.
PyTorch supports 0D tensors
cc @ashdtu
@laggui found that ndarray supports 0-dim arrays: https://docs.rs/ndarray/latest/ndarray/type.Array0.html
@nathanielsimard and I had an offline conversation.
Here's a summary of the conversation for others:
We discussed the need to support scalar tensors in the Burn deep learning framework. While scalar values can be encoded as rank-1 tensors, the main issue is the lack of an automatic broadcasting API on stable Rust, due to limitations of const generics.
As a better long-term solution, we proposed introducing a new `Scalar` type: an enum that can hold either a native value (e.g., `f32`) or a rank-1 tensor. This explicit `Scalar` type would provide more safety and avoid unnecessary broadcast operations. It would also be beneficial for exporting to other formats like ONNX, since all operations can be tracked in a computation graph.
We plan to modify the `burn_tensor` module to include this `Scalar` type, with variants like `Scalar<Int>`, `Scalar<Float>`, and `Scalar<Bool>`. This change would not introduce any breaking changes to the existing API.
Overall, while the naming and exact implementation details still need to be finalized, we agreed that introducing a dedicated `Scalar` type is a good way to handle scalar values properly in the Burn framework.
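To make the idea concrete, here is a minimal sketch of what such a `Scalar` type could look like. Everything here is hypothetical, not a finalized API: the `DeviceTensor` placeholder stands in for a real backend tensor handle, only a float variant is shown, and the `resolve` method name is invented for illustration:

```rust
/// Stand-in for a real backend tensor handle holding one element.
#[derive(Debug, Clone, PartialEq)]
struct DeviceTensor {
    data: Vec<f32>, // placeholder for device memory
}

/// Hypothetical sketch of the proposed `Scalar` type: either a
/// native host-side value or a single-element tensor on a device.
#[derive(Debug, Clone, PartialEq)]
enum Scalar {
    /// A plain host-side value; no device round-trip needed.
    Native(f32),
    /// A single-element tensor kept on the device.
    Tensor(DeviceTensor),
}

impl Scalar {
    /// Read the value out, copying from the "device" only when necessary.
    fn resolve(&self) -> f32 {
        match self {
            Scalar::Native(v) => *v,
            Scalar::Tensor(t) => t.data[0],
        }
    }
}

impl From<f32> for Scalar {
    fn from(v: f32) -> Self {
        Scalar::Native(v)
    }
}

fn main() {
    let a: Scalar = 2.5_f32.into();
    let b = Scalar::Tensor(DeviceTensor { data: vec![4.0] });
    assert_eq!(a.resolve() + b.resolve(), 6.5);
}
```

The point of the enum is that operations taking a `Scalar` can accept a native value without forcing a device transfer, while a scalar produced on the device (e.g., a loss) can stay there until a host value is actually needed.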
Motivation
Currently, the Burn deep learning framework in Rust lacks support for 0-dimensional tensors (scalars). Adding support for 0-dimensional tensors would enhance the framework's capabilities and provide several benefits:
Completeness: Supporting 0-dimensional tensors would make Burn more complete and consistent with other deep learning frameworks that already support scalars. Notably, ONNX (Open Neural Network Exchange) format, widely used for interoperability between frameworks, often deals with 0-dimensional tensors.
Simplified Operations: Many deep learning operations, such as loss functions and regularization terms, often involve scalar values. Loss functions, in particular, are crucial for training models and are typically represented as 0-dimensional tensors. Having native support for 0-dimensional tensors would simplify the implementation and usage of such operations, making it easier to compute and optimize losses during training.
Interoperability: Seamless integration with other libraries and frameworks that utilize 0-dimensional tensors would be improved, enabling smoother interoperability and data exchange. This is particularly important when working with ONNX models that frequently incorporate 0-dimensional tensors.
Reduced Workarounds: Without 0-dimensional tensor support, users may need to resort to workarounds like using 1-dimensional tensors with a single element, which can be both less intuitive and less efficient.
Avoiding Unnecessary Data Copying: By supporting 0-dimensional tensors directly, Burn can avoid unnecessary data copying from the device (e.g., GPU) to the host (CPU) and vice versa. This can lead to improved performance and reduced memory overhead, especially when dealing with large-scale models and datasets.
Proposed Solution
To address this limitation, we propose the following:
Modify the `Tensor` struct in Burn to support 0-dimensional tensors.
Benefits
By implementing support for 0-dimensional tensors, Burn will:
Potential Challenges
Next Steps
We believe that adding support for 0-dimensional tensors will significantly enhance the capabilities and usability of the Burn deep learning framework in Rust, particularly in the context of loss computation and ONNX interoperability. We look forward to feedback and collaboration from the community to make this feature a reality.