tracel-ai / burn

Burn is a new comprehensive dynamic Deep Learning Framework built using Rust with extreme flexibility, compute efficiency and portability as its primary goals.
https://burn.dev
Apache License 2.0

Support for 0-Dimensional Tensors in Burn #1689

Open · antimora opened this issue 2 months ago

antimora commented 2 months ago

Motivation

Currently, the Burn deep learning framework in Rust lacks support for 0-dimensional tensors (scalars). Adding support for 0-dimensional tensors would enhance the framework's capabilities and provide several benefits:

  1. Completeness: Supporting 0-dimensional tensors would make Burn more complete and consistent with other deep learning frameworks that already support scalars. Notably, the ONNX (Open Neural Network Exchange) format, widely used for interoperability between frameworks, frequently represents values as 0-dimensional tensors.

  2. Simplified Operations: Many deep learning operations, such as loss functions and regularization terms, produce scalar values. Loss functions in particular are central to training and are typically represented as 0-dimensional tensors. Native support would simplify the implementation and usage of such operations, making it easier to compute and optimize losses during training.

  3. Interoperability: Seamless integration with other libraries and frameworks that utilize 0-dimensional tensors would be improved, enabling smoother interoperability and data exchange. This is particularly important when working with ONNX models that frequently incorporate 0-dimensional tensors.

  4. Reduced Workarounds: Without 0-dimensional tensor support, users must resort to workarounds such as 1-dimensional tensors with a single element, which are less intuitive and less efficient (see the sketch after this list).

  5. Avoiding Unnecessary Data Copying: By supporting 0-dimensional tensors directly, Burn can avoid unnecessary data copying from the device (e.g., GPU) to the host (CPU) and vice versa. This can lead to improved performance and reduced memory overhead, especially when dealing with large-scale models and datasets.
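For concreteness, here is a minimal sketch of the rank-1 workaround described in item 4, assuming a burn 0.13-style API (the loss value is a stand-in):

```rust
use burn::tensor::{backend::Backend, ElementConversion, Tensor};

// Today's workaround: encode a scalar (e.g. a loss) as a rank-1
// tensor holding a single element.
fn loss_as_rank1<B: Backend>(device: &B::Device) -> Tensor<B, 1> {
    Tensor::<B, 1>::from_floats([0.25], device)
}

// Reading the value back forces a device-to-host sync.
fn read_loss<B: Backend>(loss: Tensor<B, 1>) -> f32 {
    loss.into_scalar().elem()
}
```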

Proposed Solution

To address this limitation, we propose the following:

  1. Extend the Tensor struct in Burn to support 0-dimensional tensors.
  2. Implement necessary methods and traits for creating, manipulating, and operating on 0-dimensional tensors.
  3. Update relevant functions and operations to handle 0-dimensional tensors correctly, with a focus on loss computation and optimization.
  4. Ensure proper broadcasting and type promotion rules are followed when mixing 0-dimensional tensors with higher-dimensional tensors (see the sketch after this list).
  5. Add comprehensive unit tests to verify the correctness and consistency of 0-dimensional tensor support, including tests specifically related to loss functions.
  6. Update the documentation and examples to showcase the usage of 0-dimensional tensors, particularly in the context of loss computation and ONNX interoperability.
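Item 4 could behave as in the following hypothetical sketch; `Tensor<B, 0>` and `from_float` are assumptions about the proposed API, not current Burn:

```rust
// Hypothetical API sketch: none of this compiles against current Burn.
fn scale<B: Backend>(device: &B::Device) -> Tensor<B, 2> {
    let scalar: Tensor<B, 0> = Tensor::from_float(2.0, device);
    let matrix: Tensor<B, 2> = Tensor::ones([3, 4], device);
    // NumPy/PyTorch-style rule: a 0-d tensor broadcasts against any
    // shape, so the product keeps the shape [3, 4].
    matrix * scalar
}
```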

Benefits

By implementing support for 0-dimensional tensors, Burn will:

Potential Challenges

Next Steps

  1. Discuss and refine the proposed solution with the Burn community.
  2. Create a detailed implementation plan and allocate resources.
  3. Implement the necessary changes and additions to support 0-dimensional tensors, with a focus on loss computation and ONNX compatibility.
  4. Conduct thorough testing and address any issues or edge cases, including tests for loss functions and ONNX interoperability.
  5. Update the documentation and examples, highlighting the usage of 0-dimensional tensors in loss computation and ONNX scenarios.
  6. Release a new version of Burn with 0-dimensional tensor support.

We believe that adding support for 0-dimensional tensors will significantly enhance the capabilities and usability of the Burn deep learning framework in Rust, particularly in the context of loss computation and ONNX interoperability. We look forward to feedback and collaboration from the community to make this feature a reality.

antimora commented 2 months ago

CC @nathanielsimard , @laggui , @louisfd

antimora commented 2 months ago

@LaurentMazare has confirmed Candle supports 0D tensors:

Zermelo Fraenkel: Scalar values (tensors with 0 dimensions) should be supported. Empty tensors (multiple dimensions, but with one of them being zero) should also be supported, but only to some extent. Certainly interested if you find places where this doesn't work properly.

PyTorch supports 0D tensors
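For comparison, a minimal Candle sketch (assuming the current candle_core API; the literal value is arbitrary):

```rust
use candle_core::{Device, Tensor};

fn main() -> candle_core::Result<()> {
    // Passing a plain f32 yields a rank-0 (scalar) tensor.
    let t = Tensor::new(3.5f32, &Device::Cpu)?;
    assert_eq!(t.rank(), 0);
    assert_eq!(t.to_scalar::<f32>()?, 3.5);
    Ok(())
}
```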

antimora commented 2 months ago

cc @ashdtu

antimora commented 2 months ago

@laggui found that ndarray supports 0-dimensional arrays: https://docs.rs/ndarray/latest/ndarray/type.Array0.html
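A short sketch using ndarray's `arr0` helper, which builds an `Array0`:

```rust
use ndarray::{arr0, Array0};

fn main() {
    // A 0-dimensional array holding a single f64.
    let scalar: Array0<f64> = arr0(3.5);
    assert_eq!(scalar.ndim(), 0);
    // `into_scalar` consumes the array and returns the element.
    assert_eq!(scalar.into_scalar(), 3.5);
}
```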

antimora commented 2 months ago

@nathanielsimard and I had an offline conversation.

Here's a summary of the conversation for others:

We discussed the need to support scalar tensors in the Burn deep learning framework. While scalar values can be encoded as rank-1 tensors, the main issue is the lack of an automatic broadcasting API on stable Rust, due to limitations of const generics.
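To illustrate the limitation (an illustrative snippet, not code from Burn): computing the output rank of a broadcasted operation requires const-generic expressions, which are unstable:

```rust
// Does not compile on stable Rust: the return type needs the unstable
// `generic_const_exprs` feature to compute the output rank from D1 and D2.
fn broadcast_add<B: Backend, const D1: usize, const D2: usize>(
    lhs: Tensor<B, D1>,
    rhs: Tensor<B, D2>,
) -> Tensor<B, { if D1 > D2 { D1 } else { D2 } }> {
    unimplemented!()
}
```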

As a better long-term solution, we proposed introducing a new Scalar type: an enum that can hold either a native value (e.g., f32) or a rank-1 tensor. This explicit Scalar type would provide more type safety and avoid unnecessary broadcast operations. It would also be beneficial for exporting to other formats like ONNX, since all operations can be tracked in the computation graph.

We plan to modify the burn_tensor module to include this Scalar type, with variants like Scalar<Int>, Scalar<Float>, and Scalar<Bool>. This change would not introduce any breaking changes to the existing API.

Overall, while the naming and exact implementation details still need to be finalized, we agreed that introducing a dedicated Scalar type is a good idea to handle scalar values properly in the Burn framework.
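A rough sketch of what such a type could look like; names, variants, and bounds are illustrative only, not a committed design:

```rust
use burn::tensor::{backend::Backend, ElementConversion, Tensor};

/// Illustrative only: a scalar that is either a native host value or a
/// single-element rank-1 tensor that may still live on the device.
pub enum Scalar<B: Backend> {
    /// Plain host-side value, cheap to pass around.
    Native(f32),
    /// Device-side value that stays in the graph (useful for ONNX export).
    Device(Tensor<B, 1>),
}

impl<B: Backend> Scalar<B> {
    /// Read the value on the host, syncing from the device if needed.
    pub fn into_f32(self) -> f32 {
        match self {
            Scalar::Native(value) => value,
            Scalar::Device(tensor) => tensor.into_scalar().elem(),
        }
    }
}
```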