xtensor-stack / xtensor

C++ tensors with broadcasting and lazy computing

[integration] Machine learning with xtensor via integration with Flashlight #2533


jacobkahn commented 2 years ago

Hi xtensor community,

I maintain Flashlight, a lightweight machine learning library used at Meta AI and elsewhere. Part of our goal with Flashlight is to spur research on, and consolidation of, approaches to tensor computation in machine learning. We have a number of active research areas in tensor computation that use the Flashlight Tensor framework (think: an unopinionated API for adding tensor backends) as a starting point. Flashlight's default tensor implementation is currently ArrayFire on GPU, but we're looking for a robust CPU backend that can be used alongside it.
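As a rough illustration of that pluggable-backend idea, a minimal sketch might look like the following; the names here are placeholders for illustration, not Flashlight's actual interfaces under flashlight/fl/tensor/:

```cpp
// Minimal sketch of a pluggable tensor-backend API (placeholder names).
struct Tensor {
  // Backend-owned storage elided in this sketch.
};

class TensorBackend {
 public:
  virtual ~TensorBackend() = default;
  // Every backend (ArrayFire on GPU, xtensor on CPU, ...) implements the
  // same NumPy-like surface, so callers never touch a concrete library.
  virtual Tensor add(const Tensor& lhs, const Tensor& rhs) = 0;
  virtual Tensor matmul(const Tensor& lhs, const Tensor& rhs) = 0;
};
```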

xtensor integration into Flashlight would be powerful. It would:

  1. Bring xtensor to the machine learning setting (especially deep learning), enabling new applications and drawing attention to the project!
  2. Show off xtensor's performance on deep learning workloads for both training and inference
  3. Enable high-performance CPU implementations of tensor operations in Flashlight!
  4. Profile and study performance on machine/deep learning workloads to identify potential upstream improvements in xtensor

Integration should be quite easy given that the Flashlight Tensor API is deliberately structured to closely mirror NumPy's. The documentation has more detail, but adding an xtensor backend would involve:

  1. Copying the StubBackend and StubTensor as starting points into a new backend directory (flashlight/fl/tensor/backend/xtensor)
  2. Adding build support for finding/including xtensor (i.e., via CMake)
  3. Shimming xtensor ops (see the sketch after this list)
  4. Testing/profiting!
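To make step 3 concrete, a hypothetical shim for a couple of ops could look like the following; XtensorTensor and the free-function signatures are placeholders for illustration, not the actual Flashlight backend API (a real backend would fill in the StubBackend/StubTensor copies from step 1 instead):

```cpp
#include <xtensor/xarray.hpp>
#include <xtensor/xmath.hpp>

// Hypothetical wrapper type holding xtensor-backed storage.
struct XtensorTensor {
  xt::xarray<float> data;
};

XtensorTensor add(const XtensorTensor& a, const XtensorTensor& b) {
  // xtensor broadcasts NumPy-style; assigning the lazy expression
  // a.data + b.data into an xarray forces evaluation.
  return XtensorTensor{a.data + b.data};
}

XtensorTensor exp(const XtensorTensor& in) {
  // xt::exp builds a lazy element-wise expression over in.data.
  return XtensorTensor{xt::exp(in.data)};
}
```

Since xtensor expressions broadcast and evaluate lazily, many element-wise shims should reduce to one-liners like these; most of the effort would likely be in mapping shapes, dtypes, and memory layout between the two APIs.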

I work on Flashlight integrations full-time, so I'll be here to help and can make any changes needed on the Flashlight side to support this. Happy to answer questions or discuss further!

tdegeus commented 2 years ago

It sounds great! Indeed, I see a mutual benefit: Flashlight gets a well-developed CPU backend with convenient NumPy-like syntax, and xtensor benefits from benchmarking, which could lead to optimisations.

However, I wonder if this issue targets the right audience ;) : as long as I am not a (potential) user of Flashlight, I will most likely not invest time in starting the integration. The people to drive it are more likely to be found among Flashlight's (potential) users, so I would suggest opening a similar discussion on whatever platform Flashlight uses. We as xtensor users/maintainers/developers could of course help where needed: specific questions, problems, and needs would probably be addressed swiftly.