facebookincubator / velox

A C++ vectorized database acceleration library aimed at optimizing query engines and data processing systems.
https://velox-lib.io/
Apache License 2.0

Velox on GPU on CI #8842

Open luhenry opened 6 months ago

luhenry commented 6 months ago

Description

Hello,

Velox is experimenting with accelerating parts of the workload on GPU with Wave [1]. When trying to build it locally, I ran into some compilation issues with the latest CUDA toolchain (12.3.1). The support is still experimental, but I have great interest in this work and would like to support it however I can. I propose adding a build, and possibly tests, on CI to ensure that this doesn't regress in the future.

The building part is fairly easy, as it would consist of 1. adding the CUDA toolchain to the CircleCI docker image [2] and 2. adding a configuration that compiles with VELOX_ENABLE_GPU=ON [3].
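As a rough sketch of those two steps (the CUDA package name and the exact CMake invocation are assumptions, not the actual image setup):

```sh
# 1. In the CI image: install the CUDA toolchain (package name assumed;
#    requires NVIDIA's apt repository to already be configured in the image).
apt-get update && apt-get install -y cuda-toolkit-12-3

# 2. Add a build configuration that enables GPU support.
cmake -B _build -DVELOX_ENABLE_GPU=ON -DCMAKE_BUILD_TYPE=Release
cmake --build _build -j
```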

The testing part is possible on CircleCI with their GPU runners [4]. For GitHub Actions, access to GPU runners is currently on a sign-up basis [5] with a waitlist.

There is also the question of how contributors should build locally when they don't have a GPU available. @assignUser has suggested [6] adding a VELOX_BUILD_CUDA_TESTS or VELOX_RUN_CUDA_TESTS CMake variable.
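To make that suggestion concrete, a hypothetical sketch of how such a variable could be wired (only the variable names come from this discussion; the option wiring and the test directory path are illustrative):

```cmake
# Build CUDA tests only when explicitly requested, so contributors without a
# GPU (or without the CUDA toolchain installed) can still build everything else.
option(VELOX_BUILD_CUDA_TESTS "Build tests that require the CUDA toolchain" OFF)

if(VELOX_ENABLE_GPU AND VELOX_BUILD_CUDA_TESTS)
  add_subdirectory(velox/experimental/wave/exec/tests)  # illustrative path
endif()
```

A VELOX_RUN_CUDA_TESTS variant could instead build the tests everywhere and only skip executing them when no GPU is present.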

Current contributions:

pedroerp commented 6 months ago

Cc: @assignUser @kgpai @majetideepak @Yuhta @mbasmanova

pedroerp commented 6 months ago

Thanks for looking into this!

The testing part is possible on CircleCI with their GPU runners [4]. For GitHub Actions, access to GPU runners is currently on a sign-up basis [5] with a waitlist.

We are slowly moving away from CircleCI to GitHub Actions, so we will need to check how we can get GPU runners there.

Cc: @kgpai @assignUser

assignUser commented 6 months ago

There are ways to add self-hosted CUDA-capable runners to a repo, but those require a fair amount of infra work (an AWS k8s cluster, etc.), so I'm not sure that effort (and cost) is justified for (correct me if I am wrong) a currently niche, experimental feature.

If there is enough interest to do this, a good first step might be to ask the Meta OSS team if they have a way to get access to the wait-listed runners?

assignUser commented 6 months ago

Oh, I did not see @kgpai's comment in the PR, looks like we have something available already!

We have 4-core-ubuntu-gpu-t4 as GPU supported runners in GHA
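For reference, a GitHub Actions job could target that runner along these lines (only the runs-on label comes from this thread; the steps and flags are illustrative):

```yaml
# Hypothetical job excerpt; only the runner label is taken from the comment above.
cuda-tests:
  runs-on: 4-core-ubuntu-gpu-t4
  steps:
    - uses: actions/checkout@v4
    - name: Build with GPU support
      run: |
        cmake -B _build -DVELOX_ENABLE_GPU=ON -DCMAKE_BUILD_TYPE=Release
        cmake --build _build -j
    - name: Run CUDA tests
      # Test selection is illustrative; actual targets/labels may differ.
      run: ctest --test-dir _build --output-on-failure
```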

mbasmanova commented 6 months ago

@luhenry It would be nice to create a README or similar for folks who'd like to get started on Wave.

I also found these resources very helpful:

A short introductory blog post: https://developer.nvidia.com/blog/even-easier-introduction-cuda/

CS344: Intro to Parallel Programming course (well structured, entertaining, and easy to follow; my 7-year-old daughter particularly enjoyed the story about digging a hole to China): https://developer.nvidia.com/udacity-cs344-intro-parallel-programming