CloudSnorkel / cdk-github-runners

CDK constructs for self-hosted GitHub Actions runners
https://constructs.dev/packages/@cloudsnorkel/cdk-github-runners/
Apache License 2.0

GPU support #54

Open kichik opened 2 years ago

wesk commented 1 year ago

Does GPU support exist today for any of the providers (e.g. EC2, ECS)?

kichik commented 1 year ago

This worked for me on EC2 with the default Ubuntu image:

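import * as ec2 from 'aws-cdk-lib/aws-ec2';
import { Ec2RunnerProvider, GitHubRunners, RunnerImageComponent } from '@cloudsnorkel/cdk-github-runners';

// NVIDIA driver branch to install
const version = '530';
const gpuImageBuilder = Ec2RunnerProvider.imageBuilder(this, 'GPU Image Builder');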
gpuImageBuilder.addComponent(RunnerImageComponent.custom({
  name: 'nvidia-drivers',
  commands: [
    'export DEBIAN_FRONTEND=noninteractive',
    `apt-get install -y linux-modules-nvidia-${version}-aws nvidia-headless-${version}`,
    `apt-get install -y nvidia-utils-${version}-server || apt-get install -y nvidia-utils-${version}`,
  ],
}));

new GitHubRunners(this, 'runners', {
  providers: [
    new Ec2RunnerProvider(this, 'EC2 Linux GPU', {
      labels: ['ec2', 'linux', 'gpu'],
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.G4DN, ec2.InstanceSize.XLARGE),
      imageBuilder: gpuImageBuilder,
    }),
  ],
});

For Amazon Linux 2:

const version = '530';
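// The yum commands below need an Amazon Linux 2 image; the builder defaults to Ubuntu
const gpuImageBuilder = Ec2RunnerProvider.imageBuilder(this, 'GPU Image Builder', {
  os: Os.LINUX_AMAZON_2,
});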
gpuImageBuilder.addComponent(RunnerImageComponent.custom({
  name: 'nvidia-drivers',
  commands: [
    'curl -o /etc/yum.repos.d/cuda-rhel7.repo http://developer.download.nvidia.com/compute/cuda/repos/rhel7/$( /bin/arch )/cuda-rhel7.repo',
    `yum install -y nvidia-driver-branch-${version}`,
    `amazon-linux-extras install -y kernel-ngct${version}`,
  ],
}));
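
If you need both variants, the two command lists above could be generated from one small helper. This is only a sketch: `nvidiaDriverCommands` and the `GpuOs` type are hypothetical names that are not part of cdk-github-runners, and the commands are copied unchanged from the snippets above. The numeric check on the driver branch is an added assumption, there to keep anything unexpected out of the shell commands.

```typescript
// Hypothetical helper, not part of cdk-github-runners.
// Returns the driver install commands from the snippets above
// for a given OS family and NVIDIA driver branch.
type GpuOs = 'ubuntu' | 'amazon-linux-2';

function nvidiaDriverCommands(os: GpuOs, version: string): string[] {
  // Assumption: driver branches are purely numeric (e.g. "530").
  // Rejecting anything else avoids interpolating arbitrary text into a shell command.
  if (!/^\d+$/.test(version)) {
    throw new Error(`Invalid NVIDIA driver branch: ${version}`);
  }

  if (os === 'ubuntu') {
    return [
      'export DEBIAN_FRONTEND=noninteractive',
      `apt-get install -y linux-modules-nvidia-${version}-aws nvidia-headless-${version}`,
      `apt-get install -y nvidia-utils-${version}-server || apt-get install -y nvidia-utils-${version}`,
    ];
  }

  // Amazon Linux 2
  return [
    'curl -o /etc/yum.repos.d/cuda-rhel7.repo http://developer.download.nvidia.com/compute/cuda/repos/rhel7/$( /bin/arch )/cuda-rhel7.repo',
    `yum install -y nvidia-driver-branch-${version}`,
    `amazon-linux-extras install -y kernel-ngct${version}`,
  ];
}
```

The returned array can be passed as `commands` to `RunnerImageComponent.custom(...)`, for example `commands: nvidiaDriverCommands('ubuntu', version)`.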

Or you can start from a pre-configured AMI, such as the ECS-optimized GPU AMI, which already includes the NVIDIA drivers:

import * as ecs from 'aws-cdk-lib/aws-ecs';
import { Os } from '@cloudsnorkel/cdk-github-runners';

const gpuImageBuilder = Ec2RunnerProvider.imageBuilder(this, 'GPU Image Builder', {
  baseAmi: ecs.EcsOptimizedImage.amazonLinux2(ecs.AmiHardwareType.GPU).getImage(this).imageId,
  os: Os.LINUX_AMAZON_2,
});

new GitHubRunners(this, 'runners', {
  providers: [
    new Ec2RunnerProvider(this, 'EC2 Linux GPU', {
      labels: ['ec2', 'linux', 'gpu'],
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.G4DN, ec2.InstanceSize.XLARGE),
      imageBuilder: gpuImageBuilder,
    }),
  ],
});

Other providers would require code changes or CDK escape hatches, because they need settings that are not currently exposed.