uclaml / SPPO

The official implementation of Self-Play Preference Optimization (SPPO)
https://uclaml.github.io/SPPO/
Apache License 2.0
498 stars, 62 forks

Flexible GPU specification in generate.sh #19

Closed: xiaohangt closed this 4 months ago

xiaohangt commented 4 months ago

Problem

Data generation currently requires GPU ids that start from 0 and are consecutive (i.e. CUDA devices 0..7). Specifying non-consecutive devices is necessary on shared public clusters. Simply running export CUDA_VISIBLE_DEVICES=... did not work.

Changes

xiaohangt commented 4 months ago

It may be simpler to just specify CUDA_VISIBLE_DEVICES=... outside?

They do set the CUDA environment variable at the top: https://github.com/uclaml/SPPO/blob/main/scripts/generate.sh#L4

But during generation they reset it with fixed GPU ids here: https://github.com/uclaml/SPPO/blob/main/scripts/generate.sh#L51,

which makes the setting at the top ineffective. The fixed GPU array {0..7} there has to be changed to something we can specify. This is not trivial, since (1) many generated files are named with GPU ids, and (2) frac_len needs to be adapted to the number of devices available.
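A minimal sketch of the idea, not the actual contents of scripts/generate.sh: take a user-supplied device list, derive the per-device share from its length, and keep a separate logical shard index for file naming so that output names do not depend on the physical GPU ids. The names GPU_LIST, TOTAL_PROMPTS, FRAC_LEN and SHARD, and the default values, are assumptions made for illustration only.

```shell
#!/bin/sh
# Hypothetical sketch: GPU_LIST, TOTAL_PROMPTS and the echoed line are
# illustrative assumptions, not variables taken from scripts/generate.sh.

# Comma-separated physical GPU ids, overridable by the caller.
GPU_LIST="${GPU_LIST:-1,3,6}"
# Assumed total number of prompts to split across devices.
TOTAL_PROMPTS="${TOTAL_PROMPTS:-20000}"

# Count the devices in the list.
NUM_GPUS=0
for gpu in $(printf '%s' "$GPU_LIST" | tr ',' ' '); do
    NUM_GPUS=$((NUM_GPUS + 1))
done

# Ceiling division, so the shards together cover every prompt.
FRAC_LEN=$(( (TOTAL_PROMPTS + NUM_GPUS - 1) / NUM_GPUS ))

# Pair each physical device with a logical shard index (0..NUM_GPUS-1).
# Naming output files by SHARD rather than by GPU id keeps the file names
# consecutive even when the devices are not.
SHARD=0
for gpu in $(printf '%s' "$GPU_LIST" | tr ',' ' '); do
    # A real script would launch the generation job here; this only prints
    # the settings that job would receive.
    echo "CUDA_VISIBLE_DEVICES=$gpu shard=$SHARD frac_len=$FRAC_LEN"
    SHARD=$((SHARD + 1))
done
```

Under these assumptions, something like GPU_LIST=0,2,5,7 sh sketch.sh would cover four non-consecutive devices, with the per-device share recomputed from the list length instead of being fixed for eight GPUs.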