bigcode-project / bigcode-evaluation-harness

A framework for the evaluation of autoregressive code generation language models.
Apache License 2.0

why change n_copies from 1 to 2? #209

Open Reeleon opened 3 months ago

Reeleon commented 3 months ago

【Related Code】 bigcode_eval/utils.py

class TokenizedDataset(IterableDataset):
    def __iter__(self):
        ...
        if self.n_copies == 1 and self.n_tasks % self.num_devices != 0:
            self.n_copies = 2
            warnings.warn(
                "n_copies (n_samples/batch_size) was changed from 1 to 2 because n_tasks isn't proportional to num devices"
            )
        ...

【My Setting】

n_tasks = 164  # humaneval
num_devices = 5
n_samples = 1
batch_size = 1

【Results】 I get the UserWarning "n_copies (n_samples/batch_size) was changed from 1 to 2 because n_tasks isn't proportional to num devices". The harness then generated 2 samples for 163 tasks and 4 samples for 1 task, before removing the extra predictions to keep only n_samples = 1.

Why not keep n_copies = 1 and generate 1 sample for 163 tasks and 2 samples for 1 task? That gives 1 * 163 + 2 * 1 = 165 prompts in total, and 165 % num_devices == 0, so the divisibility requirement would still be met.
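
To make the arithmetic concrete, here is a small sketch of the counts I am comparing. It is only my own calculation, not the harness's actual padding logic: `padded_total` is a hypothetical helper, and it assumes the total number of prompts is simply rounded up to the next multiple of `num_devices`.

```python
from typing import Tuple


def padded_total(n_tasks: int, n_copies: int, num_devices: int) -> Tuple[int, int]:
    """Hypothetical helper: round n_tasks * n_copies up to a multiple of num_devices.

    Returns (total prompts after padding, number of extra prompts added).
    """
    total = n_tasks * n_copies
    # Smallest non-negative number of extra prompts that makes the total divisible.
    extra = (-total) % num_devices
    return total + extra, extra


n_tasks, num_devices = 164, 5  # values from my setting above

for n_copies in (1, 2):
    total, extra = padded_total(n_tasks, n_copies, num_devices)
    print(f"n_copies={n_copies}: {total} prompts in total, {extra} extra")
```

Under that assumption, n_copies = 1 needs 1 extra prompt (165 in total), while n_copies = 2 needs 2 extra prompts (330 in total), so keeping n_copies = 1 would run half as many generations.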