Open vigneshwaran opened 1 year ago
Hi @vigneshwaran, thanks for discovering this! Please make a pull request with your changes!
Great find! If you can open a PR, we'd love to accept it :)
I have opened PR #2428 with the proposed change. @eracah @mvpatel2000
In the test script, the batch is always expected to have a fixed shape of (batch_size, max_seq_len): https://github.com/mosaicml/composer/blob/b5ed4874be81bb69eca663bd8c733fcc870c315d/tests/datasets/test_in_context_learning_datasets.py#L591
Test error:
So we need to update the test files as well.
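With dynamic padding, the fixed-shape assertion no longer holds. A minimal sketch of what the updated expectation could look like (the helper `check_batch_shape` is hypothetical, not from the linked test file): all rows should share one padded length, and that length should be bounded by `max_seq_len` rather than equal to it.

```python
def check_batch_shape(padded_batch, max_seq_len):
    """Updated test expectation: rows share one length, bounded by max_seq_len."""
    row_lens = {len(row) for row in padded_batch}
    assert len(row_lens) == 1, "all rows must be padded to the same length"
    assert row_lens.pop() <= max_seq_len, "padded length must not exceed max_seq_len"

# A batch padded to its own longest sequence (length 3) passes even though
# max_seq_len is 8.
check_batch_shape([[1, 2, 0], [3, 4, 5]], max_seq_len=8)
```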
I have noticed that llm-foundry/eval takes much longer than lm-evaluation-harness. After digging into the code, I found that padding tokens are appended to every example up to the tokenizer's maximum length.
https://github.com/bmosaicml/composer/blob/1011f90f2653dae103c3837c968071e399b1decc/composer/datasets/in_context_learning_evaluation.py#L418C1-L428C59
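The current behavior can be illustrated with a minimal sketch (the helper `pad_to_fixed_length` is hypothetical; it only mirrors the idea in the linked code, not its actual implementation):

```python
def pad_to_fixed_length(token_ids, max_seq_len, pad_token_id=0):
    """Pad every example to the tokenizer's max length (current behavior)."""
    return [ids + [pad_token_id] * (max_seq_len - len(ids)) for ids in token_ids]

batch = [[5, 6, 7], [8, 9]]
padded = pad_to_fixed_length(batch, max_seq_len=8)
# Every row is length 8 even though the longest real sequence is length 3,
# so the model processes 8/3 ≈ 2.7x more tokens than necessary on this batch.
```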
My proposal:
Instead of padding to max_seq_len, pad only to the longest sequence in the batch.
This improved eval latency by 400% when I used a sequence length of 2048. The gain would be even larger for models trained with longer sequence lengths.
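A minimal sketch of the proposed dynamic padding (again with a hypothetical helper, `pad_to_batch_max`, to show the idea rather than the actual PR diff):

```python
def pad_to_batch_max(token_ids, pad_token_id=0):
    """Pad only to the longest sequence in the batch (proposed behavior)."""
    batch_max = max(len(ids) for ids in token_ids)
    return [ids + [pad_token_id] * (batch_max - len(ids)) for ids in token_ids]

batch = [[5, 6, 7], [8, 9]]
padded = pad_to_batch_max(batch)
# Rows are now length 3 (the batch's longest sequence) instead of max_seq_len.
```

Since attention cost grows with padded length, shrinking each batch to its own maximum directly cuts wasted compute, which is where the speedup comes from.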