Mistral CPU Demo w/ multiple prompts

boris-drazic commented 11 months ago

CPU demo errors out if more than one user input is passed in.

To replicate, on branch saichand/mistral_cpu_demo, edit the file models/experimental/mistral/demo/mistral_cpu_demo.py and add more inputs to sample_context, e.g.,

    sample_context = [
        "This is a sample text for single layer execution ",
        "This is a sample text for single layer execution ",
        "This is a sample text for single layer execution ",
    ]

then run the test with pytest models/experimental/mistral/demo/mistral_cpu_demo.py::test_demo_mistral_single_layer

This is the error:

>       self.cache_k[:bsz].scatter_(dim=1, index=scatter_pos, src=xk[:, -self.sliding_window :])
E       RuntimeError: Expected index [3, 11, 8, 128] to be smaller than self [1, 4096, 8, 128] apart from dimension 1 and to be smaller size than src [3, 11, 8, 128]

models/experimental/mistral/reference/model.py:126: RuntimeError
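
For context, the shapes in the message can be checked by hand. The sketch below is not code from the repo; it only re-implements the shape rule documented for torch.Tensor.scatter_ (index must be no larger than src in every dimension, and no larger than self in every dimension except dim) in plain Python. The helper name scatter_shapes_ok is hypothetical. It assumes the cache was allocated with a batch dimension of 1, so slicing it with [:bsz] for bsz = 3 still yields a batch dimension of 1.

```python
def scatter_shapes_ok(self_shape, index_shape, src_shape, dim):
    """Hypothetical helper: mirror the documented scatter_ shape rule.

    index.size(d) <= src.size(d) for every d, and
    index.size(d) <= self.size(d) for every d != dim.
    """
    if not (len(self_shape) == len(index_shape) == len(src_shape)):
        return False
    dim = dim % len(self_shape)  # allow negative dims
    for d, (s, i, r) in enumerate(zip(self_shape, index_shape, src_shape)):
        if i > r:
            return False
        if d != dim and i > s:
            return False
    return True


# Shapes taken from the error message above.
cache_shape = (1, 4096, 8, 128)  # cache allocated for a batch of 1
bsz = 3                          # three prompts were passed in

# Slicing [:bsz] cannot grow the batch dimension beyond what was allocated.
sliced_shape = (min(bsz, cache_shape[0]),) + cache_shape[1:]

index_shape = (3, 11, 8, 128)
src_shape = (3, 11, 8, 128)

failing = scatter_shapes_ok(sliced_shape, index_shape, src_shape, dim=1)
passing = scatter_shapes_ok((3, 4096, 8, 128), index_shape, src_shape, dim=1)
print(failing, passing)
```

The first check fails on dimension 0 (index has 3 rows, the sliced cache has 1); with a cache allocated for a batch of 3 the same call satisfies the rule.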

vigneshkeerthivasanx commented 11 months ago

Previously, max_batch_size was fixed at 1, so when the number of prompts is increased to 3, max_batch_size also has to be changed to 3. We have now updated the PR (#3140): max_batch_size is added as a fixture, and the number of prompts is generated according to the batch size passed in through the fixture.
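
As a rough illustration of that idea (not the actual code in #3140, which is not shown in this thread), the prompt list and the cache shape can both be derived from one batch-size value so they cannot drift apart. The names build_prompts and cache_shape are hypothetical, and the dimension labels sliding_window, n_kv_heads and head_dim are assumptions read off the [1, 4096, 8, 128] shape in the error message. In the demo this value would come from a pytest fixture; the sketch uses plain function arguments to stay self-contained.

```python
SAMPLE_PROMPT = "This is a sample text for single layer execution "


def build_prompts(batch_size, max_batch_size):
    """Hypothetical helper: generate one prompt per batch entry."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if batch_size > max_batch_size:
        raise ValueError(
            f"batch_size {batch_size} exceeds max_batch_size {max_batch_size}"
        )
    return [SAMPLE_PROMPT] * batch_size


def cache_shape(max_batch_size, sliding_window=4096, n_kv_heads=8, head_dim=128):
    """Hypothetical helper: cache shape sized for max_batch_size.

    Default sizes are assumptions taken from the shape in the error message.
    """
    return (max_batch_size, sliding_window, n_kv_heads, head_dim)


max_batch_size = 3
prompts = build_prompts(batch_size=3, max_batch_size=max_batch_size)
shape = cache_shape(max_batch_size)
print(len(prompts), shape)
```

With this arrangement, asking for more prompts than the cache was sized for fails early with a clear message instead of surfacing later as a scatter_ shape error.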

tt-rkim commented 11 months ago

Is this ready to be closed? @boris-drazic

boris-drazic commented 11 months ago

once merged to main it will be

saichandax commented 10 months ago

Closing this ticket as PR #3140 has been merged to main. Thank you.

tenstorrent / tt-metal

Mistral CPU Demo w/ multiple prompts #3265