**Before submitting a bug, please make sure the issue hasn't already been addressed by searching through the FAQs and existing/past issues.**
**Describe the bug**
<Please provide a clear and concise description of what the bug is. If relevant, include a minimal (fewest lines of code necessary) reproducible (running it gives us the same result you get) code snippet. Make sure to include the relevant imports.>
**Minimal reproducible example**
<Remember to wrap the code in ```triple-quotes blocks```>
```
# sample code to repro the bug
```
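For reference, a hypothetical well-formed minimal repro might look like the sketch below (the model id, prompt, and the line marked as failing are placeholders; adapt them to your actual setup):

```python
# Hypothetical minimal repro sketch: loads llama-2-7b-chat via Hugging Face
# transformers and generates one completion. All imports are included so the
# snippet runs as-is (requires the transformers and accelerate packages).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"  # placeholder: the model you hit the bug with
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)  # placeholder: the call where the bug appears
print(tokenizer.decode(output[0], skip_special_tokens=True))
```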
**Output**
<Remember to wrap the output in ```triple-quotes blocks```>
```
<paste stacktrace and other outputs here>
```
**Runtime Environment**
- Model: [e.g.: llama-2-7b-chat]
- Using via huggingface?: [yes/no]
- OS: [e.g.: Linux/Ubuntu, Windows]
- GPU VRAM:
- Number of GPUs:
- GPU Make: [e.g.: Nvidia, AMD, Intel]
**Additional context**
Add any other context about the problem or environment here.