replicate / cog-llama-template

LLaMA Cog template
Apache License 2.0

Stop generation at `'\nUser'` #1

Closed · joehoover closed 1 year ago

joehoover commented 1 year ago

This PR modifies `predict.py` with a hotfix that stops text generation when the model emits the token sequence `'\nUser'` (a minimal sketch of the idea follows the example below).

Without this modification, chat models may output multiple dialogue turns. For example, given a prompt like:

```
User: What is 4+4?
Assistant:
```

It's possible for a chat model to return a sequence like:

```
8.
User: What is 3 + 3?
Assistant: 6
```
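At a high level, the fix amounts to watching the streamed output for the stop string and truncating once it appears. Here's a minimal sketch of that idea, assuming a token-text iterator; the names are illustrative, not the actual `predict.py` generation loop:

```python
STOP_SEQUENCE = "\nUser"

def stream_until_stop(token_texts, stop=STOP_SEQUENCE):
    """Yield generated text chunks, halting once `stop` appears.

    Buffers the trailing len(stop) - 1 characters so a stop sequence
    split across token boundaries is still caught.
    """
    buffer = ""
    for text in token_texts:
        buffer += text
        idx = buffer.find(stop)
        if idx != -1:
            if idx:
                yield buffer[:idx]  # emit text up to the stop sequence
            return
        # Everything except the last len(stop) - 1 chars cannot be the
        # start of a split stop sequence, so it is safe to emit.
        safe = len(buffer) - (len(stop) - 1)
        if safe > 0:
            yield buffer[:safe]
            buffer = buffer[safe:]
    if buffer:
        yield buffer

# For the example above, the extra dialogue turn is dropped:
chunks = ["8", ".", "\n", "User", ": What is 3 + 3?", "\n", "Assistant", ": 6"]
assert "".join(stream_until_stop(chunks)) == "8."
```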

In general, users want chat models to be restricted to single dialogue turns, though it's worth noting that there are use cases for generating multiple dialogue turns, such as training data generation.

Accordingly, it would be better to implement full support for user-specified multi-token stop sequences. For now, though, I think this implementation will suffice.
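For reference, one possible shape for that fuller support is the Hugging Face `transformers` `StoppingCriteria` hook. This is a sketch under assumed names (`model`, `tokenizer`, `prompt`), not code from this PR:

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnSequences(StoppingCriteria):
    """Stop generation once any user-specified stop string appears."""

    def __init__(self, tokenizer, stop_sequences, prompt_len):
        self.tokenizer = tokenizer
        self.stop_sequences = stop_sequences
        self.prompt_len = prompt_len  # skip the prompt when checking

    def __call__(self, input_ids: torch.LongTensor, scores, **kwargs) -> bool:
        # Decode only the newly generated tokens and look for stop strings.
        generated = self.tokenizer.decode(input_ids[0, self.prompt_len:])
        return any(s in generated for s in self.stop_sequences)

# Hypothetical usage:
# inputs = tokenizer(prompt, return_tensors="pt").input_ids
# criteria = StoppingCriteriaList(
#     [StopOnSequences(tokenizer, ["\nUser"], prompt_len=inputs.shape[1])]
# )
# output = model.generate(inputs, stopping_criteria=criteria, max_new_tokens=256)
```

Note that the decoded output would still need the trailing stop string trimmed, since the criteria fires only after the sequence has already been emitted.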