[Enhancement] Addition of Petals model

What do you think of the idea of adding a model to support the Petals network?

Petals is a BitTorrent-like network in which each participant hosts a smaller or larger part of a very large language model, such as Llama2-70B, Llama1-65B, or Bloom (they recently added support for StableBeluga2 as well), and makes it available to other users on the network for inference or fine-tuning at decent speed.

According to the project description, it lets users implement their own sampling methods, so I think integrating it should not be too much of a problem.
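To illustrate what "our own sampling method" could look like, here is a minimal sketch of a client-side top-k sampling loop. It does not use the Petals API: `next_logits` is a hypothetical callable standing in for whatever returns next-token logits from the network, and `toy_logits`, `sample_top_k`, and `generate` are names made up for this example.

```python
import math
import random
from typing import Callable, List, Sequence

# Hypothetical stand-in for a remote forward pass:
# maps the token ids generated so far to next-token logits.
LogitsFn = Callable[[Sequence[int]], List[float]]


def sample_top_k(logits: List[float], k: int, temperature: float,
                 rng: random.Random) -> int:
    """Sample one token id from the k highest-scoring logits."""
    # Indices of the k largest logits (ties keep their original order).
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    # Subtract the max before exponentiating for numerical stability.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


def generate(next_logits: LogitsFn, prompt_ids: Sequence[int],
             max_new_tokens: int, k: int = 2, temperature: float = 1.0,
             seed: int = 0) -> List[int]:
    """Autoregressive loop: the sampling decision is made entirely client-side."""
    rng = random.Random(seed)
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        ids.append(sample_top_k(next_logits(ids), k, temperature, rng))
    return ids


def toy_logits(ids: Sequence[int]) -> List[float]:
    """Toy model over a 4-token vocabulary that prefers (last id + 1) mod 4."""
    vocab = 4
    target = (ids[-1] + 1) % vocab
    return [5.0 if i == target else 0.0 for i in range(vocab)]


if __name__ == "__main__":
    # With k=1 the loop is greedy, so the toy model just counts upward mod 4.
    print(generate(toy_logits, [0], max_new_tokens=5, k=1))
```

If Petals exposes logits in a similar way, an LMQL model wrapper would only need to supply the `next_logits` part and could leave decoding and constraint masking to LMQL.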

I think this would be quite an interesting addition to LMQL.

eth-sri / lmql #141