RWKV / rwkv.cpp

INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
MIT License

Tutorial for python script? #122

Closed XanderTheDev closed 12 months ago

XanderTheDev commented 1 year ago

Hi, sorry. I'm a little bit of a noob, but I was wondering how to make a script in Python with this. And yes, I know there is an example, but I don't understand it. I would just like to know the script and how to use it, in a form where you can change the model, tokenizer, temperature, TOP_P, presence penalty, frequency penalty and max tokens. There should also be a simple way to give the model a prompt (via a string, like prompt = "How are you?") and then get the output back as a string, just like how I gave the prompt.

So once I had put in all the settings, I would just need to do this:

prompt = "Hi, how are you?"
output = model.calculate(prompt)

Something like that. Just something simple, because I don't understand the chat_with_bot.py script.

Sorry, I'm not that good at Python. I hope someone can help me! Thanks in advance.

saharNooby commented 1 year ago

Hi! I agree that chat_with_bot.py is somewhat complicated.

There is another script, generate_completions.py, which is only 69 lines long and, as I understand it, does exactly what you've asked: it generates a completion for a prompt. It's close to the simplest version of inference code possible.
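For orientation, here is a rough sketch of the kind of loop a completion script runs: feed the prompt tokens through the model, then repeatedly sample the next token. This is not the code from generate_completions.py and it does not use the rwkv.cpp API. The names generate, sample_logits, toy_eval, toy_encode and toy_decode are all made up for illustration, and the toy model is a stand-in so the snippet runs without a model file. The order of temperature and top-p here is one common variant and may differ from what the real scripts do.

```python
import math
import random


def sample_logits(logits, temperature=0.8, top_p=0.5, rng=random):
    """Pick a token index using temperature scaling and top-p filtering."""
    if temperature <= 0:
        # Treat zero temperature as greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the smallest set of most likely tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def generate(prompt, eval_fn, encode, decode, max_tokens=20,
             temperature=0.8, top_p=0.5,
             presence_penalty=0.2, frequency_penalty=0.2, rng=random):
    """Return a completion string for `prompt`.

    eval_fn(token, state) -> (logits, state) is a hypothetical interface
    standing in for whatever the real model wrapper provides.
    """
    tokens = encode(prompt)
    if not tokens:
        raise ValueError("prompt must contain at least one token")
    state, logits = None, None
    for token in tokens:  # process the prompt one token at a time
        logits, state = eval_fn(token, state)
    counts, out = {}, []
    for _ in range(max_tokens):
        adjusted = list(logits)
        for tok, n in counts.items():
            # Penalize tokens that were already generated.
            adjusted[tok] -= presence_penalty + frequency_penalty * n
        token = sample_logits(adjusted, temperature, top_p, rng)
        out.append(token)
        counts[token] = counts.get(token, 0) + 1
        logits, state = eval_fn(token, state)
    return decode(out)


# Toy stand-ins (assumptions, not a real model or tokenizer):
def toy_eval(token, state):
    logits = [0.0] * 256
    logits[(token + 1) % 256] = 10.0  # strongly prefer the next byte value
    return logits, state


def toy_encode(text):
    return list(text.encode("utf-8"))


def toy_decode(tokens):
    return bytes(tokens).decode("utf-8", errors="replace")
```

With a real model you would swap the three toy functions for the model's evaluation call and the tokenizer's encode/decode, and then call something like output = generate("Hi, how are you?", toy_eval, toy_encode, toy_decode).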

Answering specifically: