lm-sys / FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Apache License 2.0

How can we restrict Vicuna so it generates text in the expected format that I want? #516

Open alan-ai-learner opened 1 year ago

alan-ai-learner commented 1 year ago

Hi @infwinston @Mearman @zhisbug @jegonzal @Shawnlu25, I'm trying to generate meeting minutes with vicuna-13b from a chunk of my meeting transcript (due to context size restrictions I'm creating chunks of the transcript and passing them one by one). Here is the format I expect, and Vicuna did generate it:


Topics Discussed:

* Language Model and Meeting Summary Generation
* Input and Output Sizes of Language Model
* Fine-Tuning Data Sets
* Average Size of Meeting Minutes

Meeting Summary:

XX and YY discussed the Language Model and Meeting Summary Generation. XX asked YY to explain the input and output sizes of the Language Model and the reason for restricting the output to 200 tokens.

Action Points:

1. XX will show Alankar the current size of the minutes that have been generated.
2. YY will generate a summary for the internal team meetings that were attended by Sarma.

But this behaviour changes when I pass the next chunk of the transcript, and so on...

  1. Is it possible to restrict Vicuna to generate the minutes only in this format?
  2. Can fine-tuning help?
  3. Is the max context length 2048, and does it work like GPT-3 (where the context length, 4096 in GPT-3, covers prompt + output)? If you can clarify that, it would be great. Any help would be appreciated, thanks!
biosfood commented 1 year ago

Have you tried running the model multiple times but just querying one specific part of the format and then combining them afterwards? Example:

  1. Query: Create a list of the topics discussed in the meeting; push that to your output file
  2. Query: Write a meeting summary; push that to the file
  3. Query: Create an ordered list of the action points that were settled during the meeting; push that to your file

I haven't tested these prompts, so you might have to do some more testing.
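Something like this minimal sketch, assuming you serve the model behind FastChat's OpenAI-compatible API server on localhost:8000 (the prompts, model name, and port are placeholders I haven't tested):

```python
import requests

API_URL = "http://localhost:8000/v1/chat/completions"  # assumed FastChat OpenAI-compatible server
MODEL = "vicuna-13b"  # whatever name your model worker registered

# One focused prompt per section, instead of asking for everything at once.
PROMPTS = {
    "Topics Discussed": "List the topics discussed in the meeting below as bullet points. "
                        "Only write bullet points, nothing else.",
    "Meeting Summary": "Write a short summary of the meeting below as a single paragraph.",
    "Action Points": "Write a numbered list of the action points settled in the meeting below. "
                     "Only write the numbered list, nothing else.",
}

def query(instruction: str, chunk: str) -> str:
    resp = requests.post(API_URL, json={
        "model": MODEL,
        "messages": [{"role": "user", "content": f"{instruction}\n\n{chunk}"}],
        "max_tokens": 256,
        "temperature": 0.2,  # keep it low for more consistent formatting
    })
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"].strip()

def minutes_for_chunk(chunk: str) -> str:
    # Run the three small queries separately, then stitch the sections together.
    return "\n\n".join(f"{title}:\n\n{query(prompt, chunk)}"
                       for title, prompt in PROMPTS.items())
```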

I would guess that a language model will be better at completing your smaller tasks if it doesn't have to 'keep track' of all of your requirements, and it will give more consistent output. You might have to add something like "only write bullet points" to prevent a short preamble from being generated before your desired output.

As far as I can tell from the code, the context length is the maximum number of tokens the model sees, and both the prompt and past answers are put into it, so it's prompt + output as with GPT-3, just shorter.
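If that's right, one way to stay inside the 2048-token window is to count the prompt's tokens first and cap the generation length accordingly. A rough sketch with the Hugging Face tokenizer (the model path is a placeholder):

```python
from transformers import AutoTokenizer

MAX_CONTEXT = 2048  # Vicuna's window, shared by prompt and output

tokenizer = AutoTokenizer.from_pretrained("/path/to/vicuna-13b")  # placeholder path

def max_new_tokens(prompt: str, reserve: int = 0) -> int:
    """Tokens left for the answer once the prompt fills part of the window."""
    prompt_tokens = len(tokenizer(prompt).input_ids)
    return max(0, MAX_CONTEXT - prompt_tokens - reserve)
```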

alan-ai-learner commented 1 year ago

@biosfood thank you so much for answering! I'll try this and let you know how it goes.

alan-ai-learner commented 1 year ago
> Have you tried running the model multiple times but just querying one specific part of the format and then combining them afterwards?

@biosfood, as my transcripts are longer than the max context length, I created chunks of the transcripts. I'm trying to generate all three things (topics, summary, action points) for each chunk at once, and at the end I'm combining them.
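Here is roughly how I'm doing the chunking, for reference — a sketch with the Hugging Face tokenizer (the path and token budget are placeholders):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("/path/to/vicuna-13b")  # placeholder path

def chunk_transcript(transcript: str, budget: int = 1500) -> list[str]:
    """Split the transcript into pieces of at most `budget` tokens, leaving
    the rest of the 2048-token window for the instruction and the output."""
    ids = tokenizer(transcript).input_ids
    return [tokenizer.decode(ids[i:i + budget], skip_special_tokens=True)
            for i in range(0, len(ids), budget)]
```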

Working on your suggestion.