Closed geo47 closed 2 years ago
you can add the personas directly to the message, e.g. it would look like the following:
{"id": "partner1", "text": "hello how are you today?", "personas": ['your persona: bot_persona1', 'your persona: bot_persona2']}
Hi,
Thanks for your response.
I have a few more queries....
The problem looks like this:
Creating 2 bots, each having a different persona.
Bot1: {"id": "partner1", "text": "tell me a joke.", "label": "One time, I put strawberry jam on my burger. I thought it was ketchup!", "personas": ['your persona: I am John', 'your persona: I live in Ohio.']}
Bot2: {"id": "partner1", "text": "tell me a joke.", "label": "Let me tell you my favorite joke. Why was six afraid of seven? Because seven ate nine!", "personas": ['your persona: I am Ellie', 'your persona: I live in New York.']}
Given user input with a bot persona, e.g. 'your persona: Bot 1 persona\n user_input', the model should generate the response according to the given bot persona.
input: "your persona: I am Ellie\n I live in New York.\n tell me a joke."
output: "Let me tell you my favorite joke. Why was six afraid of seven? Because seven ate nine!"
Are there any length restrictions for defining the personas in the message you mentioned above?
For every input-output sample, do we have to add the same personas?
{"id": "partner1", "text": "hello how are you today?", "personas": ['your persona: bot_persona1', 'your persona: bot_persona2']}
Is the following JSON format correct for making the training dataset?
{
"dialog": [
{
"id": "partner1",
"text": "tell me a joke.",
"label": "One time, I put strawberry jam on my burger. I thought it was ketchup!",
"personas": [
"your persona: I am John",
"your persona: I live in Ohio."
],
"label_candidates": [
"One time, I put strawberry jam on my burger. I thought it was ketchup!",
"Let me tell you my favorite joke. Why was six afraid of seven? Because seven ate nine!"
],
"episode_done": true
},
{
"id": "partner1",
"text": "tell me a joke.",
"label": "Let me tell you my favorite joke. Why was six afraid of seven? Because seven ate nine!",
"personas": [
"your persona: I am Ellie",
"your persona: I live in New York."
],
"label_candidates": [
"One time, I put strawberry jam on my burger. I thought it was ketchup!",
"Let me tell you my favorite joke. Why was six afraid of seven? Because seven ate nine!"
],
"episode_done": true
}
]
}
Also, I have currently trained the BB2 model successfully on my custom dataset (custom_dataset.txt) along with blended_skill_talk and convai2, as follows:
parlai train_model --model transformer/generator \
--task blended_skill_talk,convai2,fromfile:parlaiformat --fromfile_datapath parlai_dataset/archi_test.txt \
--multitask-weights 1,3,3,3 --num-epochs 5 \
--init-model zoo:blender/blender_90M/model \
--dict-file zoo:blender/blender_90M/model.dict \
--embedding-size 512 --n-layers 8 --ffn-size 2048 --dropout 0.1 --n-heads 16 \
--learn-positional-embeddings True --n-positions 512 --variant xlm --activation gelu --fp16 True \
--text-truncate 512 --label-truncate 128 --dict-tokenizer bpe --dict-lower True -lr 1e-06 \
--optimizer adamax --lr-scheduler reduceonplateau --gradient-clip 0.1 -veps 0.25 --betas 0.9,0.999 \
--update-freq 1 --attention-dropout 0.0 --relu-dropout 0.0 --skip-generation True -vp 15 -stim 60 \
--vme 20000 -bs 16 -vmt ppl -vmm min --save-after-valid True --model-file /projects/ParlAI/models/odkg_model
Thanks...
Use -m/--model projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent; see e.g. a sample command in #4347 for more parameter details (however, you'll want to keep the architectural specifics you already have here if using the 90M model).
Hi @klshuster , Thanks for your response.
After adding personas to my dataset, when I train the model using the command I ran before, it does not seem to learn the persona context.
Then I replaced the model transformer/generator with projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent and added the --memory-key full_text flag, with the same settings as before:
parlai train_model --model projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
--task blended_skill_talk,convai2,fromfile:parlaiformat --fromfile_datapath parlai_dataset/archi_test.txt \
--multitask-weights 1,3,3,3 --num-epochs 5 \
--init-model zoo:blender/blender_90M/model \
--dict-file zoo:blender/blender_90M/model.dict \
--memory-key full_text \
--embedding-size 512 --n-layers 8 --ffn-size 2048 --dropout 0.1 --n-heads 16 \
--learn-positional-embeddings True --n-positions 512 --variant xlm --activation gelu --fp16 True \
--text-truncate 512 --label-truncate 128 --dict-tokenizer bpe --dict-lower True -lr 1e-06 \
--optimizer adamax --lr-scheduler reduceonplateau --gradient-clip 0.1 -veps 0.25 --betas 0.9,0.999 \
--update-freq 1 --attention-dropout 0.0 --relu-dropout 0.0 --skip-generation True -vp 15 -stim 60 \
--vme 20000 -bs 16 -vmt ppl -vmm min --save-after-valid True --model-file /projects/ParlAI/models/odkg_model
It gives the following error:
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for BlenderBot2FidModel:
Missing key(s) in state_dict: "seq2seq_encoder.embeddings.weight"
When using the command from the above issue with --init-model zoo:blenderbot2/blenderbot2_90M/model --dict-file zoo:blenderbot2/blenderbot2_90M/model.dict:
parlai train_model -dp data \
--model projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
--task fromfile:parlaiformat --fromfile_datapath dataset/archi_parlai.txt \
--num_epochs 20 \
--memory-decoder-model-file "" --memory-key full_text \
--search-query-generator-model-file zoo:blenderbot2/query_generator/model --search-query-generator-beam-min-length 2 \
--save-every-n-secs 600 --validation_every_n_secs 600 --log_every_n_secs 60 \
--init-model zoo:blenderbot2/blenderbot2_90M/model --dict-file zoo:blenderbot2/blenderbot2_90M/model.dict \
--datatype train:stream \
--embeddings-scale True --variant prelayernorm --split-lines True --learn-positional-embeddings True \
--n-layers 12 --embedding-size 1024 --ffn-size 4096 --n-heads 16 --n-decoder-layers 12 \
--dict-tokenizer gpt2 --generation-model bart \
--query-model bert_from_parlai_rag \
--rag-model-type token --rag-retriever-type search_engine --search_server None \
--dpr-model-file zoo:hallucination/bart_rag_token/model \
--gold-document-titles-key select-docs-titles --insert-gold-docs True \
--beam-min-length 5 --beam-context-block-ngram 3 --beam-block-ngram 3 --beam-block-full-context False --beam-size 3 \
--inference beam --optimizer mem_eff_adam --learningrate 1e-05 --lr-scheduler-patience 1 --model-parallel True \
--knowledge-access-method memory_only --batchsize 16 \
--truncate 512 --text-truncate 512 --label-truncate 128 \
--dropout 0.0 --attention-dropout 0.0 \
--min-doc-token-length 64 --max-doc-token-length 256 \
--fp16 True --fp16-impl mem_efficient --force-fp16-tokens True \
--model-file /projects/ParlAI/models/odkg_model
It gives the following error:
ImportError: Could not find pretrained model in parlai.zoo.blenderbot2.blenderbot2_90M or parlai.zoo.blenderbot2.build. Please check your spelling and make sure you've pulled from master.
When I changed to --init-model zoo:blender/blender_90M/model --dict-file zoo:blender/blender_90M/model.dict, it gave me the following error:
size mismatch for embeddings.weight: copying a param with shape torch.Size([54944, 512]) from checkpoint, the shape in current model is torch.Size([99240, 1024]).
-----------------
Could not load the model due to a size mismatch in the embeddings. A common reason for this is trying to load a model trained with fp16 but loaded without fp16. Try adding --fp16 true or --force-fp16-tokens true.
The only way I was able to run the model is using the exact same script with zoo:blenderbot2/blenderbot2_400M/model.
The following is the full script I ran to train the model:
parlai train_model -dp data \
--model projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent \
--task fromfile:parlaiformat --fromfile_datapath dataset/archi_parlai.txt \
--num_epochs 20 \
--memory-decoder-model-file "" --memory-key full_text \
--search-query-generator-model-file zoo:blenderbot2/query_generator/model --search-query-generator-beam-min-length 2 \
--save-every-n-secs 600 --validation_every_n_secs 600 --log_every_n_secs 60 \
--init-model zoo:blenderbot2/blenderbot2_400M/model --dict-file zoo:blenderbot2/blenderbot2_400M/model.dict \
--datatype train:stream \
--embeddings-scale True --variant prelayernorm --split-lines True --learn-positional-embeddings True \
--n-layers 12 --embedding-size 1024 --ffn-size 4096 --n-heads 16 --n-decoder-layers 12 \
--dict-tokenizer gpt2 --generation-model bart \
--query-model bert_from_parlai_rag \
--rag-model-type token --rag-retriever-type search_engine --search_server None \
--dpr-model-file zoo:hallucination/bart_rag_token/model \
--gold-document-titles-key select-docs-titles --insert-gold-docs True \
--beam-min-length 5 --beam-context-block-ngram 3 --beam-block-ngram 3 --beam-block-full-context False --beam-size 3 \
--inference beam --optimizer mem_eff_adam --learningrate 1e-05 --lr-scheduler-patience 1 --model-parallel True \
--knowledge-access-method memory_only --batchsize 1 \
--truncate 512 --text-truncate 512 --label-truncate 128 \
--dropout 0.0 --attention-dropout 0.0 \
--min-doc-token-length 64 --max-doc-token-length 256 \
--fp16 True --fp16-impl mem_efficient --force-fp16-tokens True \
--model-file projects/ParlAI/odkg_model
and the data format was:
text: Tell me a joke.[TAB]labels: Let me tell you my favorite joke. Why was six afraid of seven? Because seven ate nine![TAB]episode_done: True[TAB]personas: your persona: I am John.|your persona: I live in Ohio.[NEW_LINE]
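For reference, a minimal sketch that writes lines in this format ([TAB] and [NEW_LINE] above stand for real tab and newline characters; the output path matches my training command):

def fromfile_line(text, label, personas):
    # Tab-separated key:value fields, one example per line; persona entries joined by '|'.
    return (f"text:{text}\tlabels:{label}\tepisode_done:True"
            f"\tpersonas:{'|'.join(personas)}\n")

with open('dataset/archi_parlai.txt', 'w') as f:
    f.write(fromfile_line(
        'Tell me a joke.',
        'Let me tell you my favorite joke. Why was six afraid of seven? Because seven ate nine!',
        ['your persona: I am John.', 'your persona: I live in Ohio.'],
    ))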
The model trained successfully. However, the model was not able to respond based on bot personas. In my case (single-turn conversation), for the same question there are different responses for different bots based on persona. The setup I have here is that partner1 is a user with a persona and partner2 is a bot with a persona.
input:
your persona: I am John.\n
your persona: I live in Ohio.\n
Tell me a joke.
response: Let me tell you my favorite joke. Why was six afraid of seven? Because seven ate nine!
So here are a few more questions:
1. Please check the above command and let me know the best way to train the model with appropriate parameters.
2. Do you think the model is able to learn to respond differently to the same question based on personas? If so, could you please guide me on how to achieve this?
Thanks...
If you're using --memory-key full_text, then you'll want to prepend the persona lines to the text in your text field; if you're multitasking with BST and convai2, I would recommend doing that so that it's consistent across datasets. You can combine that with:
--memory-decoder-model-file '' --query-generator-model-file '' --generation-model transformer/generator --knowledge-access-method memory_only
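Concretely, prepending the persona lines might look like this (a minimal sketch; field names follow the examples above):

def prepend_personas(message):
    # Fold the personas list into the text field, one persona per line.
    personas = message.pop('personas', [])
    if personas:
        message['text'] = '\n'.join(personas) + '\n' + message['text']
    return message

prepend_personas({'id': 'partner1', 'text': 'tell me a joke.',
                  'personas': ['your persona: I am John', 'your persona: I live in Ohio.']})
# => text becomes 'your persona: I am John\nyour persona: I live in Ohio.\ntell me a joke.'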
Hi,
Thanks for your responses and guidance. I am able to train the BB2 model on the custom dataset with persona context by adding --memory-key full_text. However, in my case the 90M model doesn't work (maybe some parameter issues).
Thanks!
Hi,
I have another question regarding the dataset. Previously, I added personas for response generation. Apart from personas, how can we add dialog history to the dataset?
For example, in my current dataset given below:
- Can we add a history key with a few previous dialogs in the conversation sequence?
- Will the projects.blenderbot2.agents.blenderbot2:BlenderBot2FidAgent model also learn the history information?
{"dialog": [[{"id": "partner1", "text": "your persona: I am John.\nyour persona: I live in Ohio.\ntell me a joke."}, {"id": "partner2", "text": "One time, I put strawberry jam on my burger. I thought it was ketchup!"}]]}
{"dialog": [[{"id": "partner1", "text": "your persona: I am Ellie.\nyour persona: I live in New York.\ntell me a joke."}, {"id": "partner2", "text": "Let me tell you my favorite joke. Why was six afraid of seven? Because seven ate nine!"}]]}
Thanks
So the dialog key is a list of messages within a conversational "episode", as we call it in ParlAI. The agent will automatically remember all previous conversation within a given episode, and will reset that history after the final example within the episode is shown.
Alternatively, you could also have each entry in your dataset be a "flattened" version of the episode, where you include all the previous utterances as delimited messages in the "text" field.
Adding a history key will not do anything at the moment.
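For example, a single line containing a two-turn episode might look like this (contents illustrative; the agent carries the first exchange as history when predicting the second response):
{"dialog": [[{"id": "partner1", "text": "your persona: I am John.\nyour persona: I live in Ohio.\ntell me a joke."}, {"id": "partner2", "text": "One time, I put strawberry jam on my burger. I thought it was ketchup!"}, {"id": "partner1", "text": "tell me another one."}, {"id": "partner2", "text": "Why was six afraid of seven? Because seven ate nine!"}]]}
The flattened alternative would instead fold the earlier turns into the text field of a single example, e.g. delimited by newlines:
{"dialog": [[{"id": "partner1", "text": "your persona: I am John.\nyour persona: I live in Ohio.\ntell me a joke.\nOne time, I put strawberry jam on my burger. I thought it was ketchup!\ntell me another one."}, {"id": "partner2", "text": "Why was six afraid of seven? Because seven ate nine!"}]]}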
This issue has not had activity in 30 days. Please feel free to reopen if you have more issues. You may apply the "never-stale" tag to prevent this from happening.
Hi @klshuster ,
I have another query regarding the dataset. With the above dataset, how can we add the context of the conversation or episode? For example, the context could be a conversation topic: talk about movies, talk about work, talk about travel, etc., so that the conversation strictly follows the topic and generates topic-related dialog responses.
Here is an example of related work: personaGPT.
One way I found is making the dataset similar to blended_skill_talk, but with this setting I have a few queries:
1. Can I add the topic in the Wizard of Wikipedia topic: key?
2. If so, then how can we query the model at inference time? Currently I call either
blender_agent.observe({'text': bot_input, "personas": npc_agent_list[npc], 'episode_done': True})
or, with the persona prepended in the bot_input,
blender_agent.observe({'text': bot_input, 'episode_done': True})
Should we use the same Wizard of Wikipedia topic: key to add the topic to the query during inference?
Thanks!
If you add specialized keys beyond text, or the ones that BB2 normally uses, they will be ignored by the model. Your best bet is to either prepend the topic to the given text, or override the agent's observe function to specially process any keys you add.
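A minimal sketch of the second option (the subclass name and the topic key are hypothetical):

from projects.blenderbot2.agents.blenderbot2 import BlenderBot2FidAgent

class TopicAwareBB2Agent(BlenderBot2FidAgent):
    def observe(self, observation):
        # Fold a custom 'topic' key into the text before standard processing.
        if 'topic' in observation and observation.get('text'):
            new_text = f"topic: {observation['topic']}\n{observation['text']}"
            if hasattr(observation, 'force_set'):
                observation.force_set('text', new_text)  # parlai Message objects
            else:
                observation['text'] = new_text  # plain dicts
        return super().observe(observation)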
Hi @klshuster
Thank you so much for the response.
I have prepended the topic tag in the text, similar to the your persona tag, just before the input query.
Do you think the model will be able to learn in this way...?
{"dialog": [[{"id": "partner1", "text": "your persona: I am John.\nyour persona: I live in Ohio.\ntopic: talk about movies\ntell me a joke."}, {"id": "partner2", "text": "One time, I put strawberry jam on my burger. I thought it was ketchup!"}]]}
{"dialog": [[{"id": "partner1", "text": "your persona: I am Ellie.\nyour persona: I live in New York.\ntopic: talk about movies\ntell me a joke."}, {"id": "partner2", "text": "Let me tell you my favorite joke. Why was six afraid of seven? Because seven ate nine!"}]]}
Thanks!
if you have suitable training data, I would guess it'll learn something. Only one way to find out!
I tried this approach. The model seems to learn from the context and topics. However, it fails to handle the response for any random negative input query.
For example, in a conversation between PC and BOT, the model seems to learn and recognize the topic contexts. However, for a given input, whatever the PC input is (positive or negative, i.e. random), it always generates the output response based on the script.
Since the data does not have negative samples, the model seems to learn only positive responses.
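One possible mitigation would be to add negative samples that map off-topic inputs to an explicit deflection response (the deflection text here is made up), e.g.:
{"dialog": [[{"id": "partner1", "text": "your persona: I am Ellie.\nyour persona: I live in New York.\ntopic: talk about movies\nwhat is the capital of France?"}, {"id": "partner2", "text": "Sorry, I'd rather talk about movies."}]]}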
I want to train the BB2 model on a custom dataset with bot personas. I have the dataset in question-answer dialog format. I prepared a ParlAI-format dataset using the guidance given here.
I am able to generate the task from the JSON file by following the instructions. However, I am curious how I can add persona context, as the format described in the guide only contains input-response pairs. For example:
{"id": "partner1", "text": "hello how are you today?"}, {"id": "partner2", "text": "i'm great thanks! what are you doing?"}
PS. I want to add persona information because each bot is supposed to generate different responses to the same question based on its persona context.
Thanks.