Closed by josharnoldjosh 4 years ago
BlenderBot is a TransformerGeneratorAgent, so you don't really need to override anything. You just need to create it from the model file. There's a utility, parlai.core.agents.create_agent_from_model_file, that you can use:
from parlai.core.agents import create_agent_from_model_file

bb = create_agent_from_model_file("data/models/blender/blender_90M/model")
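From there the usual observe/act cycle should give you a reply, e.g. (assuming the 90M checkpoint has already been downloaded to that path):

bb.observe({'text': 'Hello', 'episode_done': False})
reply = bb.act()
print(reply.get('text'))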
@stephenroller thanks for the fast response! I was hoping to override reorder_decoder_incremental_state, reorder_encoder_states, and compute_loss, for example, since I'm hoping to eventually apply reinforcement learning to fine-tune Blender. In that case, how would I go about overriding specific methods of the Blender model class while still getting it to work? The act() function doesn't seem to work when I simply subclass TransformerGeneratorAgent and load the Blender opt. Thanks a lot for your time!
Oh I see. Subclass TransformerGeneratorAgent and use all the same arguments as if you're fine-tuning, except for the argument to --model/-m.
Depending on what you add, you might want to change load_state_dict to use strict=False.
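A rough sketch of that (the class name is arbitrary, and compute_loss simply defers to the parent until you plug in your RL objective):

from parlai.agents.transformer.transformer import TransformerGeneratorAgent

class RLBlenderAgent(TransformerGeneratorAgent):
    def load_state_dict(self, state_dict):
        # strict=False lets the pretrained checkpoint load even if you add new parameters
        self.model.load_state_dict(state_dict, strict=False)

    def compute_loss(self, batch, return_output=False):
        # start from the standard generator loss; replace or augment this for RL later
        return super().compute_loss(batch, return_output=return_output)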
Thanks a bunch!
Sorry, I don't mean to bother you @stephenroller, but I think I'm making a mistake somewhere. I've tried using the arguments from the ParlAI Recipes page:
"parlai train_model -t blended_skill_talk,wizard_of_wikipedia,convai2:normalized -m transformer/generator --multitask-weights 1,3,3,3 --init-model zoo:tutorial_transformer_generator/model --dict-file zoo:tutorial_transformer_generator/model.dict --embedding-size 512 --n-layers 8 --ffn-size 2048 --dropout 0.1 --n-heads 16 --learn-positional-embeddings True --n-positions 512 --variant xlm --activation gelu --skip-generation True --fp16 True --text-truncate 512 --label-truncate 128 --dict-tokenizer bpe --dict-lower True -lr 1e-06 --optimizer adamax --lr-scheduler reduceonplateau --gradient-clip 0.1 -veps 0.25 --betas 0.9,0.999 --update-freq 1 --attention-dropout 0.0 --relu-dropout 0.0 --skip-generation True -vp 15 -stim 60 -vme 20000 -bs 16 -vmt ppl -vmm min --save-after-valid True --model-file /tmp/test_train_90M"
I parsed them like so:
opt = {
"no_cuda": True,
"task": "internal:blended_skill_talk,wizard_of_wikipedia,convai2,empathetic_dialogues",
"multitask_weights": [
1.0,
3.0,
3.0,
3.0
],
"init_model": "zoo:tutorial_transformer_generator/model",
"dict_file":"zoo:tutorial_transformer_generator/model.dict",
"embedding_size": 512,
"n_layers": 8,
"ffn_size": 2048,
"dropout": 0.1,
"n_heads": 16,
"learn_positional_embeddings": True,
"n_positions": 512,
'variant': 'xlm',
'activation': 'gelu',
'skip_generation': True,
'fp16': True,
'text_truncate': 512,
'label_truncate': 128,
'dict_tokenizer': 'bpe',
'dict_lower': True,
'lr': 1e-06,
'optimizer': 'adamax',
'lr_scheduler': 'reduceonplateau',
'gradient_clip': 0.1,
'veps': 0.25,
"betas": [
0.9,
0.999
],
"update_freq": 1,
"attention_dropout": 0.0,
"relu_dropout": 0.0,
"skip_generation": True,
'vp': 15,
'stim': 60,
'vme': 20000,
'bs': 16,
'vmt': 'ppl',
'vmm': 'min',
'save_after_valid': True,
'model_file': '/tmp/test_train_90M',
}
And the following code gives me the error:
blender = Blender(opt=opt)
blender.reset()
blender.observe({'text': 'Hello', 'episode_done': False})
result = blender.act()
Traceback (most recent call last):
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/rl_pipeline/agent.py", line 82, in <module>
blender = Blender(opt=opt)
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/rl_pipeline/agent.py", line 30, in __init__
super().__init__(opt, shared)
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_generator_agent.py", line 445, in __init__
super().__init__(opt, shared)
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_agent.py", line 728, in __init__
self.dict = self.build_dictionary()
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_agent.py", line 812, in build_dictionary
d = self.dictionary_class()(self.opt)
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/dict.py", line 272, in __init__
opt['dict_file'] = modelzoo_path(opt.get('datapath'), opt['dict_file'])
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/build_data.py", line 424, in modelzoo_path
my_module.download(datapath)
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/zoo/tutorial_transformer_generator/build.py", line 20, in download
mdir = os.path.join(get_model_dir(datapath), model_name)
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/build_data.py", line 349, in get_model_dir
return os.path.join(datapath, 'models')
File "/Users/josharnold/opt/anaconda3/envs/biases/lib/python3.7/posixpath.py", line 80, in join
a = os.fspath(a)
**TypeError: expected str, bytes or os.PathLike object, not NoneType**
Can't I just load model.dict.opt or model.opt directly as the opt instead?
Loading the file directly:
opt = "./data/models/blender/blender_90M/model.dict.opt"
opt = json.load(open(opt, 'r'))
Gives me the following error:
Traceback (most recent call last):
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/rl_pipeline/agent.py", line 37, in <module>
result = blender.act()
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_agent.py", line 1929, in act
self.self_observe(response)
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_agent.py", line 1747, in self_observe
last_reply = self_message['text']
**KeyError: 'text'**
And loading just the model.opt gives me:
opt = "./data/models/blender/blender_90M/model.opt"
opt = json.load(open(opt, 'r'))
Traceback (most recent call last):
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/rl_pipeline/agent.py", line 37, in <module>
result = blender.act()
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_agent.py", line 1928, in act
response = self.batch_act([self.observation])[0]
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_agent.py", line 1978, in batch_act
output = self.eval_step(batch)
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_generator_agent.py", line 860, in eval_step
beam_preds_scores, _ = self._generate(batch, self.beam_size, maxlen)
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_generator_agent.py", line 1085, in _generate
b.advance(score[i])
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_generator_agent.py", line 1351, in advance
logprobs, self.scores, current_length
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_generator_agent.py", line 1532, in select_paths
best_scores, best_idxs = torch.topk(flat_beam_scores, self.beam_size, dim=-1)
**RuntimeError: selected index k out of range**
Looks like you need to give a datapath value in the opt, which is a directory where it is safe to download models, data, etc.
Otherwise your excerpts seem okay from a quick glance.
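For example (the directory below is just an example; any writable path works):

import os

opt['datapath'] = os.path.abspath('./custom/data/')  # where ParlAI may download models/data
os.makedirs(opt['datapath'], exist_ok=True)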
Thank you for the fast response @stephenroller. It's super weird, but I added in a datapath and I'm still getting an issue. I think I must be missing something else, right?
class Blender(TransformerGeneratorAgent):
def __init__(self, opt: Opt = None, shared: TShared = None):
# if not opt:
# opt = "./data/models/blender/blender_90M/model.opt"
# opt = json.load(open(opt, 'r'))
# opt['data_path'] = './custom/data/'
# print(opt)
super().__init__(opt, shared)
opt = {
"no_cuda": True,
"task": "internal:blended_skill_talk,wizard_of_wikipedia,convai2,empathetic_dialogues",
"multitask_weights": [
1.0,
3.0,
3.0,
3.0
],
"init_model": "zoo:tutorial_transformer_generator/model",
"dict_file":"zoo:tutorial_transformer_generator/model.dict",
"embedding_size": 512,
"n_layers": 8,
"ffn_size": 2048,
"dropout": 0.1,
"n_heads": 16,
"learn_positional_embeddings": True,
"n_positions": 512,
'variant': 'xlm',
'activation': 'gelu',
'skip_generation': True,
'fp16': True,
'text_truncate': 512,
'label_truncate': 128,
'dict_tokenizer': 'bpe',
'dict_lower': True,
'lr': 1e-06,
'optimizer': 'adamax',
'lr_scheduler': 'reduceonplateau',
'gradient_clip': 0.1,
'veps': 0.25,
"betas": [
0.9,
0.999
],
"update_freq": 1,
"attention_dropout": 0.0,
"relu_dropout": 0.0,
"skip_generation": True,
'vp': 15,
'stim': 60,
'vme': 20000,
'bs': 16,
'vmt': 'ppl',
'vmm': 'min',
'save_after_valid': True,
'model_file': '/tmp/test_train_90M',
'datapath': './custom/data/',
'history_size': -1,
'truncate': -1,
'rank_candidates': False,
'embeddings_scale': True,
'output_scaling': 1.0,
'embedding_type': 'random',
'gpu': -1
}
blender = Blender(opt=opt)
blender.reset()
blender.observe({'text': 'Hello', 'episode_done': False})
result = blender.act()
Error:
Traceback (most recent call last):
File "/Users/josharnold/Desktop/ParlAI/custom/blender_test.py", line 94, in <module>
result = blender.act()
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_agent.py", line 1929, in act
self.self_observe(response)
File "/Volumes/GoogleDrive/My Drive/Research (Drive)/Paper 3 (Drive)/Code/ParlAI/parlai/core/torch_agent.py", line 1747, in self_observe
last_reply = self_message['text']
KeyError: 'text'
Ah, swap skip_generation to False: with generation skipped, eval_step never produces any text, so act() returns a message without a 'text' key, hence the KeyError.
Thanks a bunch! My final opt is:
opt = {
"no_cuda": True,
"task": "internal:blended_skill_talk,wizard_of_wikipedia,convai2,empathetic_dialogues",
"multitask_weights": [
1.0,
3.0,
3.0,
3.0
],
"init_model": "./data/models/blender/blender_90M/model",
"dict_file":"./data/models/blender/blender_90M/model.dict",
"embedding_size": 512,
"n_layers": 8,
"ffn_size": 2048,
"dropout": 0.1,
"n_heads": 16,
"learn_positional_embeddings": True,
"n_positions": 512,
'variant': 'xlm',
'activation': 'gelu',
'fp16': True,
'text_truncate': 512,
'label_truncate': 128,
'dict_tokenizer': 'bpe',
'dict_lower': True,
'lr': 1e-06,
'optimizer': 'adamax',
'lr_scheduler': 'reduceonplateau',
'gradient_clip': 0.1,
'veps': 0.25,
"betas": [
0.9,
0.999
],
"update_freq": 1,
"attention_dropout": 0.0,
"relu_dropout": 0.0,
"skip_generation": False,
'vp': 15,
'stim': 60,
'vme': 20000,
'bs': 16,
'vmt': 'ppl',
'vmm': 'min',
'save_after_valid': True,
'model_file': '/tmp/test_train_90M',
'datapath': './custom/data/',
'history_size': -1,
'truncate': -1,
'rank_candidates': False,
'embeddings_scale': True,
'output_scaling': 1.0,
'embedding_type': 'random',
'gpu': -1
}
Thanks for posting for posterity.
Hello,
I'm trying to subclass TransformerGeneratorAgent to use with the Blender model, but I'm not sure what other functions I need to override. My final goal is to tweak some things to get reinforcement learning working with Blender.
This is what I have so far:
What might I be missing? I've been trying to look at this tutorial here: https://colab.research.google.com/drive/1bRMvN0lGXaTF5fuTidgvlAl-Lb41F7AD#scrollTo=hVrZh-T903wh
Thanks!