guxd / C-DNPG

Data and code for the paper "Continuous Decomposition of Granularity for Neural Paraphrase Generation"

Running inference #3

Open JohnMe opened 1 year ago

JohnMe commented 1 year ago

Thanks for open sourcing the code. I managed to train a model, however I cannot run inference on a simple sentence, since it seems I am not calling the model correctly. I would be glad if you could add a simple example of how to run basic inference...

model = GATransformer(config)
model.from_pretrained(modelDir)
model.to(device)
sentence = "Some cool sentence..."
inputs = model.tokenizer.encode(sentence, return_tensors="pt")
outputs = model(inputs)
print('outputs', outputs)

TypeError: forward() missing 2 required positional arguments: 'context_attn_mask' and 'response'

guxd commented 1 year ago

A simple illustration (needs bug fixes):

model = GATransformer(config)
model.from_pretrained(modelDir)
model.to(device)
sentence = "Some cool sentence..."
inputs = model.tokenizer.encode(sentence, return_tensors="pt")
if inputs[0]!=model.tokenizer.cls_token_id: inputs = [model.tokenizer.cls_token_id] + inputs
if inputs[-1]!=model.tokenizer.sep_token_id: inputs = inputs + [model.tokenizer.sep_token_id]
inputs = inputs.unsqueeze(0) # a new dim of batch
seq_len = min(len(inputs), args.src_maxlen)
attn_mask=torch.ones([1, seq_len], datatype=torch.long)
ground_truth = torch.zeros([1, args.tar_maxlen], data_type=torch.long) # create a dummy ground_truth paraphrase to adapt to the existing code. 
batch = [inputs, attn_mask, ground_truth]
sample_words, sample_lens, source, gt_paraphrase, granularity = model.generate(batch, max_len=args.tar_maxlen, beam_size = args.beam_size)
print('outputs', sample_words)
JohnMe commented 1 year ago

Hello guxd, thanks a lot for the hint. With some changes I managed to run the model, but it seems the input is not exactly correct:

model.tokenizer.cls_token_id 101
model.tokenizer.sep_token_id 102
inputs tensor([[ 101, 2070, 4658, 6251, 1012, 1012, 1012, 102]])
len(inputs) 8
seq_len 8

But in a later step (def forward) this results in:

input_ids tensor([[101], [101], [101], [101], [101], [101], [101], [101]])
position_ids None
token_type_ids tensor([[0], [0], [0], [0], [0], [0], [0], [0]])
inputs_embeds None

Which obviously gives wrong results in the end...

I guess there is something wrong with the dimensions of the tensors above?

guxd commented 1 year ago

Please add the special tokens (CLS, SEP) via the tokenizer.encode(...) function. I have removed the two statements that manually added these two special tokens.

model = GATransformer(config)
model.from_pretrained(modelDir)
model.to(device)
sentence = "Some cool sentence..."
inputs = model.tokenizer.encode(sentence, add_special_tokens=True, return_tensors="pt")
inputs = inputs.unsqueeze(0) # a new dim of batch
seq_len = min(inputs.size(1), args.src_maxlen)
attn_mask=torch.ones([1, seq_len], datatype=torch.long)
ground_truth = torch.zeros([1, args.tar_maxlen], data_type=torch.long) 
batch = [inputs, attn_mask, ground_truth]
sample_words, sample_lens, source, gt_paraphrase, granularity = model.generate(batch, max_len=args.tar_maxlen, beam_size = args.beam_size)
print('outputs', sample_words)
JohnMe commented 1 year ago

I had to change the following lines in your example; then it at least proceeds:

attn_mask = torch.ones([1, seq_len], dtype=torch.long)
ground_truth = torch.zeros([1, config.tar_maxlen], dtype=torch.long)

But it seems you are using a different torch version? Please let me know which one you are using, since it is not in the requirements.yaml.

Currently I use torch version 1.11.0+cu102.

But it crashes:

input_batch[:2] [tensor([[[ 101, 2070, 4658, 6251, 1012, 1012, 1012, 102]]]), tensor([[1]])]
context: tensor([[[ 101, 2070, 4658, 6251, 1012, 1012, 1012, 102]]])
Traceback (most recent call last):
  File "/mnt/8tb/para-models/C-DNPG/try2.py", line 42, in <module>
    sample_words, sample_lens, source, gt_paraphrase, granularity = model.generate(batch, max_len=config.tar_maxlen, beam_size = config.beam_size)
  File "/mnt/8tb/para-models/C-DNPG/model.py", line 730, in generate
    batch_size, max_ctx_len = context.size()
ValueError: too many values to unpack (expected 2)

The part with tensor([[1]]) also seems bogus to me...

Thanks a lot

guxd commented 1 year ago

The dimensionality of context is 3 instead of 2, so you can try to remove inputs = inputs.unsqueeze(0) in my example.
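
To see why, a minimal sketch (assuming the tokenizer is a standard HuggingFace-style BERT tokenizer, as in the examples above):

inputs = model.tokenizer.encode(sentence, add_special_tokens=True, return_tensors="pt")
print(inputs.shape)  # already torch.Size([1, seq_len]) -- the batch dimension is included
# inputs = inputs.unsqueeze(0)  # would give torch.Size([1, 1, seq_len]) and break
# batch_size, max_ctx_len = context.size() inside model.generate(...)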

guxd commented 1 year ago

I used PyTorch 1.3 in my experiments.

JohnMe commented 1 year ago

Ok, I'll try with 1.3 then. However, that means I have to retrain, since my model was trained in 1.11 and cannot be loaded correctly from 1.3.

Trying the run with 1.11 produced only:

Result: [CLS]tology floodsiferous oak festivals 1766 insult [unused190]aged containerzekle magnificentote 1782 chrysler defending hunan £10

sentence = "How do I book a flight to miami?"

seq_len = min(len(inputs[0]), config.src_maxlen)
attn_mask = torch.ones([1, seq_len], dtype=torch.long)


context tensor([[ 101, 2129, 2079, 1045, 2338, 1037, 3462, 2000, 5631, 1029, 102]])
context_attn_mask tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
input_ids tensor([[ 101, 2129, 2079, 1045, 2338, 1037, 3462, 2000, 5631, 1029, 102]])

guxd commented 1 year ago

It seems that tokenization works well in the current example. You can also try to adapt my example to 1.11 and then run inference without re-training.
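
For reference, here is a rough, untested sketch of the whole inference call with the fixes from this thread applied (the config.src_maxlen / config.tar_maxlen / config.beam_size names follow the usage above and may differ in your setup):

import torch

model = GATransformer(config)
model.from_pretrained(modelDir)
model.to(device)
model.eval()  # inference mode
sentence = "Some cool sentence..."
# add_special_tokens=True prepends [CLS] and appends [SEP];
# return_tensors="pt" already returns a [1, seq_len] tensor, so no unsqueeze(0) is needed
inputs = model.tokenizer.encode(sentence, add_special_tokens=True, return_tensors="pt")
seq_len = min(inputs.size(1), config.src_maxlen)
attn_mask = torch.ones([1, seq_len], dtype=torch.long)  # dtype, not datatype
ground_truth = torch.zeros([1, config.tar_maxlen], dtype=torch.long)  # dummy target to fit the existing interface
batch = [inputs, attn_mask, ground_truth]
sample_words, sample_lens, source, gt_paraphrase, granularity = model.generate(batch, max_len=config.tar_maxlen, beam_size=config.beam_size)
print('outputs', sample_words)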

JohnMe commented 1 year ago

Unfortunately, the above is the result of running it in 1.11. Training was also done in 1.11 until the end.

python main.py --dataset quora --model GATransformer --model_size dnpg-default --per_gpu_train_batch_size 32 --learning_rate 5e-5 --src_maxlen 20 --tar_maxlen 20 --beam_size 8 --max_steps 400000 --validating_steps 5000 --start_eval 20000

guxd commented 1 year ago

How about the metric values? Are they normal? Did you check the generated paraphrases in the output/results directory?

JohnMe commented 1 year ago

The generated sentences seem fine, but it also seems the model is mostly copying...

Sample 19998:
Context 0 >> how does a long distance relationship work? [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
Target >> how are long distance relationships maintained?
Generated 0 >> : how does a long distance relationship work?

Sample 19999:
Context 0 >> what does jainism say about homosexuality? [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
Target >> what does jainism say about gays and homosexuality?
Generated 0 >> : what does jainism say about homosexuality?

Sample 20000:
Context 0 >> do you believe there is life after death? [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
Target >> is it true that there is life after death?
Generated 0 >> : what is life after death?

avg_len = 12.1136
bleu2 = 0.49570343748313955
bleu4 = 0.2786568019299539
ibleu = 0.22155279535959968
meteor = 0.5430145044918927
perplexity = 4.391942977905273
rouge-L = 0.4813490605353764