microsoft / CodeXGLUE

error while testing code completion #74

Closed · saichandrapandraju closed this 3 years ago

saichandrapandraju commented 3 years ago

Hi,

I was trying line-level code completion and was able to train the model (CodeGPT-java) successfully. But while running inference, I get the error below: TypeError: tuple indices must be integers or slices, not tuple.

Here is the full trace -

Traceback (most recent call last):
  File "run_lm.py", line 656, in <module>
    main()
  File "run_lm.py", line 652, in main
    eval_line_completion(args, model, tokenizer, file_type="test")
  File "run_lm.py", line 381, in eval_line_completion
    past_hidden = [x[:, i:i+1].expand(-1, beam_size, -1, -1, -1) for x in outputs]
  File "run_lm.py", line 381, in <listcomp>
    past_hidden = [x[:, i:i+1].expand(-1, beam_size, -1, -1, -1) for x in outputs]
TypeError: tuple indices must be integers or slices, not tuple

Please suggest how to proceed further.
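
For context, the failing expression slices with [:, i:i+1], which only works on tensors; if an element of outputs is a plain Python tuple, the same TypeError can be reproduced in isolation (minimal illustration, not taken from run_lm.py):

# The multi-dimensional slice [:, 0:1] builds a tuple index (slice(None), slice(0, 1)).
# Tensors accept such an index, but plain tuples do not, hence the error above.
x = (1, 2, 3)
x[0:1]     # fine: a single slice
x[:, 0:1]  # TypeError: tuple indices must be integers or slices, not tuple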

skye95git commented 3 years ago

Hi, I am facing the same problem. Have you solved it?

saichandrapandraju commented 3 years ago

No @skye95git, still waiting for a resolution.

celbree commented 3 years ago

Hi @saichandrapandraju and @skye95git, Hugging Face changed the GPT-2 model's output format in newer versions of transformers, which causes this issue. I suggest you downgrade transformers to 3.3.0 to run our code. We are also planning to adapt our code to the newest version of transformers in the near future.
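
For reference, the downgrade can be done with pip (pip install transformers==3.3.0); a small, purely illustrative sanity check before running run_lm.py could look like this:

# Assumption: you want to verify the pinned version installed by
#   pip install transformers==3.3.0
import transformers

if not transformers.__version__.startswith("3.3"):
    raise RuntimeError(
        f"run_lm.py expects transformers 3.3.0, found {transformers.__version__}"
    )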

LIANGQINGYUAN commented 2 years ago

I noticed a slight difference in the data returned by GPT-2 between transformers 3.3.0 and the latest version (4.13.0).

We need past_key_values as the past_hidden tensors to compute the next token in the code.

In 3.3.0, past_key_values is a List[torch.FloatTensor] with n_layers elements; each element has shape (2, batch_size, num_heads, sequence_length, embed_size_per_head).


In 4.13.0, past_key_values is a Tuple[Tuple[torch.FloatTensor]]. The outer tuple again has n_layers elements, and each element is itself a tuple of two tensors, each with shape (batch_size, num_heads, sequence_length, embed_size_per_head).


This means that if we concatenate the two tensors of the inner tuple along a new 0th dimension, we get the same layout as in 3.3.0.
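
A small sketch (sizes are made up) showing that stacking the key and value tensors of the 4.13.0 layout along a new 0th dimension reproduces the 3.3.0 layout:

import torch

batch_size, num_heads, seq_len, head_dim = 1, 12, 5, 64

# transformers 3.3.0: one tensor per layer, shape (2, batch, heads, seq, head_dim)
old_layer = torch.randn(2, batch_size, num_heads, seq_len, head_dim)

# transformers 4.13.0: a (key, value) tuple per layer, each (batch, heads, seq, head_dim)
new_layer = (old_layer[0], old_layer[1])

# Concatenating key and value along a new 0th dimension restores the 3.3.0 layout.
restored = torch.cat([new_layer[0].unsqueeze(0), new_layer[1].unsqueeze(0)], dim=0)
assert restored.shape == old_layer.shape
assert torch.equal(restored, old_layer)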

The code changes as follows:

old code:

past_hidden = [x[:, i:i+1].expand(-1, beam_size, -1, -1, -1) for x in outputs]

new code:

# Normalize each layer's cache: in transformers >= 4.x it is a (key, value) tuple, so stack the pair into one tensor.
past = [torch.cat([x[0].unsqueeze(0), x[1].unsqueeze(0)], dim=0) if isinstance(x, tuple) else x for x in outputs]
# Reorder the cached states along the beam dimension according to beam.getCurrentOrigin().
past_hidden = [x.data.index_select(1, beam.getCurrentOrigin()) for x in past]

This code is compatible with both 3.3.0 and 4.13.0.
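
To illustrate the second line, here is a self-contained sketch of the beam reordering step with made-up sizes; it assumes beam.getCurrentOrigin() returns a LongTensor of beam indices, which the stand-in origin tensor mimics here:

import torch

beam_size, num_heads, seq_len, head_dim = 4, 12, 5, 64
# Two layers of normalized cache, each already in the (2, beam, heads, seq, head_dim) layout.
past = [torch.randn(2, beam_size, num_heads, seq_len, head_dim) for _ in range(2)]

origin = torch.tensor([2, 0, 0, 3])  # stand-in for beam.getCurrentOrigin()
past_hidden = [x.data.index_select(1, origin) for x in past]

assert past_hidden[0].shape == past[0].shape
# Beam slot 0 now carries the cache that previously belonged to beam slot 2.
assert torch.equal(past_hidden[0][:, 0], past[0][:, 2])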