thunlp / KnowledgeablePromptTuning

kpt code

Received a fatal error!! #16

Open znsoftm opened 2 years ago

znsoftm commented 2 years ago

Running fewshot.py produces the fatal error below.

pytorch 1.10.0

for step, inputs in enumerate(train_dataloader):
    if use_cuda:
        inputs = inputs.cuda()
    logits = prompt_model(inputs)
    labels = inputs['label']
    loss = loss_func(logits, labels)
    loss.backward()  # it causes the fatal error.
    torch.nn.utils.clip_grad_norm_(prompt_model.parameters(), 1.0)

    tot_loss += loss.item()

    optimizer1.step()

one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [20, 1076]], which is output 0 of SoftmaxBackward0, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
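
For context, this class of error usually means that a tensor autograd saved for the backward pass (here the output of a softmax) was changed in place before `loss.backward()` ran. The sketch below is not taken from the KPT or OpenPrompt code. It is a minimal, assumed reproduction of the same kind of failure, with hypothetical function names and a tensor shape borrowed from the message above, followed by the out-of-place variant that avoids it.

```python
import torch


def inplace_after_softmax(x):
    """Hypothetical example: modify a softmax output in place."""
    probs = torch.softmax(x, dim=-1)
    # In-place add changes the tensor that softmax saved for its backward pass.
    probs += 1e-8
    return torch.log(probs).sum()


def out_of_place_after_softmax(x):
    """Same computation, but the softmax output is left untouched."""
    probs = torch.softmax(x, dim=-1)
    # Out-of-place add creates a new tensor instead of editing the saved one.
    probs = probs + 1e-8
    return torch.log(probs).sum()


# Shape copied from the error message; the values are random.
x = torch.randn(20, 1076, requires_grad=True)

try:
    inplace_after_softmax(x).backward()
    failed = False
except RuntimeError as err:
    failed = True
    print(err)

# The out-of-place version backpropagates without complaint.
out_of_place_after_softmax(x).backward()
```

If the failing code follows this pattern, the usual remedy is to replace the in-place operation (`+=`, `*=`, or methods ending in an underscore) with its out-of-place form, or to call `.clone()` before modifying the tensor. Whether that is what happens inside fewshot.py or OpenPrompt has not been confirmed in this thread.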

znsoftm commented 2 years ago

From what I have read, this only works on PyTorch 1.4 or below. I hope somebody can fix it.

ShengdingHu commented 2 years ago

I run it on PyTorch 1.9.0, so it is strange that 1.4 or below would be required to avoid the bug. Can you create a fresh virtual environment and test it there?

znsoftm commented 2 years ago

Does it really work on PyTorch 1.9.0? Creating a new virtual environment is difficult for me. PyTorch 1.4 or below is not compatible with CUDA 11.7, so I guess I would need to downgrade my CUDA installation (not cudatoolkit)?

znsoftm commented 2 years ago

I have verified that PyTorch 1.9 does not work either. Setup: PyTorch 1.9, CUDA 11.7, Python 3.8, NVIDIA 3090.

znsoftm commented 2 years ago

Maybe this should be fixed in OpenPrompt.

znsoftm commented 2 years ago

With torch 1.4, transformers complains: "torch>=1.5.0 is required for a normal functioning of this module, but found torch==1.4.0+cu92". Could you please tell us what configuration you are using?
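
To make configurations easier to compare across machines, a small script along these lines could be run in each environment. It is only a sketch: the function name `report_versions` is made up for illustration, and it assumes the packages were installed under the distribution names `torch`, `transformers`, and `openprompt`.

```python
import platform
from importlib import metadata


def report_versions(packages=("torch", "transformers", "openprompt")):
    """Return a dict mapping each name to its installed version, or None."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

This reads package metadata only, so it does not report the CUDA version. Where torch is installed, `torch.version.cuda` gives the CUDA version that build was compiled against.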

znsoftm commented 2 years ago

With openprompt 1.0.1, PyTorch 1.9, and CUDA 11.6/11.7, fewshot.py does not work. PyTorch 1.4 is not an option, because transformers 4.2 requires PyTorch 1.5 or later. Please verify this combination: PyTorch 1.9 or above, CUDA 11.6/11.7, openprompt 1.0.1 (up to date).

Knightzhr commented 1 year ago

Can you run fewshot.py? I have the same error.

znsoftm commented 1 year ago

No, we cannot.

BaoZi-chu commented 1 year ago

I had the same issue. Did anyone fix it?