huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

RuntimeError in "run_summarization": expected device cuda:0 and dtype Byte but got device cuda:0 and dtype Bool #2297

Closed: junxu-ai closed this issue 4 years ago

junxu-ai commented 4 years ago

🐛 Bug

Model I am using (Bert, XLNet....): bertabs-finetuned-cnndm-extractive-abstractive-summarization-pytorch_model

Language I am using the model on (English, Chinese....): English

The problem arises when using:

The task I am working on is:

To Reproduce

Steps to reproduce the behavior:

  1. python run_summarization.py --documents_dir .\data --summaries_output_dir .\output

Expected behavior

Environment

Additional context

The error comes from modeling_bertabs.py, line 328, in forward.
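For context, this class of error typically comes from mixing a Byte (uint8) mask with a Bool mask in a single op; here is a rough illustration of the mismatch (placeholder tensors only, not the actual code at line 328):

```python
import torch

# Placeholder masks, not the actual variables in modeling_bertabs.py:
# `.eq()` yields a Bool mask on torch >= 1.2.0, while a mask built as
# uint8 stays a Byte tensor.
byte_mask = torch.triu(torch.ones(4, 4, dtype=torch.uint8), diagonal=1)
bool_mask = torch.zeros(4, 4).eq(0)

try:
    # On torch 1.2.0 combining the two dtypes in one op raises roughly
    # "expected ... dtype Byte but got ... dtype Bool"; newer releases
    # promote the dtypes instead of raising.
    print((byte_mask + bool_mask).dtype)
except RuntimeError as err:
    print(err)
```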

junxu-ai commented 4 years ago

Someone suggested using PyTorch v1.1.0 instead of 1.2.0, but I'm not sure whether that is an acceptable fix.
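For reference, a quick way to check which behaviour a given installation has (a minimal sketch; it just prints the dtype the comparison ops return):

```python
import torch

# Comparison ops such as torch.gt return uint8 tensors on PyTorch 1.1.0
# and bool tensors on 1.2.0 and later, which is what trips up BertAbs here.
print(torch.__version__)
print(torch.gt(torch.tensor([1]), torch.tensor([0])).dtype)
```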

junxu-ai commented 4 years ago

It's a version inconsistency issue: the comparison ops changed their return dtype between PyTorch releases. In 1.1.0, torch.gt returns a uint8 (Byte) tensor:

    torch.gt(torch.tensor([[1, 2], [3, 4]]), torch.tensor([[1, 1], [4, 4]]))
    # tensor([[0, 1], [0, 0]], dtype=torch.uint8)

while in 1.2.0 the comparison ops return a Bool tensor:

    torch.ge(torch.tensor([[1, 2], [3, 4]]), torch.tensor([[1, 1], [4, 4]]))
    # tensor([[True, True], [False, True]])
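A minimal sketch of a workaround that behaves the same on both versions is to cast both masks to one dtype before combining them. The tensor names below are placeholders, and this is not necessarily what the linked fix does:

```python
import torch

# Placeholder decoder masks: the padding mask from `.eq()` is Bool on
# torch >= 1.2.0, while a precomputed triangular mask kept as uint8
# stays a Byte tensor.
tokens = torch.tensor([[5, 7, 0], [3, 0, 0]])
pad_mask = tokens.eq(0).unsqueeze(1).expand(2, 3, 3)
subsequent_mask = torch.triu(torch.ones(3, 3, dtype=torch.uint8), diagonal=1)

# Casting the Bool mask back to uint8 before the addition keeps the
# expression valid on both 1.1.0 and 1.2.0.
dec_mask = torch.gt(pad_mask.to(torch.uint8) + subsequent_mask, 0)
print(dec_mask)
```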

junxu-ai commented 4 years ago

Please see pull request #2369.