openai / gpt-2-output-dataset

Dataset of GPT-2 outputs for research in detection, biases, and more
MIT License

RuntimeError: Error(s) in loading state_dict for RobertaForSequenceClassification #28

Open greenunknown opened 3 years ago

greenunknown commented 3 years ago

I get the following error after trying to run

pip install -r requirements.txt
python -m detector.server detector-base.pt

Error:

RuntimeError: Error(s) in loading state_dict for RobertaForSequenceClassification:
        Missing key(s) in state_dict: "roberta.embeddings.position_ids".
        Unexpected key(s) in state_dict: "roberta.pooler.dense.weight", "roberta.pooler.dense.bias"

I'm not sure whether something changed between the Google version and the Azure version of the RoBERTa weights.
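
For readers unfamiliar with this message: a strict state_dict load compares the key names the model expects against the key names stored in the checkpoint, and reports both differences. The sketch below reproduces that comparison in plain Python, without torch. The function name `diff_state_dict_keys` and the shortened key lists are hypothetical and only mirror the error above; this is an illustration of the idea, not PyTorch's implementation.

```python
from typing import Iterable, List, Tuple


def diff_state_dict_keys(
    model_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected) key names, sorted for stable output.

    missing:    keys the model expects but the checkpoint lacks.
    unexpected: keys the checkpoint has but the model does not define.
    """
    model_set = set(model_keys)
    checkpoint_set = set(checkpoint_keys)
    missing = sorted(model_set - checkpoint_set)
    unexpected = sorted(checkpoint_set - model_set)
    return missing, unexpected


# Hypothetical, heavily shortened key lists that mirror the error above.
model_keys = [
    "roberta.embeddings.position_ids",
    "roberta.embeddings.word_embeddings.weight",
    "classifier.dense.weight",
]
checkpoint_keys = [
    "roberta.embeddings.word_embeddings.weight",
    "roberta.pooler.dense.weight",
    "roberta.pooler.dense.bias",
    "classifier.dense.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("Missing key(s):", missing)
print("Unexpected key(s):", unexpected)
```

With a real model, the first argument would come from `model.state_dict().keys()` and the second from the keys of the state dict stored in the checkpoint file. A mismatch like this one usually means the installed library builds a slightly different model layout than the one the checkpoint was saved from.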

Thanks for the help!

jongwook commented 3 years ago

Hi, it's our bad that we didn't properly specify the dependency versions. Could you try with transformers==2.9.1 and see if that loads properly?

greenunknown commented 3 years ago

Thank you @jongwook! Changing transformers>=2.0.0 to transformers==2.9.1 in requirements.txt resolved the issue.
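
The same edit can be scripted if the pin needs to be applied repeatedly, for example when setting up fresh environments. This is a minimal sketch: the helper name `pin_requirement` is hypothetical, and every line in the sample list other than the transformers one is an illustrative placeholder rather than the repository's actual requirements.txt.

```python
import re
from typing import List


def pin_requirement(lines: List[str], package: str, version: str) -> List[str]:
    """Replace any version specifier for `package` with an exact `==version` pin.

    Lines for other packages are returned unchanged.
    """
    # Package name at the start of the line, followed by either a version
    # specifier (>=, ==, ~=, ...) or nothing at all.
    pattern = re.compile(
        rf"^\s*{re.escape(package)}\s*(?:[<>=!~].*)?$", re.IGNORECASE
    )
    return [
        f"{package}=={version}" if pattern.match(line) else line
        for line in lines
    ]


# Illustrative input; only the transformers line is taken from this thread.
requirements = ["transformers>=2.0.0", "torch>=1.2.0", "tqdm>=4.32.2"]
pinned = pin_requirement(requirements, "transformers", "2.9.1")
print("\n".join(pinned))
```

To apply it to a file, read requirements.txt with `splitlines()`, pass the list through `pin_requirement`, and write the joined result back before running pip install again.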

BelowzeroA commented 3 years ago

Following your advice, I installed transformers==2.9.1, but then the next issue popped up:

from transformers import RobertaForSequenceClassification, RobertaTokenizer
ImportError: cannot import name 'RobertaForSequenceClassification' from 'transformers'

crazoter commented 3 years ago

@BelowzeroA Did you manage to resolve this? The fix worked for me on Google Colab, but I hit your issue when I tried the same fix in a fresh Anaconda environment on a Windows 10 machine.

Edit: scratch that, I had a typo in the component name. Fixing that resolved my problem.

Yorko commented 2 years ago

Made it work with Python 3.8, transformers 2.9.1, and tokenizers 0.7.0.

FYI I'm using poetry and had to run the following:

My pyproject.toml file is the following:

[tool.poetry]
name = "gpt-2-output-dataset"
version = "0.1.0"
description = "GPT-2 output detector"
authors = ["Your Name <you@example.com>"]

[tool.poetry.dependencies]
python = "^3.8.0"
transformers = "2.9.1"
fire = "^0.2.1"
requests = "^2.22.0"
tqdm = "^4.32.2"
torch = "^1.2.0"
tokenizers = "^0.7.0"
tensorboard = "^1.14.0"

[tool.poetry.dev-dependencies]

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

ilovefreesw commented 1 year ago

Facing the same thing today.

casheo commented 1 year ago

Same thing today, with transformers 2.9.1.

NeyokiCat commented 1 year ago

Python 3.10.10; transformers==4.26.1; tokenizers==0.13.2. Same here.

niranjanakella commented 1 year ago

I am also facing the same issue. Could someone please address this?

AshfakYeafi commented 7 months ago

transformers==4.24.0 and tokenizers==0.13.2 solves the issue.
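
Since the reports in this thread disagree about which version combination works, it helps to state exactly what is installed when confirming or reporting the error. The following is a small sketch using only the standard library (Python 3.8+); the helper name `installed_versions` is hypothetical, and the package list is simply the set of packages mentioned in this thread.

```python
import sys
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


# Print the interpreter version and the packages discussed in this thread.
print("python", sys.version.split()[0])
for name, version in installed_versions(["transformers", "tokenizers", "torch"]).items():
    print(name, version or "not installed")
```

Running this inside the same environment that starts the detector shows whether the pin actually took effect there, which is easy to get wrong when several conda or virtualenv environments exist.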