OpenBMB / ModelCenter

Efficient, Low-Resource, Distributed transformer implementation based on BMTrain
https://modelcenter.readthedocs.io
Apache License 2.0
233 stars, 28 forks

[BUG] Following Quick Start and in step 3 "Prepare the dataset" encountering "KeyError: 'label'" #44

Open jiangzizi opened 6 months ago

jiangzizi commented 6 months ago

I followed the Quick Start, and in step 3, when I copied the code into Google Colab and tried to run it, I encountered "KeyError: 'label'". I found that there is no 'label' key in the BoolQ dataset. I tried editing the package file directly, but that did not work.


KeyError                                  Traceback (most recent call last)
in <cell line: 9>()
      8
      9 for split in splits:
---> 10     dataset[split] = DATASET['BoolQ']('/content/drive/MyDrive', split, bmt.rank(), bmt.world_size(), tokenizer, max_encoder_length=512)
     11
     12 batch_size = 64

/usr/local/lib/python3.10/dist-packages/model_center/dataset/bertdataset/superglue.py in __init__(self, path, split, rank, world_size, tokenizer, max_encoder_length)
     90         from tqdm import tqdm
     91         for row in self.read_data("BoolQ", path, split, rank, world_size):
---> 92             label = 1 if row["label"]==True else 0
     93             text_a = row['passage']
     94             text_b = row['question']

KeyError: 'label'
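
The failing line indexes row["label"] unconditionally, so any row without that key raises the error above. As a rough sketch of a more tolerant lookup (not the ModelCenter implementation), the snippet below uses dict.get so a missing label becomes None. The helper name row_to_example and the two sample rows are hypothetical and only illustrate the idea. Which file or split actually lacks the key is not established here.

```python
import json


def row_to_example(row):
    """Turn one BoolQ-style row into (label, text_a, text_b).

    Assumption: a row without a 'label' key yields label=None
    instead of raising KeyError.
    """
    raw = row.get("label")  # None when the key is absent
    label = None if raw is None else (1 if raw == True else 0)
    return label, row["passage"], row["question"]


# Hypothetical rows, made up for illustration only.
labeled = json.loads('{"question": "q", "passage": "p", "label": true}')
unlabeled = json.loads('{"question": "q", "passage": "p"}')

print(sorted(unlabeled.keys()))  # shows which keys a row really has
print(row_to_example(labeled))
print(row_to_example(unlabeled))
```

Printing the keys of a few rows from the downloaded files is a quick way to confirm whether 'label' is really missing, or whether the files are in a different format than the loader expects.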