Closed by En-J-A 4 years ago
When doing multilabel classification, you need to one-hot encode the labels in the label processor (https://github.com/kaushaltrivedi/fast-bert/blob/master/fast_bert/data_cls.py#L306) and follow through on the other changes related to num_labels.
My suggestion is to use the Fast-BERT demo notebook; you can change the metrics passed to BertLearner to whatever you need
(e.g. metrics = [{'name': 'accuracy', 'function': accuracy}, {'name': 'f1', 'function': F1}]).
If you are running on Colab, make sure the runtime is set to GPU.
If you are running on your own computer without a GPU, replace torch.device("cuda") with torch.device("cpu").
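Rather than editing the device by hand for each environment, the choice can be made at runtime. This is a minimal sketch, assuming PyTorch is installed; the variable name `device` is only illustrative and may not match the notebook.

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
```

The resulting `device` can then be passed wherever the notebook currently passes torch.device("cuda").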
I tried to run the Fast-BERT demo notebook, but I get an error at this step:
learner.fit(epochs=5,
lr=2e-5,
validate=True, # Evaluate the model after each epoch
schedule_type="warmup_linear",
optimizer_type="adamw")
and the error is
[2020-08-30 22:03:42,452 - INFO]: ***** Running training *****
[2020-08-30 22:03:42,454 - INFO]: Num examples = 13815
[2020-08-30 22:03:42,455 - INFO]: Num Epochs = 5
[2020-08-30 22:03:42,455 - INFO]: Total train batch size (w. parallel, distributed & accumulation) = 16
[2020-08-30 22:03:42,457 - INFO]: Gradient Accumulation steps = 1
[2020-08-30 22:03:42,459 - INFO]: Total optimization steps = 4320
0.00% [0/5 00:00<00:00]
100.00% [864/864 05:55<00:00]
[2020-08-30 22:09:38,025 - INFO]: Running evaluation
[2020-08-30 22:09:38,026 - INFO]: Num examples = 1936
[2020-08-30 22:09:38,028 - INFO]: Batch size = 32
100.00% [61/61 00:15<00:00]
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-17-e58633857728> in <module>()
3 validate=True, # Evaluate the model after each epoch
4 schedule_type="warmup_linear",
----> 5 optimizer_type="adamw")
3 frames
/usr/local/lib/python3.6/dist-packages/fast_bert/metrics.py in fbeta(y_pred, y_true, thresh, beta, eps, sigmoid)
56 y_pred = (y_pred > thresh).float()
57 y_true = y_true.float()
---> 58 TP = (y_pred * y_true).sum(dim=1)
59 prec = TP / (y_pred.sum(dim=1) + eps)
60 rec = TP / (y_true.sum(dim=1) + eps)
RuntimeError: The size of tensor a (3) must match the size of tensor b (1936) at non-singleton dimension 1
I worked in COLAB, and the runtime was set to GPU.
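The sizes in the RuntimeError (3 and 1936) match the number of classes and the number of validation examples, which is what one would expect if the predictions have shape (1936, 3) while the labels are a flat vector of 1936 class ids. The sketch below reproduces that shape mismatch with NumPy, which follows the same broadcasting rules as PyTorch; the arrays are dummy data, not the real model output.

```python
import numpy as np

num_examples, num_classes = 1936, 3

# Model output: one score per class for every example.
y_pred = np.zeros((num_examples, num_classes))

# Labels stored as a single class-id column (not one-hot encoded).
y_true_ids = np.zeros(num_examples)

try:
    y_pred * y_true_ids  # shapes (1936, 3) and (1936,) cannot be broadcast
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print("Shape mismatch:", err)

# One-hot labels have the same (examples, classes) shape as the predictions.
y_true_onehot = np.eye(num_classes)[y_true_ids.astype(int)]
true_positives = (y_pred * y_true_onehot).sum(axis=1)
print(true_positives.shape)
```

With one-hot labels the element-wise product in fbeta lines up row by row, which is why the multilabel metrics expect that encoding.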
Are your labels one-hot encoded?
This is an example from a script that I used for multilabel classification to build the dataframe before saving it as CSV:
sentences_filtered = pd.concat([sentences_filtered['text'],pd.get_dummies(sentences_filtered['label'])],axis=1)
The sentences_filtered['label'] column originally had the labels as text.
Then save it to csv:
sentences_filtered.to_csv("data/train.csv",index=True,columns=sentences_filtered.columns,sep=',',header=True)
And don't forget to set multi_label to True in both the DataBunch and Learner classes.
If you are using AJGT, there is no need for one-hot encoding, since it's a binary classification dataset. Just run the notebook as it is, and it should run without any issue (I just tried it).
Now, if you want to use your own multilabel dataset, then you should one-hot encode your labels.
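To make the one-hot step concrete, here is a small sketch on toy data. The rows and label names are hypothetical; only the 'text' and 'label' column names come from the snippets in this thread. Passing dtype=int is an addition: recent pandas versions return True/False columns from get_dummies by default, which would end up in the CSV as text rather than 0/1.

```python
import pandas as pd

# Hypothetical toy data; the real dataset has a 'text' and a 'label' column.
df = pd.DataFrame({
    "text": ["good", "bad", "okay"],
    "label": ["positive", "negative", "neutral"],
})

# dtype=int forces 0/1 columns; newer pandas versions default to True/False.
one_hot = pd.get_dummies(df["label"], dtype=int)
encoded = pd.concat([df["text"], one_hot], axis=1)

print(encoded)
label_columns = list(one_hot.columns)
print(label_columns)
```

The names in `label_columns` are the ones that would go into labels.csv and into label_col.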
Ok. I will try to use one-hot encoding (I have 3 classes, not 2).
train_AJGT = pd.DataFrame(train_AJGT)
train_AJGT['label'] = train_AJGT['label'].apply(str)
test_AJGT = pd.DataFrame(test_AJGT)
test_AJGT['label'] = test_AJGT['label'].apply(str)
train_AJGT = pd.concat([train_AJGT['text'],pd.get_dummies(train_AJGT['label'])],axis=1)
test_AJGT = pd.concat([test_AJGT['text'],pd.get_dummies(test_AJGT['label'])],axis=1)
!mkdir data
train_AJGT.to_csv("data/train.csv",index=True,columns=train_AJGT.columns,sep=',',header=True)
test_AJGT.to_csv("data/dev.csv",index=True,columns=test_AJGT.columns,sep=',',header=True)
with open('data/labels.csv','w') as f:
    f.write("0\n1\n2")
After this step, the data looks like this:
   text  0  1  2
-----------------
0  XXX   0  1  0
1  XXX   1  0  0
So now I have 3 columns (0, 1, 2) instead of one (label). When I call BertDataBunch, I get an error. I think the reason is that there is no longer a column called label for label_col='label', but what should I put instead?
Yes, label_col should be a list of the label column names: [0,1,2]
I am grateful for your help, @WissamAntoun. I made all the changes:
databunch = BertDataBunch(
'./data/',
'./data/',
tokenizer=tokenizer,
train_file='train.csv',
val_file='dev.csv',
label_file='labels.csv',
text_col='text',
label_col=[0,1,2],
batch_size_per_gpu=16,
max_seq_length=512, #256
multi_gpu=True,
multi_label=True,
model_type='bert',)
I created labels.csv as
with open('data/labels.csv','w') as f:
    f.write("0\n1\n2")
and I ran into a problem:
ValueError Traceback (most recent call last)
<ipython-input-22-9e1bdac0802c> in <module>()
18 multi_gpu=True,
19 multi_label=True,
---> 20 model_type='bert',)
2 frames
/usr/local/lib/python3.6/dist-packages/fast_bert/data_cls.py in convert_examples_to_features(examples, label_list, max_seq_length, tokenizer, output_mode, cls_token_at_end, pad_on_left, cls_token, sep_token, pad_token, sequence_a_segment_id, sequence_b_segment_id, cls_token_segment_id, pad_token_segment_id, mask_padding_with_zero, logger)
180 label_id = []
181 for label in example.label:
--> 182 label_id.append(float(label))
183 else:
184 if example.label is not None:
ValueError: could not convert string to float: 'إن +ها ل+ ال+ قلب مصدر سعاد +ة'
إن +ها ل+ ال+ قلب مصدر سعاد +ة is the first row of the text column in the training dataset.
Previously, I converted the label column from int to str to apply one-hot encoding. In the file
https://github.com/kaushaltrivedi/fast-bert/blob/c91c72327a4150c25645802ffe9175e64cc61fca/fast_bert/data_cls.py#L58
I found the note: cls_token_segment_id define the segment id associated to the CLS token (0 for BERT, 2 for XLNet)
and cls_token_segment_id=1.
Is that correct, and is it related to the error?
Can you try label_col=['0','1','2']? I think the function is accessing the first column instead of the column named '0'.
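One way to check which names to pass as label_col is to read the CSV back and look at its header. The sketch below uses a hypothetical two-row frame and an in-memory buffer instead of data/train.csv; it only shows that column names come back from CSV as strings, and makes no claim about how fast-bert indexes the columns internally.

```python
import io
import pandas as pd

# Hypothetical one-hot frame; note the integer column names 0, 1, 2.
frame = pd.DataFrame({"text": ["a", "b"], 0: [0, 1], 1: [1, 0], 2: [0, 0]})

# Round-trip through CSV, as the training file does.
buffer = io.StringIO()
frame.to_csv(buffer, index=False)
buffer.seek(0)
loaded = pd.read_csv(buffer)

columns = list(loaded.columns)
print(columns)
```

Because the header is parsed as text, the integer 0 is not one of the column names after loading, while the string '0' is.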
Yes, it works now. Thank you very much ^_^.
Great. You can close the issue if you want.
Hello. When I tried to execute the code again, at this step
learner.fit(epochs=10,
lr=2e-5,
validate=True, # Evaluate the model after each epoch
schedule_type="warmup_linear",
optimizer_type="adamw")
I got this error
TypeError Traceback (most recent call last)
<ipython-input-18-78eff0b78623> in <module>()
3 validate=True, # Evaluate the model after each epoch
4 schedule_type="warmup_linear",
----> 5 optimizer_type="adamw")
7 frames
/usr/local/lib/python3.6/dist-packages/torch/tensor.py in __array__(self, dtype)
478 def __array__(self, dtype=None):
479 if dtype is None:
--> 480 return self.numpy()
481 else:
482 return self.numpy().astype(dtype, copy=False)
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
I have no idea what the cause is. If you do, please let me know.
Can you copy the whole stack trace of the error?
[2020-09-02 06:51:34,975 - INFO]: ***** Running training *****
[2020-09-02 06:51:34,978 - INFO]: Num examples = 13815
[2020-09-02 06:51:34,979 - INFO]: Num Epochs = 10
[2020-09-02 06:51:34,981 - INFO]: Total train batch size (w. parallel, distributed & accumulation) = 16
[2020-09-02 06:51:34,984 - INFO]: Gradient Accumulation steps = 1
[2020-09-02 06:51:34,984 - INFO]: Total optimization steps = 8640
0.00% [0/10 00:00<00:00]
100.00% [864/864 05:40<00:00]
[2020-09-02 06:57:15,245 - INFO]: Running evaluation
[2020-09-02 06:57:15,249 - INFO]: Num examples = 1936
[2020-09-02 06:57:15,250 - INFO]: Batch size = 32
100.00% [61/61 00:15<00:00]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-18-78eff0b78623> in <module>()
3 validate=True, # Evaluate the model after each epoch
4 schedule_type="warmup_linear",
----> 5 optimizer_type="adamw")
7 frames
/usr/local/lib/python3.6/dist-packages/torch/tensor.py in __array__(self, dtype)
478 def __array__(self, dtype=None):
479 if dtype is None:
--> 480 return self.numpy()
481 else:
482 return self.numpy().astype(dtype, copy=False)
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
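As a side note on reading these logs, the step counts are consistent with simple arithmetic: the number of batches per epoch rounded up, multiplied by the number of epochs. The sketch below only checks the figures printed in the two logs in this thread; it is not taken from the fast-bert source.

```python
import math

train_examples, train_batch_size = 13815, 16
eval_examples, eval_batch_size = 1936, 32

# A partial final batch still counts as one step, hence the ceiling.
steps_per_epoch = math.ceil(train_examples / train_batch_size)
total_steps_5_epochs = steps_per_epoch * 5
total_steps_10_epochs = steps_per_epoch * 10
eval_batches = math.ceil(eval_examples / eval_batch_size)

print(steps_per_epoch, total_steps_5_epochs, total_steps_10_epochs, eval_batches)
```

These values line up with the 864 steps per epoch, the 4320 and 8640 total optimization steps, and the 61 evaluation batches shown in the progress output.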
I mean expand the error to get the full error stack trace
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-18-78eff0b78623> in <module>()
3 validate=True, # Evaluate the model after each epoch
4 schedule_type="warmup_linear",
----> 5 optimizer_type="adamw")
7 frames
/usr/local/lib/python3.6/dist-packages/fast_bert/learner_cls.py in fit(self, epochs, lr, validate, return_results, schedule_type, optimizer_type)
421 # Evaluate the model against validation set after every epoch
422 if validate:
--> 423 results = self.validate()
424 for key, value in results.items():
425 self.logger.info(
/usr/local/lib/python3.6/dist-packages/fast_bert/learner_cls.py in validate(self, quiet, loss_only)
515 for metric in self.metrics:
516 validation_scores[metric["name"]] = metric["function"](
--> 517 all_logits, all_labels
518 )
519 results.update(validation_scores)
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in accuracy_score(y_true, y_pred, normalize, sample_weight)
183
184 # Compute accuracy for each possible representation
--> 185 y_type, y_true, y_pred = _check_targets(y_true, y_pred)
186 check_consistent_length(y_true, y_pred, sample_weight)
187 if y_type.startswith('multilabel'):
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_classification.py in _check_targets(y_true, y_pred)
79 """
80 check_consistent_length(y_true, y_pred)
---> 81 type_true = type_of_target(y_true)
82 type_pred = type_of_target(y_pred)
83
/usr/local/lib/python3.6/dist-packages/sklearn/utils/multiclass.py in type_of_target(y)
245 raise ValueError("y cannot be class 'SparseSeries' or 'SparseArray'")
246
--> 247 if is_multilabel(y):
248 return 'multilabel-indicator'
249
/usr/local/lib/python3.6/dist-packages/sklearn/utils/multiclass.py in is_multilabel(y)
136 """
137 if hasattr(y, '__array__') or isinstance(y, Sequence):
--> 138 y = np.asarray(y)
139 if not (hasattr(y, "shape") and y.ndim == 2 and y.shape[1] > 1):
140 return False
/usr/local/lib/python3.6/dist-packages/numpy/core/_asarray.py in asarray(a, dtype, order)
83
84 """
---> 85 return array(a, dtype, copy=False, order=order)
86
87
/usr/local/lib/python3.6/dist-packages/torch/tensor.py in __array__(self, dtype)
478 def __array__(self, dtype=None):
479 if dtype is None:
--> 480 return self.numpy()
481 else:
482 return self.numpy().astype(dtype, copy=False)
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
--
I think I found the error. I had added accuracy_score to the metrics, and that was the cause. Now that I have removed it, the code runs well.
metrics = [{'name': 'accuracy_m', 'function': accuracy_multilabel},
{'name': 'accuracy_th', 'function': accuracy_thresh },
# {'name': 'accuracy_sc', 'function': accuracy_score }, --> here is the reason
{'name': 'F1', 'function': F1}
]
thank you @WissamAntoun
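The full traceback shows validate() calling metric["function"](all_logits, all_labels), and sklearn's accuracy_score then failing inside np.asarray because the tensors still live on the GPU. If an exact-match accuracy is still wanted, one option is a small wrapper that copies the tensors to host memory and thresholds the logits first. This is a sketch under assumptions: `to_numpy` and `exact_match_accuracy` are hypothetical names, and the sigmoid with a 0.5 threshold assumes multilabel output.

```python
import numpy as np

def to_numpy(values):
    """Return a NumPy array, copying a torch tensor to host memory if needed."""
    if hasattr(values, "detach"):  # duck-typed check for a torch.Tensor
        values = values.detach().cpu().numpy()
    return np.asarray(values)

def exact_match_accuracy(y_pred, y_true, thresh=0.5):
    """Fraction of examples whose thresholded predictions match every label.

    Argument order (predictions first) follows the call in the traceback.
    """
    logits = to_numpy(y_pred)
    labels = to_numpy(y_true)
    probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid, assumed multilabel output
    preds = (probs > thresh).astype(int)
    return float((preds == labels).all(axis=1).mean())

# Dummy logits and one-hot labels: the first row matches, the second does not.
logits = np.array([[2.0, -1.0, -3.0], [-2.0, 0.5, -1.0]])
labels = np.array([[1, 0, 0], [0, 0, 1]])
score = exact_match_accuracy(logits, labels)
print(score)
```

Such a function could be added to the metrics list in the same dictionary form as the other entries, for example {'name': 'accuracy_exact', 'function': exact_match_accuracy}.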
I want to apply AraBERT to sentiment analysis with 3 classes (positive, negative and neutral). I want to check whether I can use the AraBERT_PyTorch_Demo.ipynb file with some changes:
In the compute_metrics function I updated these values:
Can I use the class
class BinaryProcessor(DataProcessor)
after updating its functions and renaming it to
class MnliProcessor(DataProcessor)
I then updated num_labels in
config = config_class.from_pretrained(args['model_name'], num_labels=3, finetuning_task=args['task_name'])
In the model parameters I found 'task_name': 'binary'. Should I replace it with another value?
When I tried to train the model at this step
After making all of these changes I still get the following error
What can I do to fix this error?