forrestdavis / NLPScholar

Tools for training an NLP Scholar
GNU General Public License v3.0

Issue with running train on TextClassification #3

Closed: Eliheakins closed this issue 2 weeks ago

Eliheakins commented 2 weeks ago

Describe the bug

File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/torch/nn/functional.py", line 3104, in cross_entropy return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing) ValueError: Expected input batch_size (256) to match target batch_size (15). 0%| Getting this bug when running train on exp. TextClassification specifically with gp2 as model

When using hf_text_classification_model with bert-base-uncased, the same data and config file do not produce this bug.
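For what it's worth, this looks like per-sequence classification labels being handed to GPT-2's language modeling head, which shifts and flattens token-level logits against flattened labels before cross-entropy. Here is a minimal sketch that reproduces the same ValueError outside the toolkit; this is my own diagnosis using plain transformers calls, not NLPScholar code:

```python
# Hypothetical repro sketch: per-sequence classification labels fed to GPT-2's
# *language modeling* head. GPT2LMHeadModel flattens logits to shape
# (batch * (seq_len - 1), vocab) and flattens the labels likewise, so a
# (batch,)-shaped label tensor no longer lines up and cross_entropy raises.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")  # LM head, not a classifier

batch = tokenizer(["a short sentence"] * 4, return_tensors="pt")
class_labels = torch.tensor([0, 1, 0, 1])  # one label per sequence

# Raises: ValueError: Expected input batch_size (...) to match target batch_size (...)
model(**batch, labels=class_labels)
```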

To Reproduce

Run train using exp. TextClassification with our dataset and our config file (both in this Google Drive folder): https://drive.google.com/drive/folders/1F7PjgBe2j9UkwLto3VzUdMpLLxIU02gu?usp=sharing

Expected behavior

Training should run to completion and produce the trained model.

Observed behavior

Throws the following error:

Traceback (most recent call last):
  File "/home/eheakins/NLPScholar/main.py", line 30, in <module>
    exp.train()
  File "/home/eheakins/NLPScholar/src/trainers/HFTextClassificationTrainer.py", line 118, in train
    trainer.train()
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/transformers/trainer.py", line 1948, in train
    return inner_training_loop(
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/transformers/trainer.py", line 2289, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/transformers/trainer.py", line 3328, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/transformers/trainer.py", line 3373, in compute_loss
    outputs = model(**inputs)
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/transformers/models/gpt2/modeling_gpt2.py", line 1348, in forward
    loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/torch/nn/modules/loss.py", line 1188, in forward
    return F.cross_entropy(input, target, weight=self.weight,
  File "/home/eheakins/.conda/envs/nlp/lib/python3.10/site-packages/torch/nn/functional.py", line 3104, in cross_entropy
    return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
ValueError: Expected input batch_size (256) to match target batch_size (15).

Screenshots

N/A


forrestdavis commented 2 weeks ago

Thank you for opening this and for providing all the necessary details. Your config for gpt2 should also use hf_text_classification_model as the model type. I would guess you switched away from it because an error popped up when you tried to train; that was a bug in the code. Please pull the newest version of the toolkit and your problem should be resolved. (Please follow up if this is not the case.)
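For anyone who lands here later, a sketch of what the sequence classification path looks like with GPT-2 in plain transformers. This is illustrative only; num_labels and the pad-token handling are standard transformers details, not NLPScholar's config keys:

```python
# Illustrative sketch of GPT-2 with a *sequence classification* head: the model
# emits one logit vector per sequence, so per-example labels line up with the loss.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token

model = AutoModelForSequenceClassification.from_pretrained("gpt2", num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id  # used to find each sequence's last token

batch = tokenizer(["a short sentence", "another one"],
                  padding=True, return_tensors="pt")
labels = torch.tensor([0, 1])

out = model(**batch, labels=labels)  # per-sequence loss, no shape mismatch
print(out.loss)
```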