Lightning-AI / pytorch-lightning

Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
https://lightning.ai
Apache License 2.0

TypeError: accuracy() missing 1 required positional argument: 'task' #18256

Closed: andysingal closed this issue 1 year ago

andysingal commented 1 year ago

Bug description

Old code does not work anymore: the torchmetrics `accuracy()` call now raises a TypeError about a missing `task` argument.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
[<ipython-input-24-0de98dbbb444>](https://localhost:8080/#) in <cell line: 4>()
      2 
      3 trainer = pl.Trainer(fast_dev_run=True, devices=1, accelerator="gpu")
----> 4 trainer.fit(model)

25 frames
[<ipython-input-23-ce55fa773c61>](https://localhost:8080/#) in training_step(self, batch, batch_idx)
     77       outputs = self(encode_id, mask)
     78       preds = torch.argmax(outputs, dim=1)
---> 79       train_accuracy = accuracy(preds, targets)
     80       loss = self.loss(outputs, targets)
     81       self.log('train_accuracy', train_accuracy, prog_bar=True, on_step=False, on_epoch=True)

TypeError: accuracy() missing 1 required positional argument: 'task'

What version are you seeing the problem on?

v1.6, v1.9

How to reproduce the bug

"""
IMPORTANT NOTE
Any input text data that is less than the max_seq_len value will be padded,
and anything bigger will be trimmed down.
"""
class HealthClaimClassifier(pl.LightningModule):

    def __init__(self, max_seq_len=512, batch_size=128, learning_rate = 0.001):
        super().__init__()
        self.learning_rate = learning_rate
        self.max_seq_len = max_seq_len
        self.batch_size = batch_size
        self.loss = nn.CrossEntropyLoss()

        self.pretrain_model  = AutoModel.from_pretrained('bert-base-uncased')
        self.pretrain_model.eval()
        for param in self.pretrain_model.parameters():
            param.requires_grad = False

        self.new_layers = nn.Sequential(
            nn.Linear(768, 512),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(512,4),
            nn.LogSoftmax(dim=1)
        )

    def prepare_data(self):
      tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')

      tokens_train = tokenizer.batch_encode_plus(
          pub_health_train["main_text"].tolist(),
          max_length = self.max_seq_len,
          padding="max_length",
          truncation=True,
          return_token_type_ids=False
      )

      tokens_test = tokenizer.batch_encode_plus(
          pub_health_test["main_text"].tolist(),
          max_length = self.max_seq_len,
          padding="max_length",
          truncation=True,
          return_token_type_ids=False
      )

      '''
      Now we need to create features and extract the target variable from the dataset.
      '''
      self.train_seq = torch.tensor(tokens_train['input_ids'])
      self.train_mask = torch.tensor(tokens_train['attention_mask'])
      self.train_y = torch.tensor(pub_health_train["label"].tolist())

      self.test_seq = torch.tensor(tokens_test['input_ids'])
      self.test_mask = torch.tensor(tokens_test['attention_mask'])
      self.test_y = torch.tensor(pub_health_test["label"].tolist())

    def forward(self, encode_id, mask):
        _, output = self.pretrain_model(encode_id, attention_mask=mask, return_dict=False)
        output = self.new_layers(output)
        return output

    def train_dataloader(self):
      train_dataset = TensorDataset(self.train_seq, self.train_mask, self.train_y)
      self.train_dataloader_obj = DataLoader(train_dataset, batch_size=self.batch_size)
      return self.train_dataloader_obj

    def test_dataloader(self):
      test_dataset = TensorDataset(self.test_seq, self.test_mask, self.test_y)
      self.test_dataloader_obj = DataLoader(test_dataset, batch_size=self.batch_size)
      return self.test_dataloader_obj

    def training_step(self, batch, batch_idx):
      encode_id, mask, targets = batch
      outputs = self(encode_id, mask) 
      preds = torch.argmax(outputs, dim=1)
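      # NOTE: this call fails on recent torchmetrics (>= 0.11), which requires a task argument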
      train_accuracy = accuracy(preds, targets)
      loss = self.loss(outputs, targets)
      self.log('train_accuracy', train_accuracy, prog_bar=True, on_step=False, on_epoch=True)
      self.log('train_loss', loss, on_step=False, on_epoch=True)
      return {"loss":loss, 'train_accuracy': train_accuracy}

    def test_step(self, batch, batch_idx):
      encode_id, mask, targets = batch
      outputs = self.forward(encode_id, mask)
      preds = torch.argmax(outputs, dim=1)
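      # NOTE (assumption): with task="multiclass", recent torchmetrics also requires num_classes (4 here)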
      test_accuracy = accuracy(preds, targets, task="multiclass")
      loss = self.loss(outputs, targets)
      return {"test_loss":loss, "test_accuracy":test_accuracy}

    def test_epoch_end(self, outputs):
      test_outs = []
      for test_out in outputs:
          out = test_out['test_accuracy']
          test_outs.append(out)
      total_test_accuracy = torch.stack(test_outs).mean()
      self.log('total_test_accuracy', total_test_accuracy, on_step=False, on_epoch=True)
      return total_test_accuracy

    def configure_optimizers(self):
      params = self.parameters()
      optimizer = optim.Adam(params=params, lr = self.learning_rate)
      return optimizer
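
For reference, the failure reproduces without the full model. A minimal sketch, assuming torchmetrics >= 0.11 (where the functional accuracy gained the required task argument):

import torch
from torchmetrics.functional import accuracy

preds = torch.tensor([0, 1, 2, 3])
target = torch.tensor([0, 1, 2, 2])

# Old-style call without a task argument:
accuracy(preds, target)
# raises: TypeError: accuracy() missing 1 required positional argument: 'task'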

### Error messages and logs

The full traceback is shown in the bug description above.


### Environment

<details>
  <summary>Current environment</summary>

- Lightning Component (e.g. Trainer, LightningModule, LightningApp, LightningWork, LightningFlow):

- PyTorch Lightning Version (e.g., 1.5.0):

- Lightning App Version (e.g., 0.5.2):

- PyTorch Version (e.g., 2.0):

- Python version (e.g., 3.9):

- OS (e.g., Linux):

- CUDA/cuDNN version:

- GPU models and configuration:

- How you installed Lightning(conda, pip, source):

- Running environment of LightningApp (e.g. local, cloud):



</details>

### More info

Same as above.
ishandutta0098 commented 1 year ago

@andysingal The `task` parameter was introduced in newer versions of torchmetrics and must be passed to the metric functions. For example, if it is a binary classification task, pass `task="binary"`. You can check an example in my Colab notebook - link
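
For the 4-class snippet above, a fixed call might look like the sketch below; note that with task="multiclass", recent torchmetrics also requires num_classes:

import torch
from torchmetrics.functional import accuracy

preds = torch.tensor([0, 1, 3, 2])
targets = torch.tensor([0, 1, 2, 2])

# Pass the task explicitly; multiclass additionally needs num_classes
train_accuracy = accuracy(preds, targets, task="multiclass", num_classes=4)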

awaelchli commented 1 year ago

Thanks @ishandutta0098 for answering this. Yes, I can confirm the `task` argument is new in torchmetrics and required. If you don't want to pass `task`, you can also use the `binary_accuracy` or `multiclass_accuracy` functions directly.
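
For example, a sketch using the task-specific functional metric directly:

import torch
from torchmetrics.functional.classification import multiclass_accuracy

preds = torch.tensor([0, 1, 3, 2])
target = torch.tensor([0, 1, 2, 2])

# No task argument needed; num_classes is still required for multiclass
acc = multiclass_accuracy(preds, target, num_classes=4)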

Let us know if that works for you @andysingal 😃

andysingal commented 1 year ago

Thank you @awaelchli @ishandutta0098 , still new to pytorch lightning and it is great learning experience for me.

awaelchli commented 1 year ago

Glad to hear that @andysingal, thanks!

If you ever have more questions about Lightning, we also have a Discord chat: https://discord.com/invite/XncpTy7DSt and a forum: https://lightning.ai/forums/

andysingal commented 1 year ago

Thanks Adrian, I joined the discord channel. Again, thanks for all your help.
