widyaputeriaulia10 closed this issue 1 year ago
Hi @widyaputeriaulia10 , thank you for using IndoNLU.
I can't comment much on the problem, since neither the data nor the exact loading script is provided.
One possible cause of this error is that your CSV file has a header row.
In our case, we use df = pd.read_csv(path, sep='\t', header=None)
since there is no column information in the data.
If your CSV contains a header row, you can omit header=None
when loading the dataset.
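The difference between the two loading modes can be seen in a minimal sketch. The in-memory TSV, its column names, and its rows are hypothetical stand-ins for the user's file, which is not shown in the thread; only the pd.read_csv call with sep='\t' and header=None comes from the reply above.

```python
import io

import pandas as pd

# Hypothetical two-column TSV (text, sentiment) that starts with a header row.
tsv = "text\tsentiment\nfilm ini bagus\t2\npelayanan buruk\t0\n"

# header=None treats the first line as data, so the column names end up in
# row 0 and the label column is read as strings.
df_no_header = pd.read_csv(io.StringIO(tsv), sep="\t", header=None)
print(df_no_header.iloc[0].tolist())

# Without header=None, pandas uses the first line as column names, so the
# label column contains only the numeric labels.
df_header = pd.read_csv(io.StringIO(tsv), sep="\t")
print(df_header["sentiment"].tolist())
```

With header=None, the string 'sentiment' becomes the label of the first example, which is the kind of value that int() cannot parse later in the pipeline.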
Let me know if the problem persists, and please also send the code snippet, error message, and view of the data, so that it will be easier for us to trace the problem.
Thank you and hope it helps!
Dear author, thank you for your response. I followed your instructions, but another error occurred. I have attached the code and the error message below.
n_epochs = 4
for epoch in range(n_epochs):
    model.train()
    torch.set_grad_enabled(True)

    total_train_loss = 0
    list_hyp_train, list_label = [], []

    train_pbar = tqdm(train_loader, leave=True, total=len(train_loader))
    for i, batch_data in enumerate(train_pbar):
        # Forward model
        loss, batch_hyp, batch_label = forward_sequence_classification(
            model, batch_data[:-1], i2w=i2w, device='cuda')

        # Update model
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        tr_loss = loss.item()
        total_train_loss = total_train_loss + tr_loss

        # Accumulate predictions and labels for the train metrics
        list_hyp_train += batch_hyp
        list_label += batch_label

        train_pbar.set_description("(Epoch {}) TRAIN LOSS:{:.4f} LR:{:.8f}".format(
            (epoch+1), total_train_loss/(i+1), get_lr(optimizer)))

    metrics = document_sentiment_metrics_fn(list_hyp_train, list_label)
    print("(Epoch {}) TRAIN LOSS:{:.4f} {} LR:{:.8f}".format(
        (epoch+1), total_train_loss/(i+1), metrics_to_string(metrics),
        get_lr(optimizer)))

    # Evaluate on the validation set
    model.eval()
    torch.set_grad_enabled(False)

    total_loss, total_correct, total_labels = 0, 0, 0
    list_hyp, list_label = [], []

    pbar = tqdm(valid_loader, leave=True, total=len(valid_loader))
    for i, batch_data in enumerate(pbar):
        batch_seq = batch_data[-1]
        loss, batch_hyp, batch_label = forward_sequence_classification(
            model, batch_data[:-1], i2w=i2w, device='cuda')

        # Compute total loss
        valid_loss = loss.item()
        total_loss = total_loss + valid_loss

        # Compute evaluation metrics
        list_hyp += batch_hyp
        list_label += batch_label
        metrics = document_sentiment_metrics_fn(list_hyp, list_label)

        pbar.set_description("VALID LOSS:{:.4f} {}".format(
            total_loss/(i+1), metrics_to_string(metrics)))

    metrics = document_sentiment_metrics_fn(list_hyp, list_label)
    print("(Epoch {}) VALID LOSS:{:.4f} {}".format(
        (epoch+1), total_loss/(i+1), metrics_to_string(metrics)))
This is the error:
0%| | 0/16 [00:01<?, ?it/s]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_23/3084095572.py in
And this is the data that I saved in .tsv format: [image: image.png]. I think the error occurred because of the accelerator I used (a Kaggle P100 GPU), but if you have another opinion or insight, it would be helpful. Thank you!
(In reply to Samuel Cahyawijaya's comment of Fri, 17 Mar 2023, 10:07.)
Hi Samuel, I trained the model on the CPU and it works. Thank you.
Expected Behavior
Dear Author,
I want to do multiclass classification by modifying DocumentSentimentDataset:
class DocumentSentimentDataset(Dataset):
    # Static constant variable
But when I started to train the model, I got an error like this:
ValueError: Caught ValueError in DataLoader worker process 12.
Original Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 61, in fetch
    return self.collate_fn(data)
  File "/kaggle/working/indonlu/utils/data_utils.py", line 550, in _collate_fn
    sentiment_batch[i,0] = sentiment
ValueError: invalid literal for int() with base 10: 'sentiment'
I have checked that the 'sentiment' column is of type int.
Do you have any advice on my problem?
Thank you in advance.
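The final line of the traceback reports that int() received the literal string 'sentiment', which suggests that a non-numeric value reached the label field. One way to look for such values is sketched below using only the Python standard library. The sample rows and the helper non_integer_labels are hypothetical and are not part of IndoNLU; the sketch assumes a two-column, tab-separated file with the label in the second column.

```python
import csv
import io

# Hypothetical file contents: a header row followed by two labelled examples.
tsv = "text\tsentiment\nfilm ini bagus\t2\npelayanan buruk\t0\n"
rows = list(csv.reader(io.StringIO(tsv), delimiter="\t"))


def non_integer_labels(rows):
    """Return (row_index, label) pairs whose label cannot be parsed by int()."""
    bad = []
    for index, (_, label) in enumerate(rows):
        try:
            int(label)
        except ValueError:
            bad.append((index, label))
    return bad


bad_rows = non_integer_labels(rows)
print(bad_rows)  # [(0, 'sentiment')]
```

If the only offending entry is at index 0 and equals the column name, the file most likely starts with a header row that was read as data.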