asahi417 / tner

Language model fine-tuning on NER with an easy interface and cross-domain evaluation. "T-NER: An All-Round Python Library for Transformer-based Named Entity Recognition, EACL 2021"
https://aclanthology.org/2021.eacl-demos.7/
MIT License
376 stars, 41 forks

Can't load NER dataset into tner: I tried using the ai4bharat/naamapadam dataset from Hugging Face, but it doesn't load with any of the tner methods. Any suggestions on loading this dataset? #51

Closed Ananthzeke closed 1 year ago

Ananthzeke commented 1 year ago

INFO:root:INITIALIZE GRID SEARCHER: 1 configs to try
INFO:root:## 1st RUN: Configuration 0/1 ##
INFO:root:hyperparameters
INFO:root: dataset: ai4bharat/naamapadam
INFO:root: dataset_split: train
INFO:root: dataset_name: ta
INFO:root: local_dataset: None
INFO:root: model: xlm-roberta-base
INFO:root: crf: True
INFO:root: max_length: 128
INFO:root: epoch: 10
INFO:root: batch_size: 32
INFO:root: lr: 1e-05
INFO:root: random_seed: 42
INFO:root: gradient_accumulation_steps: 2
INFO:root: weight_decay: None
INFO:root: lr_warmup_step_ratio: 0.1
INFO:root: * max_grad_norm: 10
WARNING:datasets.builder:Found cached dataset naamapadam (/root/.cache/huggingface/datasets/ai4bharat___naamapadam/ta/1.0.0/c1b045180d60b208d2468bdad897d04461f08c7137c04a85220697b1bef7df9a)


JSONDecodeError                           Traceback (most recent call last)
in
     16     max_grad_norm=[10]
     17 )
---> 18 searcher.train()

8 frames

/usr/lib/python3.9/json/decoder.py in raw_decode(self, s, idx)
    353             obj, end = self.scan_once(s, idx)
    354         except StopIteration as err:
--> 355             raise JSONDecodeError("Expecting value", s, err.value) from None
    356         return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
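
The log above shows `local_dataset: None`, so one possible workaround is to export the data to local files and pass those instead of the Hugging Face dataset name. The sketch below is a minimal, untested-against-tner illustration: it assumes tner's `local_dataset` option reads CoNLL-style IOB files (one `token tag` pair per line, blank line between sentences), that each dataset row has `tokens` and `ner_tags` fields, and that the tag ids map to the label list shown. The helper name `write_iob_file`, the label order, and the toy records are all hypothetical; check the dataset card for the real mapping.

```python
from pathlib import Path
import tempfile

# Hypothetical label order; verify against the dataset card before use.
ID2LABEL = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC"]


def write_iob_file(records, path, id2label=ID2LABEL):
    """Write records as one 'token tag' pair per line, blank line between sentences."""
    lines = []
    for record in records:
        # Assumes each record has parallel 'tokens' and 'ner_tags' lists.
        for token, tag_id in zip(record["tokens"], record["ner_tags"]):
            lines.append(f"{token} {id2label[tag_id]}")
        lines.append("")  # empty line marks the sentence boundary
    Path(path).write_text("\n".join(lines), encoding="utf-8")
    return path


# Toy records standing in for rows of the Hugging Face dataset.
sample = [
    {"tokens": ["Chennai", "is", "big"], "ner_tags": [5, 0, 0]},
    {"tokens": ["Anna", "University"], "ner_tags": [3, 4]},
]
out_path = Path(tempfile.mkdtemp()) / "train.txt"
write_iob_file(sample, out_path)
iob_text = out_path.read_text(encoding="utf-8")
```

If that assumption about the file format holds, the written paths could be supplied as `local_dataset={"train": ..., "valid": ...}` in place of `dataset=`. Whether this avoids the JSONDecodeError above has not been verified.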