Alibaba-NLP / ACE

[ACL-IJCNLP 2021] Automated Concatenation of Embeddings for Structured Prediction

Error when loading custom dataset #54

Closed: DanrunFR closed this issue 1 year ago

DanrunFR commented 1 year ago

Hello,

I tried to train a NER model on a custom dataset, and I followed your guide to add corpus info in the config file:

targets: ner
ner:
  # Corpus: WIKINER_FRENCH
  # tag_dictionary: resources/taggers/ner_tags.pkl
  Corpus: Wikiner_bioes-1
  Wikiner_bioes-1: 
    data_folder: /share/home/cao/1_benchmark_NER/ACE/bilstm_crf/bi-lstm-crf-ner-tf2.0/data/wikiner_ace/
    column_format:
      0: text
      1: pos
      2: ner
    tag_to_bioes: ner
  tag_dictionary: resources/taggers/ner_tags.pkl

But I got this error when running the training:

Traceback (most recent call last):
  File "train.py", line 83, in <module>
    config = ConfigParser(config,all=args.all,zero_shot=args.zeroshot,other_shot=args.other,predict=args.predict)
  File "/share/castor/home/cao/1_benchmark_NER/ACE/flair/config_parser.py", line 63, in __init__
    self.corpus: ListCorpus=self.get_corpus
  File "/share/castor/home/cao/1_benchmark_NER/ACE/flair/config_parser.py", line 329, in get_corpus
    current_dataset=getattr(datasets,corpus)(tag_to_bioes=self.target)
AttributeError: module 'flair.datasets' has no attribute 'Wikiner_bioes-1'

I don't understand what went wrong. Should I put the dataset somewhere else?

JiangYong2014 commented 1 year ago

Hi, I would encourage you to use the AdaSeq library, which is more straightforward and easier to use than ACE.

For example, you can try the BERT-CRF model. It lets you train a NER model with a single command.

DanrunFR commented 1 year ago

Actually, I'm working on a NER benchmark, and I was trying to run ACE on my dataset. Can I use ACE through the AdaSeq library?

JiangYong2014 commented 1 year ago

Oh, I see. AdaSeq does not support ACE yet. Hopefully it will in the future.

wangxinyu0922 commented 1 year ago

Hi, you are using the wrong corpus name. Try setting:

Corpus: ColumnCorpus-Wikiner_bioes
ColumnCorpus-Wikiner_bioes:
...

See https://github.com/Alibaba-NLP/ACE#train-on-your-own-dataset for more details.
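
For anyone hitting the same AttributeError, here is a rough sketch of a pre-flight check you could run on the config before training. It is not ACE code: `check_corpus_name` is a hypothetical helper, `fake_datasets` is a stand-in for the `flair.datasets` module, and `data/` is a placeholder path. It also assumes that `Corpus` holds a single name and that the `ColumnCorpus-` prefix is what marks a custom column-format dataset, which is what the fix above suggests.

```python
from types import SimpleNamespace

# Prefix taken from the reply above; treating it as the marker for a
# user-defined column-format corpus is an assumption of this sketch.
CUSTOM_PREFIX = "ColumnCorpus-"


def check_corpus_name(name, datasets, task_config):
    """Return None if the name looks usable, otherwise a short problem description.

    Hypothetical helper, not part of ACE or flair.
    """
    if name.startswith(CUSTOM_PREFIX):
        # A custom corpus needs its own settings block under the same key.
        if name not in task_config:
            return f"no settings block named {name!r}"
        return None
    # Any other name is looked up as an attribute of the datasets module,
    # mirroring getattr(datasets, corpus) from the traceback.
    if not hasattr(datasets, name):
        return f"{name!r} is not a built-in corpus; try {CUSTOM_PREFIX + name!r}"
    return None


# Stand-in for the flair.datasets module so the sketch runs without flair.
fake_datasets = SimpleNamespace(WIKINER_FRENCH=object)

# Config from the original question (fails) and from the fix (passes).
bad = {"Corpus": "Wikiner_bioes-1",
       "Wikiner_bioes-1": {"data_folder": "data/"}}
good = {"Corpus": "ColumnCorpus-Wikiner_bioes",
        "ColumnCorpus-Wikiner_bioes": {"data_folder": "data/"}}

bad_result = check_corpus_name(bad["Corpus"], fake_datasets, bad)
good_result = check_corpus_name(good["Corpus"], fake_datasets, good)
print(bad_result)
print(good_result)
```

The point is only that a name without the prefix falls through to the attribute lookup, which is where `Wikiner_bioes-1` failed in the traceback.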

DanrunFR commented 1 year ago

That worked! Thank you!