thunlp / PL-Marker

Source code for "Packed Levitated Marker for Entity and Relation Extraction"
MIT License
260 stars, 35 forks

What's the difference between "transformers" in a project and executing "pip install transformers" directly? #27

HuizhaoWang closed this issue 2 years ago

HuizhaoWang commented 2 years ago

Thanks for sharing such interesting work. I want to learn from the released code, but I have run into some problems.

  1. I get the error "ImportError: cannot import name 'BertForNER'", just like #16. The error disappears if I run "pip3 install --editable transformers" inside the project. What is the difference between the "transformers" directory in the project and running "pip install transformers" directly? Further, if I just want to run "pip install transformers" directly, how can I resolve the "ImportError: cannot import name 'BertForNER'" error?
  2. There are three script files for NER (run_train_ner_BIO.sh, run_train_ner_PLMarker.sh, run_train_ner_TokenCat.sh), which seem to correspond to the different baseline methods mentioned in the paper. In addition, "run_train_ner_PLMarker.sh" uses "run_ner.py" for some datasets and "run_acener.py" for others. Is there a reason for this? What is the difference between "run_acener.py" and "run_ner.py", and when should each be used?
  3. There are several key arguments for NER (run_acener.py), but I could not find a detailed explanation of them. For example, does "max_pair_length = 256" mean that at most 256 candidate spans are considered per run?

YeDeming commented 2 years ago

  1. You can also use "pip install ./transformers", which makes it explicit that pip installs from the local directory.
  2. run_ner.py is used for datasets in the conll03 format. The only difference between the two scripts is the data format they read.
  3. Yes, "max_pair_length = 256" means that at most 256 candidate spans are considered per run.
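
For the import error in point 1, it can help to check which copy of a package the current environment actually resolves. The following is a minimal sketch, not part of the PL-Marker code: describe_import is a hypothetical helper name, and it assumes the project's scripts import BertForNER from the top-level "transformers" package, as the error message suggests.

```python
import importlib
import importlib.util


def describe_import(module_name, attr_name):
    """Report where a module is loaded from and whether it exposes attr_name.

    Hypothetical diagnostic helper; it is not part of PL-Marker or transformers.
    """
    # find_spec returns None when a top-level module cannot be found.
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return {"installed": False, "location": None, "has_attr": False}

    module = importlib.import_module(module_name)
    return {
        "installed": True,
        # Path of the package that was actually imported, useful for telling
        # a local checkout apart from a site-packages install.
        "location": getattr(module, "__file__", None),
        "has_attr": hasattr(module, attr_name),
    }
```

Calling describe_import("transformers", "BertForNER") in the environment used for training shows the path of the package that was imported. If "location" points into site-packages rather than the project's transformers directory, or "has_attr" is False, the environment is probably not using the local copy.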