mstrise / dep2label

Dependency Parsing as Sequence Labeling
MIT License

[BUG] #3

Closed: cjf9028 closed this issue 5 years ago

cjf9028 commented 5 years ago

Are you able to run the usage commands from the README.md by following the exact instructions? Yes/No, and if not, why not?

To Reproduce Steps to reproduce the behavior:

  1. Use a Korean dataset in CoNLL format. The seq file contains:

민주화운동 [POStag]NNP -1_VP_ROOT
잔여 [POStag]NNG -1_VP_ROOT
과제 [POStag]NNG -1_VP_ROOT

  2. Run this command:

python main.py --train-config config/train.config --decode-config config/decode.config

  3. See the error:

Predict raw result has been written into file. myModel/output_nn.out
Traceback (most recent call last):
  File "main.py", line 476, in <module>
    train(train_data,decode,args)
  File "main.py", line 369, in train
    diction, words = dev_enc.decode(decode_data.output_nn, data.encoding, all_sent)
  File "/home/nlplab/Development/ckc/parsing/dep2label/labeling.py", line 237, in decode
    decoded_sentence, decoded_words, homeless_nodes)
  File "/home/nlplab/Development/ckc/parsing/dep2label/decoding.py", line 83, in decode_3
    position_head = int(info_about_word[5])
ValueError: invalid literal for int() with base 10: '+1_NP'
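The last frame fails because `int()` receives a whole label fragment (`'+1_NP'`) instead of a bare offset. A minimal sketch of one defensive way to read the offset, assuming the fragment always starts with a signed integer. `parse_head_offset` is a hypothetical helper written for this illustration and is not part of dep2label:

```python
import re


def parse_head_offset(fragment):
    """Return the leading signed integer of a fragment such as '+1_NP'.

    Hypothetical helper: dep2label itself calls int() on the raw field.
    """
    match = re.match(r"[+-]?\d+", fragment)
    if match is None:
        raise ValueError("no leading integer in %r" % fragment)
    return int(match.group())


# int("+1_NP") raises ValueError, as in the traceback in this report.
print(parse_head_offset("+1_NP"))  # 1
print(parse_head_offset("-2"))     # -2
```

This only hides the symptom. If the field at that position is not supposed to contain a tag at all, the input format is the real problem.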

mstrise commented 5 years ago

Hi! What dataset do you use? Does it come from Universal Dependencies? Which version of the Korean treebank is it? I will test it on my machine.

cjf9028 commented 5 years ago

Please give me your email address and I will send the data by email.

cjf9028 commented 5 years ago

My email is cjf9028@daum.net

mstrise commented 5 years ago

Hi! To me it looks like something is wrong with the file. Make sure it is in CoNLL-X format. The file should follow the format described here: https://universaldependencies.org/format.html
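A quick way to check a file before training is to confirm that every token line has the 10 tab-separated columns shared by CoNLL-X and CoNLL-U, with HEAD in the seventh column. The following is a rough sketch, not part of dep2label; the function name `find_malformed_lines` is made up for this example:

```python
def find_malformed_lines(lines):
    """Return (line_number, reason) pairs for lines that break the 10-column layout."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Blank lines separate sentences; '#' lines are CoNLL-U comments.
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) != 10:
            problems.append((number, "expected 10 columns, found %d" % len(columns)))
            continue
        head = columns[6]
        # HEAD must be a non-negative integer, or '_' when unspecified.
        if head != "_" and not head.isdigit():
            problems.append((number, "HEAD is not an integer: %r" % head))
    return problems
```

It can be run on an open file, for example `find_malformed_lines(open("train.conll", encoding="utf-8"))`, where the file name is a placeholder.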

cjf9028 commented 5 years ago

This data is in the standard Korean format, except for phrase_postag. If I use this format, which part do I have to change?

So far I have added the DEPREL values to CONTENT_DEPRELS in conll18_ud_eval.py.

mstrise commented 5 years ago

The code only works with the CoNLL-X format, but it is possible to adapt it by modifying the encoding.py and decoding.py files.
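As an illustration of the kind of change this might involve (not confirmed in this thread): the seq labels in the report appear to join an offset, a tag, and a relation with `_`, so a relation name that itself contains `_` would be split into too many pieces. The sketch below assumes that three-part layout. Both `split_label` and the relation name `NP_SBJ` are hypothetical and are not taken from encoding.py or decoding.py:

```python
def split_label(label, separator="_"):
    """Split 'offset_tag_relation' into (int, str, str).

    Only the first two separators are used, so separators inside the
    relation are preserved. Assumes the tag itself has no separator.
    """
    parts = label.split(separator, 2)
    if len(parts) != 3:
        raise ValueError("expected three fields in %r" % label)
    offset, tag, relation = parts
    return int(offset), tag, relation


print(split_label("-1_VP_ROOT"))    # (-1, 'VP', 'ROOT')
print(split_label("+1_NP_NP_SBJ"))  # (1, 'NP', 'NP_SBJ')
```

A separator that cannot occur in tags or relations would avoid the ambiguity altogether, but that would require matching changes on both the encoding and decoding side.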