Closed HannahYY closed 4 years ago
Where is the text annotation file? Or what is its name? Thanks.
Did you finally figure out ANNOTATION=path?
Yes, and I guess ANNOTATION=data2text-1/ie*/rotowire-modified-anno.txt. Is that right?
Hi, sorry for the late response. You can make an annotation file using the IE model we provide. After running setup.sh, first make the gold text file for the training data, tokenized with NLTK.
TRAIN_TXT=train.txt
cat train.json | python -c 'import sys, json, nltk; print("\n".join(" ".join(nltk.word_tokenize(" ".join(x["summary"]))) for x in json.load(sys.stdin)))' > $TRAIN_TXT
Then, you can run the following command to obtain the annotation file for training data.
python data_utils.py -mode prep_gen_data -gen_fi $TRAIN_TXT -dict_pfx "rotowire-modified-ie" -output_fi train_gold.h5 -input_path "../rotowire_v2" -train
th extractor.lua -gpuid 1 -datafile rotowire-modified-ie.h5 -preddata train_gold.h5 -dict_pfx "rotowire-modified-ie" -just_eval
Finally, you can find the annotation file, train_gold.h5-tuples.txt, in the same directory.
As @HannahYY mentioned, the attached file, rotowire-modified-anno.txt, was also produced with this procedure. I'll write up a more detailed procedure for making an annotation file around mid-December.
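For readability, the tokenization one-liner from the first step can also be written out as a short script. This is only a sketch: it assumes, as the one-liner does, that train.json holds a list of records whose "summary" field is a list of tokens. The script name tokenize_summaries.py and the function summaries_to_lines are made up for illustration and are not part of the repository. NLTK's tokenizer data is assumed to be installed already.

```python
import json
import sys


def summaries_to_lines(records, tokenize=None):
    """Return one space-joined, tokenized line per record's "summary" list."""
    if tokenize is None:
        # Imported lazily so the function can be tried with another tokenizer.
        import nltk
        tokenize = nltk.word_tokenize
    lines = []
    for record in records:
        # Each summary is assumed to be a list of tokens, as in the one-liner.
        text = " ".join(record["summary"])
        lines.append(" ".join(tokenize(text)))
    return lines


# Hypothetical usage: python tokenize_summaries.py train.json train.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], encoding="utf-8") as src:
        records = json.load(src)
    with open(sys.argv[2], "w", encoding="utf-8") as dst:
        dst.write("\n".join(summaries_to_lines(records)) + "\n")
```

The resulting text file plays the role of $TRAIN_TXT in the data_utils.py command above.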
Thank you for your reply, but I have not solved the problem yet.
Are your Python and DyNet versions 2.7 and 2.1, respectively?
My steps were as follows:
1. I ran the preprocessing step, "python make_data.py $DATA $ANNOTATION $VOCAB", and the dump folder "Reporter_nh_vocab-128_nh_rnn-512_writer_15.dy" was generated. Is that right?
2. Then I ran training with "python reporter.py train ../dump/Reporter_nh_vocab-128_nh_rnn-512_26.dy --valid_file ../rotowire_v2/valid.json",
but the following error occurred:
[dynet] random seed: 2900451995
[dynet] using autobatching
[dynet] allocating memory: 7544MB
[dynet] memory allocation done.
2019-11-25 17:03:32.382964 Log dir at /tmp/1574672612
2019-11-25 17:03:32.383040 Loading dataset...
Traceback (most recent call last):
File "reporter.py", line 112, in
Is this a problem with my installation package version?
@hdb1301040027 No, you can use Python >= 3.6 :) In addition, ../dump/Reporter_nh_vocab-128_nh_rnn-512_26.dy is the model dump file, not the vocab file. You can run make_data.py to get a vocab (& data) file before training.
Hi! Sorry for the late reply, but I've updated the README.md. I hope it helps with reproducing our research :)