Cartus opened this issue 5 years ago
Our baseline input could be the same linearized AMR graph as Konstas et al. Only concept nodes are retained for input to the transformer model.

-train_src # concept node sequence
-train_structure1 # Xi-to-Xj path of the first token.
-train_structure2 # Xi-to-Xj path of the second token.
........
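For intuition, extracting the concept node sequence from a linearized AMR amounts to dropping brackets and `:edge` labels. A minimal sketch, assuming a Konstas-style linearization (the exact format NeuralAmr emits may differ):

```python
def concept_sequence(linearized_amr):
    """Keep only concept tokens: drop brackets and :edge labels
    from a hypothetical Konstas-style linearized AMR string."""
    tokens = linearized_amr.split()
    return [t for t in tokens
            if t not in ("(", ")") and not t.startswith(":")]

# "( want-01 :arg0 ( boy ) :arg1 ( go-01 :arg0 boy ) )"
# -> "want-01 boy go-01 boy"
print(" ".join(concept_sequence(
    "( want-01 :arg0 ( boy ) :arg1 ( go-01 :arg0 boy ) )")))
```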
Hi @Amazing-J ,
Thank you for your prompt reply!
For the concept node sequence, I can use NeuralAmr (https://github.com/sinantie/NeuralAmr) to get the linearized sequence.
I also have two questions. The first is how to construct the structural sequence. The second is: since the model requires sub-word units produced by BPE, how should the concept node sequence be generated under this setting?
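For intuition about the BPE step: subword-nmt's learn-bpe repeatedly merges the most frequent adjacent symbol pair, and apply-bpe replays those merges on each token. A minimal self-contained sketch of that idea (simplified, not the library's actual implementation):

```python
from collections import Counter

def _merge(sym, pair):
    """Replace every adjacent occurrence of `pair` in `sym` with the fused symbol."""
    out, i = [], 0
    while i < len(sym):
        if i + 1 < len(sym) and (sym[i], sym[i + 1]) == pair:
            out.append(sym[i] + sym[i + 1])
            i += 2
        else:
            out.append(sym[i])
            i += 1
    return out

def learn_bpe(words, num_merges):
    """Learn merge operations from a token list (toy version of learn-bpe)."""
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for sym, freq in vocab.items():
            for pair in zip(sym, sym[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = Counter()
        for sym, freq in vocab.items():
            merged[tuple(_merge(sym, best))] += freq
        vocab = merged
    return merges

def apply_bpe(word, merges):
    """Segment a single token with previously learned merges (toy apply-bpe)."""
    sym = list(word) + ["</w>"]
    for pair in merges:
        sym = _merge(sym, pair)
    return sym

merges = learn_bpe(["abc", "abc", "abd"], 2)
print(merges)                     # [('a', 'b'), ('ab', 'c')]
print(apply_bpe("abd", merges))   # ['ab', 'd', '</w>']
```

Because BPE splits concept tokens into sub-word units, any structural (path) sequences have to stay aligned with the split tokens, which is presumably why the repo ships its own preprocessing.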
Hi @Amazing-J,
Thank you for releasing the code! As @Cartus pointed out, can you provide the code for BPE over the source a.k.a linearized AMRs?
Best!
Assuming that I've done the right thing for BPE by running

```
subword-nmt learn-bpe -s 10000 < ...LDC2015E86/training_source > codes.bpe
subword-nmt apply-bpe -c codes.bpe < ...LDC2015E86/dev_source > dev_source_bpe
```

I still got this error:

```
FileNotFoundError: [Errno 2] No such file or directory: ...LDC2015E86/data_vocab.pt
```
How can I generate this file?
Alright, I found out that I also have to run preprocess.sh. Thanks!
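For anyone hitting the same error: the missing data_vocab.pt is a binarized vocabulary file, so it has to be produced by the repo's preprocessing step before training. As a hypothetical sketch (file names are placeholders and the repo's actual flags in preprocess.sh may differ), an OpenNMT-style invocation looks roughly like:

```shell
# Hypothetical sketch: preprocess.sh presumably wraps an OpenNMT-style
# preprocess.py; -save_data is what emits the data*.pt / vocab .pt files
# that the training script later tries to load.
python preprocess.py \
    -train_src training_source_bpe \
    -train_tgt training_target_bpe \
    -valid_src dev_source_bpe \
    -valid_tgt dev_target_bpe \
    -save_data data
```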
Hi, thanks for the great work!
I'm trying to run the code, but I don't know how to do data preprocessing for the AMR corpus. May I ask how to preprocess the data?