sinantie / NeuralAmr

Sequence-to-sequence models for AMR parsing and generation
http://www.ikonstas.net/code

anonymization results being different between the AMR side and the text side. #4

Closed freesunshine0316 closed 7 years ago

freesunshine0316 commented 7 years ago

I have parallel (AMR, text) files, "dev.amr.line" and "dev.sent", and have anonymized the AMR side. Currently my directory looks like this:

    lrwxrwxrwx 1 lsong10 lsong10 20 Sep 22 10:36 dev.amr.line -> ../data/dev.amr.line
    -rw-rw---- 1 lsong10 lsong10 91K Sep 22 10:36 dev.amr.line.alignments
    -rw-rw---- 1 lsong10 lsong10 334K Sep 22 10:36 dev.amr.line.anonymized
    lrwxrwxrwx 1 lsong10 lsong10 16 Sep 22 12:15 dev.sent -> ../data/dev.sent

When I anonymize the text side with the command "./anonDeAnon_java.sh anonymizeText true dev.sent", the resulting file "dev.sent.alignments" differs from "dev.amr.line.alignments". How can I anonymize the text side so that it is consistent with the graph-side result?
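
To see exactly where the two alignment files diverge, a line-by-line comparison may help. The sketch below is only an illustration: it treats each alignments file as opaque text, one entry per line, and makes no assumption about the format NeuralAmr writes. The function name `differing_lines` is hypothetical and not part of the repository.

```python
import sys
from itertools import zip_longest
from pathlib import Path


def differing_lines(amr_path, text_path):
    """Return (line_number, amr_line, text_line) for every line that differs.

    Hypothetical helper: lines are compared as plain strings, so nothing is
    assumed about the internal format of the alignments files. If one file
    is shorter, its missing lines are reported as None.
    """
    amr_lines = Path(amr_path).read_text(encoding="utf-8").splitlines()
    text_lines = Path(text_path).read_text(encoding="utf-8").splitlines()
    diffs = []
    pairs = zip_longest(amr_lines, text_lines)
    for number, (amr, text) in enumerate(pairs, start=1):
        if amr != text:
            diffs.append((number, amr, text))
    return diffs


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python compare_alignments.py dev.amr.line.alignments dev.sent.alignments
    for number, amr, text in differing_lines(sys.argv[1], sys.argv[2]):
        print(f"line {number}:")
        print(f"  AMR side : {amr}")
        print(f"  text side: {text}")
```

Since the corpus is parallel, line N of each file should describe the same sentence, so any reported line number points directly at a sentence where the two anonymization passes disagree.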

sinantie commented 7 years ago

Just pushed a new feature that processes parallel (AMR, text) corpora such as the LDC or Little Prince corpora. Check the last part of README.md for more details. I have included the Little Prince v.1.6 corpus as an example.