Closed freesunshine0316 closed 7 years ago
Just pushed a new feature that processes parallel (AMR, text) corpora such as the LDC or Little Prince corpora. Check the last part of README.md for more details. I have included the Little Prince v.1.6 corpus as an example.
I have parallel (AMR, text) files, "dev.amr.line" and "dev.sent", and have anonymized the AMR side. Currently my directory looks like:
    lrwxrwxrwx 1 lsong10 lsong10 20 Sep 22 10:36 dev.amr.line -> ../data/dev.amr.line
    -rw-rw---- 1 lsong10 lsong10 91K Sep 22 10:36 dev.amr.line.alignments
    -rw-rw---- 1 lsong10 lsong10 334K Sep 22 10:36 dev.amr.line.anonymized
    lrwxrwxrwx 1 lsong10 lsong10 16 Sep 22 12:15 dev.sent -> ../data/dev.sent
When I anonymize the text side with the command "./anonDeAnon_java.sh anonymizeText true dev.sent", the resulting "dev.sent.alignments" file is different from "dev.amr.line.alignments". How can I anonymize the text side according to the graph-side result?
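To show where the two alignment files diverge, a line-by-line comparison may help. This is only a minimal sketch: it assumes each alignments file holds one entry per line in corpus order, which is a guess about the format rather than something documented here, and `differing_lines` is a hypothetical helper, not part of the tool.

```python
from itertools import zip_longest
from pathlib import Path


def differing_lines(path_a, path_b):
    """Return (line_number, line_a, line_b) for every line that differs.

    Line numbers start at 1. If one file is shorter, its missing lines
    are reported as None.
    """
    lines_a = Path(path_a).read_text(encoding="utf-8").splitlines()
    lines_b = Path(path_b).read_text(encoding="utf-8").splitlines()
    return [
        (number, a, b)
        for number, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1)
        if a != b
    ]


# Example call (file names taken from the directory listing above):
# for number, amr_side, text_side in differing_lines(
#     "dev.amr.line.alignments", "dev.sent.alignments"
# ):
#     print(f"line {number}: AMR={amr_side!r} text={text_side!r}")
```

An empty result would mean the files are identical; otherwise the first reported line number points at the first sentence whose alignments differ between the two runs.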