How can I train the parser using another corpus?

CarolLi commented 5 years ago

I think the pre-trained model is too large for my task, so I want to train the model on another corpus. Are there any format requirements for the training data?

danielhers commented 5 years ago

Do you want to train it for UCCA parsing, or for parsing text to another representation? If UCCA parsing, you can use any of the UCCA-annotated corpora: https://github.com/UniversalConceptualCognitiveAnnotation

It's best to use the sentence-split files, which are under the master-sentences-xml branch in each of these repositories:

https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_English-Wiki/tree/master-sentences-xml

https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_English-EWT/tree/master-sentences-xml

https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_English-20K/tree/master-sentences-xml

https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_German-20K/tree/master-sentences-xml

https://github.com/UniversalConceptualCognitiveAnnotation/UCCA_French-20K/tree/master-sentences-xml

CarolLi commented 5 years ago

Got it! Thank you~
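For anyone following the same route, here is a rough sketch of fetching only the master-sentences-xml branch of one of the corpora listed in this thread and sanity-checking the files before training. It is not part of the parser: the helper names `clone_command` and `list_wellformed_xml` are made up for illustration, and the check only confirms that each file is well-formed XML, not that it is a valid UCCA passage.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

# Organization URL and branch name as given in this thread.
ORG_URL = "https://github.com/UniversalConceptualCognitiveAnnotation"
BRANCH = "master-sentences-xml"


def clone_command(repo_name, target_dir):
    """Build a git command that fetches only the sentence-split branch.

    Hypothetical helper: repo_name is e.g. "UCCA_English-Wiki".
    """
    return [
        "git", "clone",
        "--branch", BRANCH,
        "--single-branch",
        f"{ORG_URL}/{repo_name}",
        str(target_dir),
    ]


def list_wellformed_xml(corpus_dir):
    """Return sorted paths of .xml files under corpus_dir that parse as XML.

    Hypothetical helper: this checks well-formedness only and makes no
    assumption about the UCCA schema itself.
    """
    good = []
    for path in sorted(Path(corpus_dir).rglob("*.xml")):
        try:
            ET.parse(str(path))
        except ET.ParseError:
            print(f"skipping malformed file: {path}")
            continue
        good.append(path)
    return good
```

The list returned by `clone_command("UCCA_English-Wiki", "UCCA_English-Wiki")` can be passed to `subprocess.run(..., check=True)`, after which `list_wellformed_xml("UCCA_English-Wiki")` gives the files to point the training run at.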

danielhers / tupa

How can I train the parser using another corpus? #71