cocoxu / simplification

Text Simplification System and Dataset
GNU General Public License v3.0

STAR - corpus-level SARI #5

Closed: lbercken closed this issue 6 years ago

lbercken commented 6 years ago

The README mentions:

Note that STAR is corpus-level version of SARI, SARI is sentence-level.

However, I couldn't find any information on STAR anywhere. How is it calculated? Is it simply the mean of all sentence-level SARI scores?

cocoxu commented 6 years ago

The code for STAR is included in the zip file linked in the README: ./ppdb-simplification-release-joshua5.0/joshua/src/joshua/metrics/STAR.java

"./ppdb-simplification-release-joshua5.0.zip (a 281M file) The experiments in our TACL 2016 paper used the Joshua 5.0. Example scripts for training the simplification are under the directory ./bin/. Note that STAR is corpus-level version of SARI, SARI is sentence-level. The joshua_TACL2016.config is also provided -- that is corresponding to the best system in our paper. You may find the Joshua pipeline tutorial useful."

cocoxu commented 6 years ago

It is similar to corpus-level BLEU versus sentence-level BLEU: STAR is calculated over all sentences in the corpus, which is different from simply taking the mean of all sentence-level SARI scores.
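
To illustrate why the two can differ, here is a minimal Python sketch. It is not the STAR implementation from STAR.java and not SARI itself: it uses plain unigram precision as a stand-in metric, and the function names (`unigram_counts`, `mean_of_sentence_scores`, `corpus_level_score`) and the toy sentences are made up for this example. It only contrasts averaging per-sentence ratios with pooling counts over the whole corpus before dividing, which is the usual sense of "corpus-level" for BLEU.

```python
from collections import Counter


def unigram_counts(candidate, reference):
    """Return (matched, total) unigram counts for one sentence pair.

    Stand-in statistic for illustration only; SARI/STAR use different counts.
    """
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    matched = sum((cand & ref).values())  # clipped overlap
    total = sum(cand.values())
    return matched, total


def mean_of_sentence_scores(pairs):
    """Sentence-level view: score each sentence, then average the scores."""
    scores = []
    for cand, ref in pairs:
        matched, total = unigram_counts(cand, ref)
        scores.append(matched / total if total else 0.0)
    return sum(scores) / len(scores)


def corpus_level_score(pairs):
    """Corpus-level view: pool the counts over all sentences, then divide once."""
    matched_sum = 0
    total_sum = 0
    for cand, ref in pairs:
        matched, total = unigram_counts(cand, ref)
        matched_sum += matched
        total_sum += total
    return matched_sum / total_sum if total_sum else 0.0


# Hypothetical toy corpus: one short perfect output, one longer poor output.
pairs = [
    ("the cat", "the cat"),
    ("a dog ran far away today", "the dog walked home"),
]

sentence_mean = mean_of_sentence_scores(pairs)
corpus_score = corpus_level_score(pairs)
print(sentence_mean, corpus_score)
```

In this toy case the mean of the sentence scores is (2/2 + 1/6) / 2 = 7/12, while the pooled score is 3/8, because the longer sentence contributes more counts to the corpus-level total.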

lbercken commented 6 years ago

Thanks for your quick reply!