senisioi / NeuralTextSimplification

Exploring Neural Text Simplification

Abstract

We present the first attempt at using sequence-to-sequence neural networks to model text simplification (TS). Unlike the previously proposed automated methods, our neural text simplification (NTS) systems are able to simultaneously perform lexical simplification and content reduction. An extensive human evaluation of the output has shown that NTS systems achieve good grammaticality and meaning preservation of output sentences and a higher level of simplification than the state-of-the-art automated TS systems. We train our models on the Wikipedia corpus containing good and good partial alignments.

    @InProceedings{neural-text-simplification,
      author    = {Sergiu Nisioi and Sanja Štajner and Simone Paolo Ponzetto and Liviu P. Dinu},
      title     = {Exploring Neural Text Simplification Models},
      booktitle = {{ACL} {(2)}},
      publisher = {The Association for Computational Linguistics},
      year      = {2017}
    }

Simplify Text | Generate Predictions (no GPUs needed)

  1. OpenNMT dependencies
    1. Install Torch
    2. Install additional packages:
      luarocks install tds
  2. Check out this repository, including the submodules:
    git clone --recursive https://github.com/senisioi/NeuralTextSimplification.git
  3. Download the released pre-trained models, NTS and NTS-w2v (note: because of recent changes in third-party software, their output might not be identical to the output reported in the paper):
    python src/download_models.py ./models
  4. Run translate.sh from the scripts dir:
    cd src/scripts
    ./translate.sh
  5. Check the predictions in the results directory:
    cd ../../results_NTS
  6. Run the automatic evaluation metrics:
    1. Install the Python requirements (only nltk is needed):
      pip install -r src/requirements.txt
    2. Run the evaluate script:
      python src/evaluate.py ./data/test.en ./data/references/references.tsv ./predictions/
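
The command-line steps above can also be collected into a small driver script. The following is only a sketch, not part of the repository: the function names `build_steps` and `run_steps` and the `repo_dir` argument are made up for illustration, step 1 (installing Torch and `tds`) is left out, and it assumes that the download and evaluation commands are meant to be run from the repository root, as their relative paths suggest. By default it only prints the commands.

```python
import subprocess
from pathlib import Path

REPO_URL = "https://github.com/senisioi/NeuralTextSimplification.git"


def build_steps(repo_dir="NeuralTextSimplification"):
    """Return (command, working directory) pairs mirroring steps 2-6 above."""
    repo = Path(repo_dir)
    return [
        # Step 2: clone the repository together with its submodules.
        (["git", "clone", "--recursive", REPO_URL, str(repo)], Path(".")),
        # Step 3: download the released pre-trained models.
        (["python", "src/download_models.py", "./models"], repo),
        # Step 4: generate predictions from the scripts directory.
        (["./translate.sh"], repo / "src" / "scripts"),
        # Step 6.1: install the Python requirements.
        (["pip", "install", "-r", "src/requirements.txt"], repo),
        # Step 6.2: run the automatic evaluation.
        (["python", "src/evaluate.py", "./data/test.en",
          "./data/references/references.tsv", "./predictions/"], repo),
    ]


def run_steps(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command, cwd in steps:
        print(f"[{cwd}] {' '.join(command)}")
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    run_steps(build_steps())
```

Calling `run_steps(build_steps(), dry_run=False)` would actually execute the commands; keeping the dry run as the default makes it easy to review them first.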

The Contents of this Repository

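./src

Contains, among other files, the scripts referenced above: download_models.py, evaluate.py, requirements.txt, and scripts/translate.sh.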

./configs

Contains the OpenNMT config file. To train, update the config file so that it points to the appropriate data on your local system, then run:

    th train -config $PATH_TO_THIS_DIR/configs/NTS.cfg
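
Before starting a training run, it can help to confirm that the data referenced in the config file exists locally. The sketch below is an illustration only: it assumes the config file consists of `key = value` lines, that lines starting with `#` can be ignored, and that the training data is referenced by an option named `data`. None of this is confirmed by this README, so check NTS.cfg itself and adjust the `keys` argument accordingly. The helper names `parse_config` and `missing_paths` are hypothetical.

```python
from pathlib import Path


def parse_config(text):
    """Parse `key = value` lines, skipping blanks and `#` lines (assumed layout)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def missing_paths(options, keys=("data",)):
    """Return the listed options whose values do not exist on disk."""
    return {
        key: options[key]
        for key in keys
        if key in options and not Path(options[key]).exists()
    }
```

For example, `missing_paths(parse_config(Path("configs/NTS.cfg").read_text()))` returns an empty dictionary when every checked path is present, and otherwise maps each checked option to the path that could not be found.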

./predictions

Contains predictions from previous systems (Wubben et al., 2012), (Glavaš and Štajner, 2015), and (Xu et al., 2016), and the generated predictions of the NTS models reported in the paper.

./data

Contains the training, testing, and reference sentences used to train and evaluate our models.
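
When working with these files outside the provided evaluate script, a quick alignment check can catch mismatched inputs early. The sketch below is an assumption-laden illustration, not the repository's own loader: it assumes that line i of the source file, the references file, and a prediction file all refer to the same sentence, and that each line of the references file holds several alternatives separated by tabs. The function names and the idea of a single prediction file per system are hypothetical.

```python
def read_lines(path):
    """Read a UTF-8 text file into a list of lines without trailing newlines."""
    with open(path, encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def load_eval_triples(source_path, references_path, prediction_path):
    """Pair each source sentence with its references and one system output.

    Assumes the three files are line-aligned and that the references file
    stores tab-separated alternatives on each line.
    """
    sources = read_lines(source_path)
    references = [line.split("\t") for line in read_lines(references_path)]
    predictions = read_lines(prediction_path)
    if not (len(sources) == len(references) == len(predictions)):
        raise ValueError(
            f"line counts differ: {len(sources)} sources, "
            f"{len(references)} reference rows, {len(predictions)} predictions"
        )
    return list(zip(sources, references, predictions))
```

Each returned triple has the form `(source, [reference, ...], prediction)`, which can then be tokenized and handed to whichever metric implementation is in use.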