factula / sumtool

A toolkit for understanding factuality & consistency errors in summarization models.

Setup (Python 3.8):

pip install -r requirements.txt
pip install .

Run Streamlit app

streamlit run interface/app.py

You can also run interfaces individually, for example:

streamlit run interface/summary_interface.py


Development setup:

pip install -r requirements.dev.txt
pip install -Ue .

Before committing:

black sumtool/ interface/ scripts/
flake8 sumtool/ interface/ scripts/

Run on Google Colab for GPU

  1. Create a GitHub token to access your private repositories, following the steps in GitHub's guide "Creating a Personal Access Token"

  2. Create a new Colab notebook and set the runtime type to GPU

  3. Add the following commands in the first cell to clone the repository and install the requirements

    !git clone https://[your-git-token]@github.com/cs6741/summary-analysis.git
    !pip install -r /content/summary-analysis/requirements.txt

  4. Add the following command to run the text generation script

    !python /content/generate_xsum_summary.py --bbc_ids [idx1,idx2] --data_split [train|test]
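
The placeholders in the command above suggest that `--bbc_ids` takes a comma-separated list of article ids and `--data_split` takes either `train` or `test`. As a rough sketch only, a script could parse such arguments as shown below. The parser, its defaults, and the sample ids are assumptions for illustration and may not match how `generate_xsum_summary.py` actually handles its options.

```python
import argparse


def parse_id_list(value):
    """Split a comma-separated string such as "123,456" into a list of ids."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser():
    # Hypothetical parser: the real script may define its options differently.
    parser = argparse.ArgumentParser(description="Generate XSum summaries")
    parser.add_argument(
        "--bbc_ids",
        type=parse_id_list,
        required=True,
        help="Comma-separated BBC article ids",
    )
    parser.add_argument(
        "--data_split",
        choices=["train", "test"],
        default="test",
        help="Dataset split the ids belong to",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
# The ids below are made-up placeholders.
args = build_parser().parse_args(
    ["--bbc_ids", "12345678,23456789", "--data_split", "test"]
)
print(args.bbc_ids, args.data_split)
```

Passing the ids as a single comma-separated string keeps the Colab command on one line; an unknown split name is rejected by `choices` before any model is loaded.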

Storage documentation

Pipeline for storage:

  1. Store generated summaries
    • by generating them using a custom model (example)
    • by loading them from an external dataset/paper (example)
  2. Compute summary metrics for stored summaries using sumtool.
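
The two pipeline steps above can be sketched with plain JSON storage. This is an illustration only and does not use sumtool's actual API: the function names, the file layout, and the toy `novel_token_ratio` metric are all assumptions made for this example.

```python
import json
import tempfile
from pathlib import Path


def store_summaries(path, summaries):
    """Step 1 (hypothetical): merge {document_id: summary} into a JSON file."""
    path = Path(path)
    stored = json.loads(path.read_text()) if path.exists() else {}
    for doc_id, text in summaries.items():
        stored.setdefault(doc_id, {})["summary"] = text
    path.write_text(json.dumps(stored, indent=2))
    return stored


def novel_token_ratio(source, summary):
    """Toy metric: share of summary tokens that never appear in the source."""
    source_tokens = set(source.lower().split())
    summary_tokens = summary.lower().split()
    if not summary_tokens:
        return 0.0
    novel = [tok for tok in summary_tokens if tok not in source_tokens]
    return len(novel) / len(summary_tokens)


def add_metrics(path, sources):
    """Step 2 (hypothetical): compute a metric for every stored summary."""
    path = Path(path)
    stored = json.loads(path.read_text())
    for doc_id, entry in stored.items():
        entry["metrics"] = {
            "novel_token_ratio": novel_token_ratio(sources[doc_id], entry["summary"])
        }
    path.write_text(json.dumps(stored, indent=2))
    return stored


# Made-up document and summary, stored in a temporary directory.
sources = {"doc-1": "the cat sat on the mat"}
with tempfile.TemporaryDirectory() as tmp:
    storage_path = Path(tmp) / "summaries.json"
    store_summaries(storage_path, {"doc-1": "the cat sat on a sofa"})
    result = add_metrics(storage_path, sources)

print(result["doc-1"]["metrics"])
```

Keeping metrics next to each stored summary means step 2 can be rerun with additional metrics without regenerating any summaries.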


