JohnGiorgi / seq2rel-ds

This is a companion repository to seq2rel (https://github.com/JohnGiorgi/seq2rel) that aims to make it easy to generate training data.

Compute corpus statistics #64

Open JohnGiorgi opened 2 years ago

JohnGiorgi commented 2 years ago

I created a branch, compute-corpus-statistics, with code to compute corpus statistics, mainly the fraction of inter-sentence relations that we report in the paper. The code isn't particularly pretty and I don't have time to clean it up for a merge into main, so I'm leaving it on its own branch in case it's needed in the future. Opening this issue so that I don't forget.
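
For anyone who needs the number before looking at the branch, here is a minimal sketch of how a fraction of inter-sentence relations could be computed. It is not the code on compute-corpus-statistics. It assumes a hypothetical input format in which each relation is a list of entities and each entity is the set of sentence indices where it is mentioned, and it assumes a relation counts as inter-sentence when no single sentence mentions all of its entities. The names `Relation`, `is_inter_sentence`, and `inter_sentence_fraction` are made up for illustration.

```python
from typing import List, Set

# Hypothetical format: a relation is a list of entities, and each entity is
# the set of sentence indices in which at least one of its mentions occurs.
Relation = List[Set[int]]


def is_inter_sentence(relation: Relation) -> bool:
    """Return True if no single sentence mentions every entity of the relation.

    Assumption: this is the working definition of "inter-sentence" here.
    """
    if not relation:
        raise ValueError("a relation must contain at least one entity")
    # Sentences that contain a mention of every entity in the relation.
    shared_sentences = set.intersection(*relation)
    return not shared_sentences


def inter_sentence_fraction(relations: List[Relation]) -> float:
    """Fraction of relations that are inter-sentence (0.0 for an empty corpus)."""
    if not relations:
        return 0.0
    inter = sum(is_inter_sentence(r) for r in relations)
    return inter / len(relations)


# Toy corpus of four relations (illustrative data, not from any real dataset).
toy_relations: List[Relation] = [
    [{0}, {0, 2}],        # both entities appear in sentence 0
    [{1}, {3}],           # entities never share a sentence
    [{0, 1}, {1}, {1, 4}],  # all three entities appear in sentence 1
    [{2}, {5}],           # entities never share a sentence
]
toy_fraction = inter_sentence_fraction(toy_relations)
```

On the toy corpus above, two of the four relations have no shared sentence, so `toy_fraction` is 0.5. A real corpus would first need sentence splitting and a mapping from mention offsets to sentence indices, which this sketch leaves out.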