achen353 / TransformerSum

BERT-based extractive summarizer for long legal documents using a divide-and-conquer approach
GNU General Public License v3.0

Test Training with Extractive BillSum (TransformerSum style) #9

Closed achen353 closed 2 years ago

achen353 commented 2 years ago

Try training the model on the dataset generated by convert_to_extractive.py.

achen353 commented 2 years ago

The TransformerSum-style preprocessing does not account for the numbering of bullet points (e.g., "(1) an apple" becomes ['(', '1', ')', 'an', 'apple']).

Test the training after #14 is done instead.
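
For illustration, here is a minimal sketch of the splitting behavior described above, along with one possible way to keep a list marker such as "(1)" together. It uses plain regular expressions as a stand-in and is not the tokenizer used by convert_to_extractive.py; the function names naive_tokenize and marker_aware_tokenize are hypothetical.

```python
import re


def naive_tokenize(text):
    # Stand-in for the reported behavior: every punctuation character
    # becomes its own token, so "(1)" is split into "(", "1", ")".
    return re.findall(r"\w+|[^\w\s]", text)


def marker_aware_tokenize(text):
    # Assumption: a list marker is a parenthesized run of word characters,
    # e.g. "(1)" or "(a)". It is matched first so it stays a single token.
    return re.findall(r"\(\w+\)|\w+|[^\w\s]", text)


print(naive_tokenize("(1) an apple"))
print(marker_aware_tokenize("(1) an apple"))
```

The first call reproduces the split reported in this comment; the second keeps the marker intact. Whether a marker should be kept, dropped, or merged into the sentence that follows is a separate design decision that this sketch does not settle.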

achen353 commented 2 years ago

No longer applicable