Yale-LILY / SummerTime

An open-source text summarization toolkit for non-experts. EMNLP'2021 Demo
https://arxiv.org/abs/2108.12738
Apache License 2.0

Multilingual refactoring and language ID checking #96

Closed (haileyschoelkopf closed this 2 years ago)

haileyschoelkopf commented 2 years ago

Creating a MultilingualSummModel base class and an assert_summ_input_language() class method. Language ID checking is still in progress.
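
For readers following along, here is a minimal sketch of what a language check of this kind could look like. It is not the code in this PR: only the names MultilingualSummModel and assert_summ_input_language() come from the description above, while `supported_langs`, the `detect` argument, and `toy_detect` are hypothetical and stand in for a real language identifier.

```python
from typing import Callable, ClassVar, FrozenSet, List


class MultilingualSummModel:
    """Illustrative base class only; not the actual SummerTime implementation."""

    # Assumption: each subclass lists the ISO 639-1 codes it supports.
    supported_langs: ClassVar[FrozenSet[str]] = frozenset()

    @classmethod
    def assert_summ_input_language(
        cls, corpus: List[str], detect: Callable[[str], str]
    ) -> str:
        """Return the detected language code, or raise if it is unsupported.

        `detect` is a hypothetical callable mapping text to a language code;
        a real implementation might wrap a language identification model.
        """
        if not corpus:
            raise ValueError("corpus is empty; cannot identify a language")
        # Assumption: the first document is representative of the corpus.
        lang = detect(corpus[0])
        if lang not in cls.supported_langs:
            raise ValueError(
                f"{cls.__name__} does not support language '{lang}'; "
                f"supported: {sorted(cls.supported_langs)}"
            )
        return lang


class EnglishSpanishModel(MultilingualSummModel):
    # Hypothetical subclass used only for this example.
    supported_langs = frozenset({"en", "es"})


def toy_detect(text: str) -> str:
    """Toy stand-in for a language identifier (not a real detector)."""
    return "es" if "¿" in text or "ñ" in text else "en"
```

Calling `EnglishSpanishModel.assert_summ_input_language(["Hello world"], toy_detect)` returns the detected code, and a subclass whose `supported_langs` lacks that code raises a `ValueError`. Passing the detector in as an argument keeps the check testable without downloading a model.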

haileyschoelkopf commented 2 years ago

Ready for review. One thing that is potentially still to do: caching the fastText pretrained model file so it doesn't have to be re-downloaded.

@MuroriM, would you have any suggestions for where I could read about how to cache a download, since you were working on caching for our datasets? No worries if not!
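
As a general illustration of caching a download (not the repository's own caching utility), the usual pattern is to check for the file in a cache directory and fetch it only when it is missing. The function name `cached_download`, its parameters, and the cache location in the usage note are all hypothetical; the only library call assumed is the standard `urllib.request.urlretrieve`.

```python
import os
import urllib.request
from typing import Callable, Optional


def cached_download(
    url: str,
    cache_dir: str,
    filename: Optional[str] = None,
    fetch: Callable[[str, str], object] = urllib.request.urlretrieve,
) -> str:
    """Return a local path for `url`, downloading only if it is not cached.

    `fetch(url, destination)` defaults to urllib's urlretrieve and can be
    swapped out, for example in tests.
    """
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, filename or os.path.basename(url))
    if not os.path.exists(target):
        # Download to a temporary name first so that an interrupted
        # download never leaves a partial file at the cached path.
        partial = target + ".part"
        fetch(url, partial)
        os.replace(partial, target)
    return target
```

A caller would pass the model's download URL and a directory such as `~/.cache/<project>` (an assumed location), then load the model from the returned path. A second call with the same arguments returns the same path without touching the network.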

niansong1996 commented 2 years ago

Thanks for the PR, Nick! For model caching, you can use this file. Let me know if you have any problems using it.

Do you want to add it in this PR or open a second PR?

haileyschoelkopf commented 2 years ago

Thanks @niansong1996, will notify you when I've made these changes. I've added the caching of the fastText model in with the mT5 PR (sorry that these branches are based on each other and have become slightly disorganized!)

haileyschoelkopf commented 2 years ago

@niansong1996 I should have addressed all your comments on this PR! The code should now be more readable. Let me know if the comments are too verbose, though.