Yale-LILY / SummerTime

An open-source text summarization toolkit for non-experts. EMNLP'2021 Demo
https://arxiv.org/abs/2108.12738
Apache License 2.0
264 stars 30 forks source link

evaluation refactoring #23

Closed zhangir-azerbayev closed 3 years ago

zhangir-azerbayev commented 3 years ago

Modified evaluation library to better align with style conventions.

One thing I can't figure out how to do is import SummModel into base_metric.py for type annotation purposes. Any help with this is appreciated.

zhangir-azerbayev commented 3 years ago

Hi Ansong, I fixed the issues raised in your comments with a new commit. Let me know what you think. I've also added a new class called SummEvalMetric, which inherits from SummMetric and is specifically for metrics that use SummEval as a backend (as of right now, this is all of them) .

niansong1996 commented 3 years ago

@zhangir-azerbayev Did you forget to check the summeval_metric.py into git? It seems to be missing from my end.

Also, there are some relative imports, which we should try to avoid.

Other than those, I think the class design makes sense to me and is much better than last time :)

zhangir-azerbayev commented 3 years ago

@niansong1996 Sorry, I forgot to check the summeval_metric.py, this is fixed now. I also got rid of the relative imports.

niansong1996 commented 3 years ago

@zhangir-azerbayev I think one other thing that is missing is testing.

Can you follow what we have in tests and add a eval_test.py and add some testing, to make sure it works as expected?

niansong1996 commented 3 years ago

The commits looks good so far, let me know when you resolved all my previous comments by requesting a review, then I can see if this can be merged into main, thanks!

niansong1996 commented 3 years ago

@zhangir-azerbayev Where are we on this thread?

zhangir-azerbayev commented 3 years ago

@zhangir-azerbayev Where are we on this thread?

Hi @niansong1996. I added a commit with unit testing.

niansong1996 commented 3 years ago

@zhangir-azerbayev Please resolve the comments I made above, also there is currently a conflict on demo.ipynb, have you made any important changes to that file?

zhangir-azerbayev commented 3 years ago

Should be ready to merge.

niansong1996 commented 3 years ago

@zhangir-azerbayev Why is RougeWE still removed? I thought we fixed the loading issue?

zhangir-azerbayev commented 3 years ago

@niansong1996 good catch, I fixed it.

niansong1996 commented 3 years ago

@zhangir-azerbayev LGTM, are all of the evaluation metrics passing the test? If so, I think this is ready to merge

zhangir-azerbayev commented 3 years ago

@niansong1996 Found a minor bug in the testing script. Tests now pass.