Yale-LILY / SummerTime

An open-source text summarization toolkit for non-experts. EMNLP'2021 Demo
https://arxiv.org/abs/2108.12738
Apache License 2.0
269 stars 30 forks source link

Zhangir/model selector #39

Closed zhangir-azerbayev closed 3 years ago

zhangir-azerbayev commented 3 years ago

Very quick and dirty implementation of the model selector, including halving grid search.

zhangir-azerbayev commented 3 years ago

Definitely not ready to be merged.

niansong1996 commented 3 years ago

@zhangir-azerbayev Have you commit the scatter plot?

Also, I think the model selector lacks comments and it's hard for me to understand what is happening, maybe it's was because of a rush-to-the-deadline (which is okay), but please make necessary changes to make the code more robust.

zhangir-azerbayev commented 3 years ago

Some functions lack type annotations because I would have to import model and data from within evaluation. How should this be done safely?

niansong1996 commented 3 years ago

You can do it with relative import, let me know when this is ready to be reviewed again

zhangir-azerbayev commented 3 years ago

This PR is mostly ready. There are two outstanding issues.

  1. I still can't get the import ..model syntax to work for type annotations. Python treats evaluation as a package and doesn't let you .. outside of the current package. I'm not sure if it's possible to work around this issue. If it isn't, it's not that big a deal because it doesn't affect functionality and the docstrings provide more detailed type information.
  2. Prettytable turned out not to be a suitable lightweight alternative to Pandas, because prettytable doesn't have a good API for retrieving information from a table. We could either keep Pandas or look for another solution.
niansong1996 commented 3 years ago

Responding to the issues:

  1. You should try to avoid relative import unless it's in the same package, see (this post)[https://stackoverflow.com/questions/14132789/relative-imports-for-the-billionth-time] for a detailed explanation of how to do relative import in the right way.
  2. Prettytable can provide table printing functionality, and that's where we want to use it (display the performance table across models and datasets in the command line); for the data structure, I assume you can use things like List[List[Any]]? I mean, using pandas here just for the data structure of a table seems like overkill to me.
zhangir-azerbayev commented 3 years ago

Created and integrated EvaluationTable class to replaced pandas dataframe; added unittests and verified they pass. Should be ready to merge.

zhangir-azerbayev commented 3 years ago

Addressed all outstanding comments.

niansong1996 commented 3 years ago

Please see if absolute import fixes the problems.

This PR has been around for a while now, let's try to merge it ASAP.

niansong1996 commented 3 years ago

@zhangir-azerbayev We have a CI system that does automatically style check and testing now, and it seems that this PR did not pass either. Please check the errors and fix it accordingly, thanks

zhangir-azerbayev commented 3 years ago

@troyfeng116 I can't seem to find the issue the CI test is failing at. It says there is a mixed use of tabs and spaces on line 36 of model_selector.py, but it looks fine to me.

zhangir-azerbayev commented 3 years ago

@troyfeng116 This is weird: the exact same commit looks different depending on which editor I use. Vim shows me this: Screenshot (13)

But the github editor shows this

Screenshot (14)

troyfeng116 commented 3 years ago

Yeah that's weird, try running black . from the root directory of SummerTime, it will rewrite any files with weird indent/space problems like that. We should probably log what to do if style checks fail, my bad

zhangir-azerbayev commented 3 years ago

Yeah that's weird, try running black . from the root directory of SummerTime, it will rewrite any files with weird indent/space problems like that. We should probably log what to do if style checks fail, my bad

I get this error message when I try to run black ..

error: cannot format /home/zhangir/lily/SummerTime/evaluation/model_selector.py: cannot use --safe with this file; failed to parse source file with Python 3.8's builtin AST. Re-run with --fast or stop using deprecated Python 2 syntax. AST error message: inconsistent use of tabs and spaces in indentation (<unknown>, line 36)
troyfeng116 commented 3 years ago

If there's a tab indent on line 36 try replacing with spaces, and use "indent with space" instead of "indent with tab" in your text editor

troyfeng116 commented 3 years ago

Or run black --fast . instead and double check the spacing looks ok

niansong1996 commented 3 years ago

@zhangir-azerbayev Have you fixed this yet? You should use spaces for indentation instead of tabs, as the display for tabs will be inconsistent over different text editors. As Troy said, if you want to use tabs, select "indent with space" in your text editor.

zhangir-azerbayev commented 3 years ago

@troyfeng116 Thank you for the help with black, I think I've sorted out the indentation issues by running black --fast. The tests are currently failing on the line from SummerTime.model.base_model import SummModel in the model selector unit test. The CI is complaining that the package SummerTime is not recognized. Is there a way to do this horizontal import in a way that passes the tests?

niansong1996 commented 3 years ago

For the import, try changing to from model.base_model import SummModel

zhangir-azerbayev commented 3 years ago

@troyfeng116 I'm close to being able to merge. The remaining problem is that there's an inconsistency between the Github CI and my local black. The CI says I should reform evaluation/error_viz.py and evaluation/plotutils/radar.py, but when I run black evaluation/error_viz.py evaluation/plotutils/radar.py no files are modified. Do you know what could be causing this?

troyfeng116 commented 3 years ago

@troyfeng116 I'm close to being able to merge. The remaining problem is that there's an inconsistency between the Github CI and my local black. The CI says I should reform evaluation/error_viz.py and evaluation/plotutils/radar.py, but when I run black evaluation/error_viz.py evaluation/plotutils/radar.py no files are modified. Do you know what could be causing this?

I pulled your branch and I also see two errors when I run black --check ., and running black on those two files worked for me. What does pip show black show for version? Lmk if you want me to push a commit with the reformatted files

zhangir-azerbayev commented 3 years ago

@troyfeng116 I'm close to being able to merge. The remaining problem is that there's an inconsistency between the Github CI and my local black. The CI says I should reform evaluation/error_viz.py and evaluation/plotutils/radar.py, but when I run black evaluation/error_viz.py evaluation/plotutils/radar.py no files are modified. Do you know what could be causing this?

I pulled your branch and I also see two errors when I run black --check ., and running black on those two files worked for me. What does pip show black show for version? Lmk if you want me to push a commit with the reformatted files

If you could push the commit, that would be great.

zhangir-azerbayev commented 3 years ago

@troyfeng116 Actually, I think I figured out what's going on. The version of black I installed from apt was 19.3b, whereas pip gave me 21.8b. The pip version reformatted the offending files; if the CI passes we should be ready to merge.

zhangir-azerbayev commented 3 years ago

@troyfeng116 The CI is currently giving me a very long error I'm not really sure how to parse. Do you understand what's going on?

troyfeng116 commented 3 years ago

@troyfeng116 The CI is currently giving me a very long error I'm not really sure how to parse. Do you understand what's going on?

Looks like Cython library not imported for some reason, not sure why that stopped working. @niansong1996 any ideas what caused this or how to fix this other than adding Cython to requirements.txt?

niansong1996 commented 3 years ago

I am trying to resolve these along with the pip install using the correct setup.py in #85

zhangir-azerbayev commented 3 years ago

@niansong1996 I think I am having the same error as in #85 where HMNet runs out of memory. Do you why this is happening in our branches but not in main?

zhangir-azerbayev commented 3 years ago

Passing CI, should be ready to merge.