linkedin / detext

DeText: A Deep Neural Text Understanding Framework for Ranking and Classification Tasks
BSD 2-Clause "Simplified" License
1.26k stars 133 forks source link

A quick fix to remove non-cnn filter logic in misc_utils.py which bre… #28

Closed zhoutong-fu closed 4 years ago

zhoutong-fu commented 4 years ago

Description

Fixed a bug introduced in PR #25 related to the CNN filter window size. Without this fix, the argument parsing will fail for non-CNN models.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

List all changes

Please list all changes in the commit.

Testing

flake8 for styles pytest for unit test sh run_detext.sh sh run_detext_multitask.sh

Test Configuration:

Checklist

StarWang commented 4 years ago

LGTM. Thanks for the fixes!

jakiejj commented 4 years ago

I wonder what the "bug" was introduced by pr 25? The removed code is not part of pr 25, and diff is not a choice according to the original add_argument help "inner/hadamard/concat or any combination of them separated by comma.")

zhoutong-fu commented 4 years ago

I wonder what the "bug" was introduced by pr 25? The removed code is not part of pr 25, and diff is not a choice according to the original add_argument help "inner/hadamard/concat or any combination of them separated by comma.") @jakiejj it's the lines I removed in misc_utils.py as they overrode the window size to '0' for non-CNN models and convert it back to str. Previously the str to int conversion was down in the same place.

jakiejj commented 4 years ago

it's the lines I removed in misc_utils.py as they overrode the window size to '0' for non-CNN models and convert it back to str. Previously the str to int conversion was down in the same place.

Thanks! in that case, shouldn't the code be changed to

    if hparams.ftr_ext != 'cnn':
        hparams.filter_window_sizes = [0]

Also, can also please help change the default values of num_hidden and filter_window_sizes to lists while you are at it? Thanks!

There really should be some tests cover those things.

zhoutong-fu commented 4 years ago

in that case, shouldn't the code be changed to

    if hparams.ftr_ext != 'cnn':
        hparams.filter_window_sizes = [0]

Also, can also please help change the default values of num_hidden and filter_window_sizes to lists while you are at it? Thanks!

I removed those two lines because in the train.py file it has been overridden as

cnn_filter_window_size=max(hparams.filter_window_sizes) if hparams.ftr_ext == 'cnn' else 0

you can find more detail here

For num_hidden and filter_window_sizes default values I saw you already set them as lists (code)

jakiejj commented 4 years ago

in that case, shouldn't the code be changed to

    if hparams.ftr_ext != 'cnn':
        hparams.filter_window_sizes = [0]

Also, can also please help change the default values of num_hidden and filter_window_sizes to lists while you are at it? Thanks!

I removed those two lines because in the train.py file it has been overridden as

cnn_filter_window_size=max(hparams.filter_window_sizes) if hparams.ftr_ext == 'cnn' else 0

you can find more detail here

For num_hidden and filter_window_sizes default values I saw you already set them as lists (code)

I meant num_hidden: List[int] = [0] etc.

the logic for filter_window_size processing is scattered to too many places. This is error-prone and they should be all moved to close to the parser.