QData / TextAttack

TextAttack 🐙 is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
MIT License
2.98k stars, 397 forks

Run demo error #796

Closed: nextdoorUncleLiu closed this issue 4 months ago

nextdoorUncleLiu commented 4 months ago

Describe the bug: I followed the demo at https://textattack.readthedocs.io/en/latest/0_get_started/command_line_usage.html, but an error appeared at the very first step.

To Reproduce Steps to reproduce the behavior:

  1. add examples.csv
  2. Run command line:
    textattack-test % textattack augment --input-csv examples.csv --output-csv output.csv  --input-column text --recipe eda --pct-words-to-swap .1 \
    --transformations-per-example 2 --exclude-original
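For anyone reproducing this, a minimal input file can be generated with the short sketch below. It only assumes what the command above implies, namely a CSV with a `text` column; the sample sentences and the helper name `write_examples` are made up for illustration and are not taken from the TextAttack docs.

```python
import csv
from pathlib import Path


def write_examples(path):
    """Write a tiny CSV with a single 'text' column (hypothetical sample rows)."""
    rows = [
        {"text": "The movie was surprisingly good."},
        {"text": "I would not recommend this restaurant."},
    ]
    with Path(path).open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["text"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Creates examples.csv in the current working directory.
rows_written = write_examples("examples.csv")
```

The column name matches `--input-column text` in the command; any other column name would need the flag changed accordingly.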

Expected behavior: The command runs normally, as in the example on the official website.

Screenshots or Traceback: [image]

System Information (please complete the following information):

bterrific2008 commented 4 months ago

Looks like the issue is in the EDA augmentation recipe. The EDA constructor doesn't accept arbitrary keyword arguments, so a TypeError is raised when high_yield is passed to that recipe.
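To illustrate the failure mode, here is a minimal sketch that does not use TextAttack at all. `StrictRecipe` and `build` are hypothetical stand-ins: the first mimics a constructor with a fixed set of parameters, the second mimics a caller that forwards every option it has, including `high_yield`.

```python
class StrictRecipe:
    """Hypothetical stand-in for a recipe whose constructor has no **kwargs."""

    def __init__(self, pct_words_to_swap=0.1, transformations_per_example=4):
        self.pct_words_to_swap = pct_words_to_swap
        self.transformations_per_example = transformations_per_example


def build(recipe_cls, **options):
    """Mimic a caller that forwards every option to the recipe constructor."""
    return recipe_cls(**options)


try:
    build(
        StrictRecipe,
        pct_words_to_swap=0.1,
        transformations_per_example=2,
        high_yield=False,  # not a parameter of StrictRecipe.__init__
    )
    error_message = None
except TypeError as exc:
    # Python rejects the unknown keyword before the constructor body runs.
    error_message = str(exc)

print(error_message)
```

The exact wording of the message varies between Python versions, but it names the unexpected keyword.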

I think the fix would be passing the keywords to the EDA recipe's composite augmenters. Something like this:

def __init__(self, pct_words_to_swap=0.1, transformations_per_example=4, **kwargs):
    assert 0.0 <= pct_words_to_swap <= 1.0, "pct_words_to_swap must be in [0., 1.]"
    assert (
        transformations_per_example > 0
    ), "transformations_per_example must be a positive integer"
    self.pct_words_to_swap = pct_words_to_swap
    self.transformations_per_example = transformations_per_example
    n_aug_each = max(transformations_per_example // 4, 1)

    self.synonym_replacement = WordNetAugmenter(
        pct_words_to_swap=pct_words_to_swap,
        transformations_per_example=n_aug_each,
        **kwargs,
    )
    self.random_deletion = DeletionAugmenter(
        pct_words_to_swap=pct_words_to_swap,
        transformations_per_example=n_aug_each,
        **kwargs,
    )
    self.random_swap = SwapAugmenter(
        pct_words_to_swap=pct_words_to_swap,
        transformations_per_example=n_aug_each,
        **kwargs,
    )
    self.random_insertion = SynonymInsertionAugmenter(
        pct_words_to_swap=pct_words_to_swap,
        transformations_per_example=n_aug_each,
        **kwargs,
    )

This should resolve the issue. I'll test it out later this week and open a PR if all looks good.
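The forwarding pattern can be tried in isolation with the sketch below. `BaseAugmenter` and `CompositeRecipe` are hypothetical stand-ins, not TextAttack classes, and the assumption that the sub-augmenters accept `high_yield` is part of the sketch rather than something verified against the library here.

```python
class BaseAugmenter:
    """Hypothetical stand-in for a sub-augmenter that understands high_yield."""

    def __init__(self, pct_words_to_swap=0.1, transformations_per_example=1,
                 high_yield=False):
        self.pct_words_to_swap = pct_words_to_swap
        self.transformations_per_example = transformations_per_example
        self.high_yield = high_yield


class CompositeRecipe:
    """Hypothetical composite that forwards extra keywords to its parts."""

    def __init__(self, pct_words_to_swap=0.1, transformations_per_example=4,
                 **kwargs):
        # Split the per-example budget across the four parts, at least 1 each.
        n_aug_each = max(transformations_per_example // 4, 1)
        self.parts = [
            BaseAugmenter(
                pct_words_to_swap=pct_words_to_swap,
                transformations_per_example=n_aug_each,
                **kwargs,  # e.g. high_yield is passed through untouched
            )
            for _ in range(4)
        ]


# The extra keyword no longer raises, and each part receives it.
recipe = CompositeRecipe(
    pct_words_to_swap=0.1, transformations_per_example=2, high_yield=True
)
```

One trade-off of this design: a keyword that no sub-augmenter recognizes still raises a TypeError, only one level deeper, which is arguably the desired behavior for typos.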

aemartinez commented 4 months ago

Same issue here. Replacing --recipe eda with --recipe embedding ran successfully.

OS: macOS 14.5 python: 3.11.9 Textattack Version: 0.3.10

nextdoorUncleLiu commented 4 months ago

Same issue here. Replacing --recipe eda with --recipe embedding ran successfully.

OS: macOS 14.5 python: 3.11.9 Textattack Version: 0.3.10

Great, how did you discover and solve this problem?

jxmorris12 commented 4 months ago

This should be fixed, thanks to @bterrific2008. You'll need to pull and install from source (the fix is not on PyPI yet).