huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
135.19k stars 27.06k forks source link

Mutable default value for `model_kwargs` in `pipeline` function #14292

Closed willfrey closed 3 years ago

willfrey commented 3 years ago

https://github.com/huggingface/transformers/blob/a14d62b0b11e6cb0c64059606e4f8dbf78e40e41/src/transformers/pipelines/__init__.py#L311

Is this intentional? I noticed the default is an empty dict but inspecting the signature in an interactive session showed that the dictionary was mutated:

Signature:
trf.pipeline(
    task: str,
    model: Optional = None,
    config: Union[str, transformers.configuration_utils.PretrainedConfig, NoneType] = None,
    tokenizer: Union[str, transformers.tokenization_utils.PreTrainedTokenizer, NoneType] = None,
    feature_extractor: Union[str, ForwardRef('SequenceFeatureExtractor'), NoneType] = None,
    framework: Optional[str] = None,
    revision: Optional[str] = None,
    use_fast: bool = True,
    use_auth_token: Union[bool, str, NoneType] = None,
    model_kwargs: Dict[str, Any] = {'use_auth_token': None},
    **kwargs,
) -> transformers.pipelines.base.Pipeline

This can cause some headaches because the default value references the same dict object all the time.

LysandreJik commented 3 years ago

cc @Narsil

Narsil commented 3 years ago

Definitely not intentional, it's pretty bad as you mention. Opened a PR for this.

Narsil commented 3 years ago

@LysandreJik pylint is able to detect those and many are found throughout the library.

Is it something we want to start checking automatically and enforcing not to have, they are more likely to hurt than to help. Didn't put everything here

src/transformers/debug_utils.py:136:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/tapas/configuration_tapas.py:145:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/beit/configuration_beit.py:111:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/beit/configuration_beit.py:111:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
^[:src/transformers/models/camembert/tokenization_camembert.py:113:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/camembert/tokenization_camembert_fast.py:106:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/gpt_neo/configuration_gpt_neo.py:100:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/fsmt/configuration_fsmt.py:130:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/reformer/configuration_reformer.py:163:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/reformer/configuration_reformer.py:163:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/reformer/configuration_reformer.py:163:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/reformer/tokenization_reformer.py:92:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/reformer/tokenization_reformer_fast.py:88:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/segformer/configuration_segformer.py:100:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/segformer/configuration_segformer.py:100:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/segformer/configuration_segformer.py:100:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/segformer/configuration_segformer.py:100:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/segformer/configuration_segformer.py:100:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/segformer/configuration_segformer.py:100:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/segformer/configuration_segformer.py:100:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/segformer/configuration_segformer.py:100:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/marian/convert_marian_to_pytorch.py:206:0: W0102: Dangerous default value {} as argument (dangerous-default-value)
src/transformers/models/funnel/configuration_funnel.py:110:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutlmv2/tokenization_layoutlmv2.py:165:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutlmv2/tokenization_layoutlmv2.py:165:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutlmv2/tokenization_layoutlmv2.py:165:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutlmv2/tokenization_layoutlmv2_fast.py:114:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutlmv2/tokenization_layoutlmv2_fast.py:114:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutlmv2/tokenization_layoutlmv2_fast.py:114:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutlmv2/configuration_layoutlmv2.py:120:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/transfo_xl/configuration_transfo_xl.py:116:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/transfo_xl/tokenization_transfo_xl.py:156:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutxlm/tokenization_layoutxlm.py:126:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutxlm/tokenization_layoutxlm.py:126:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutxlm/tokenization_layoutxlm.py:126:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutxlm/tokenization_layoutxlm_fast.py:115:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutxlm/tokenization_layoutxlm_fast.py:115:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/layoutxlm/tokenization_layoutxlm_fast.py:115:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/visual_bert/convert_visual_bert_original_pytorch_checkpoint_to_pytorch.py:64:0: W0102: Dangerous default value rename_keys_prefix (builtins.list) as argument (dangerous-default-value)
src/transformers/models/xlnet/tokenization_xlnet.py:127:4: W0102: Dangerous default value [] as argument (dangerous-default-value)
src/transformers/models/xlnet/tokenization_xlnet_fast.py:125:4: W0102: Dangerous default value [] as argument (dangerous-default-value)