huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Fine tuning Bloom model - Failed to import transformers.training_args #24631

Closed. seema-AIML closed this issue 1 year ago.

seema-AIML commented 1 year ago

System Info

falcon-7b-instruct(url)

Who can help?

No response

Information

Tasks

Reproduction

from transformers import pipeline

sequences = pipeline(
    "Write a poem about Valencia.",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")
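
(For reference: the snippet above calls pipeline directly and references a tokenizer that is never created. A complete version, roughly following the usual model-card pattern, could look like the sketch below; the tiiuae/falcon-7b-instruct checkpoint and a local torch install are assumptions, not something stated in this report.)

import torch
from transformers import AutoTokenizer, pipeline

model_id = "tiiuae/falcon-7b-instruct"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build the text-generation pipeline first, then call it with the prompt.
generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

sequences = generator(
    "Write a poem about Valencia.",
    max_length=200,
    do_sample=True,
    top_k=10,
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
)
for seq in sequences:
    print(f"Result: {seq['generated_text']}")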

Expected behavior

Hi,

While running Transformers models on my local machine, I'm facing this issue: Failed to import transformers.pipelines because of the following error (look up to see its traceback): module 'numpy' has no attribute 'object'. np.object was a deprecated alias for the builtin object. How can I fix this?
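
(For context, the failure can be reproduced with NumPy alone. This is a minimal illustration, assuming NumPy 1.24 or newer where the deprecated aliases were removed; it is not part of the original report.)

import numpy as np

# On NumPy >= 1.24 the deprecated alias np.object no longer exists, so any
# library that still references it fails with AttributeError at import time.
try:
    np.object
except AttributeError as err:
    print(err)

# The builtin `object` (or np.object_) is the supported replacement.
arr = np.empty(3, dtype=object)
print(arr.dtype)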

ydshieh commented 1 year ago

Hi @seema-AIML

Could you post the full trace log, please? Thank you in advance.

seema-AIML commented 1 year ago

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom-560m")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)

tokenized_datasets = dataset.map(tokenize_function, batched=True)

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bigscience/bloom-560m", num_labels=5)

from builtins import object
from transformers import TrainingArguments

training_args = TrainingArguments(output_dir="test_trainer")

While creating the TrainingArguments I get the error below:


AttributeError                            Traceback (most recent call last)

~\Anaconda3\lib\site-packages\transformers\utils\import_utils.py in _get_module(self, module_name)
---> 1126     return importlib.import_module("." + module_name, self.__name__)

~\Anaconda3\lib\importlib\__init__.py in import_module(name, package)
---> 127     return _bootstrap._gcd_import(name[level:], package, level)

~\Anaconda3\lib\importlib\_bootstrap.py in _gcd_import(name, package, level)
~\Anaconda3\lib\importlib\_bootstrap.py in _find_and_load(name, import_)
~\Anaconda3\lib\importlib\_bootstrap.py in _find_and_load_unlocked(name, import_)
~\Anaconda3\lib\importlib\_bootstrap.py in _load_unlocked(spec)
~\Anaconda3\lib\importlib\_bootstrap_external.py in exec_module(self, module)
~\Anaconda3\lib\importlib\_bootstrap.py in _call_with_frames_removed(f, *args, **kwds)

~\Anaconda3\lib\site-packages\transformers\training_args.py in <module>
---> 30     from .trainer_utils import (

~\Anaconda3\lib\site-packages\transformers\trainer_utils.py in <module>
---> 47     import tensorflow as tf

~\Anaconda3\lib\site-packages\tensorflow\__init__.py in <module>
---> 41     from tensorflow.python.tools import module_util as _module_util

~\Anaconda3\lib\site-packages\tensorflow\python\__init__.py in <module>
---> 46     from tensorflow.python import data

~\Anaconda3\lib\site-packages\tensorflow\python\data\__init__.py in <module>
---> 25     from tensorflow.python.data import experimental

~\Anaconda3\lib\site-packages\tensorflow\python\data\experimental\__init__.py in <module>
---> 97     from tensorflow.python.data.experimental import service

~\Anaconda3\lib\site-packages\tensorflow\python\data\experimental\service\__init__.py in <module>
---> 353     from tensorflow.python.data.experimental.ops.data_service_ops import distribute

~\Anaconda3\lib\site-packages\tensorflow\python\data\experimental\ops\data_service_ops.py in <module>
---> 26     from tensorflow.python.data.experimental.ops import compression_ops

~\Anaconda3\lib\site-packages\tensorflow\python\data\experimental\ops\compression_ops.py in <module>
---> 20     from tensorflow.python.data.util import structure

~\Anaconda3\lib\site-packages\tensorflow\python\data\util\structure.py in <module>
---> 26     from tensorflow.python.data.util import nest

~\Anaconda3\lib\site-packages\tensorflow\python\data\util\nest.py in <module>
---> 40     from tensorflow.python.framework import sparse_tensor as _sparse_tensor

~\Anaconda3\lib\site-packages\tensorflow\python\framework\sparse_tensor.py in <module>
---> 28     from tensorflow.python.framework import constant_op

~\Anaconda3\lib\site-packages\tensorflow\python\framework\constant_op.py in <module>
---> 29     from tensorflow.python.eager import execute

~\Anaconda3\lib\site-packages\tensorflow\python\eager\execute.py in <module>
---> 27     from tensorflow.python.framework import dtypes

~\Anaconda3\lib\site-packages\tensorflow\python\framework\dtypes.py in <module>
---> 585     np.object,

~\Anaconda3\lib\site-packages\numpy\__init__.py in __getattr__(attr)
---> 305     raise AttributeError(__former_attrs__[attr])

AttributeError: module 'numpy' has no attribute 'object'. `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe. The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)

in <module>
      1 from builtins import object
----> 2 from transformers import TrainingArguments
      3
      4 training_args = TrainingArguments(output_dir="test_trainer")

~\Anaconda3\lib\importlib\_bootstrap.py in _handle_fromlist(module, fromlist, import_, recursive)

~\Anaconda3\lib\site-packages\transformers\utils\import_utils.py in __getattr__(self, name)
---> 1116     module = self._get_module(self._class_to_module[name])

~\Anaconda3\lib\site-packages\transformers\utils\import_utils.py in _get_module(self, module_name)
---> 1128     raise RuntimeError(
    1129         f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
    1130         f" traceback):\n{e}"

RuntimeError: Failed to import transformers.training_args because of the following error (look up to see its traceback): module 'numpy' has no attribute 'object'. `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe. The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

How to fix this?
ydshieh commented 1 year ago

The error occurs in a tensorflow file:

~\Anaconda3\lib\site-packages\tensorflow\python\framework\dtypes.py in
584 types_pb2.DT_STRING:
--> 585 np.object,

If you don't need tensorflow, the quickest check is to uninstall tensorflow and see whether the issue is resolved. You can also try creating a new virtual environment and installing with pip install transformers[torch].
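
(As a quick sanity check, not from the thread: you can ask transformers which frameworks it detects in the current environment, without triggering the failing TensorFlow import.)

# Prints True if transformers will try to import TensorFlow / PyTorch lazily.
from transformers.utils import is_tf_available, is_torch_available

print("TensorFlow detected:", is_tf_available())
print("PyTorch detected:", is_torch_available())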

seema-AIML commented 1 year ago

I created a new virtual environment and installed transformers[torch]. I'm still getting the same error. I have not installed tensorflow in the new virtual environment; when I tried to uninstall tensorflow I got the warning: WARNING: Skipping tensorflow as it is not installed.

ydshieh commented 1 year ago

Please provide the new full error log (the one produced when running within the new environment).

seema-AIML commented 1 year ago

AttributeError                            Traceback (most recent call last)

~\Anaconda3\lib\site-packages\transformers\utils\import_utils.py in _get_module(self, module_name)
---> 1126     return importlib.import_module("." + module_name, self.__name__)

~\Anaconda3\lib\importlib\__init__.py in import_module(name, package)
---> 127     return _bootstrap._gcd_import(name[level:], package, level)

~\Anaconda3\lib\importlib\_bootstrap.py in _gcd_import(name, package, level)
~\Anaconda3\lib\importlib\_bootstrap.py in _find_and_load(name, import_)
~\Anaconda3\lib\importlib\_bootstrap.py in _find_and_load_unlocked(name, import_)
~\Anaconda3\lib\importlib\_bootstrap.py in _load_unlocked(spec)
~\Anaconda3\lib\importlib\_bootstrap_external.py in exec_module(self, module)
~\Anaconda3\lib\importlib\_bootstrap.py in _call_with_frames_removed(f, *args, **kwds)

~\Anaconda3\lib\site-packages\transformers\training_args.py in <module>
---> 30     from .trainer_utils import (

~\Anaconda3\lib\site-packages\transformers\trainer_utils.py in <module>
---> 47     import tensorflow as tf

~\Anaconda3\lib\site-packages\tensorflow\__init__.py in <module>
---> 41     from tensorflow.python.tools import module_util as _module_util

~\Anaconda3\lib\site-packages\tensorflow\python\__init__.py in <module>
---> 46     from tensorflow.python import data

~\Anaconda3\lib\site-packages\tensorflow\python\data\__init__.py in <module>
---> 25     from tensorflow.python.data import experimental

~\Anaconda3\lib\site-packages\tensorflow\python\data\experimental\__init__.py in <module>
---> 97     from tensorflow.python.data.experimental import service

~\Anaconda3\lib\site-packages\tensorflow\python\data\experimental\service\__init__.py in <module>
---> 353     from tensorflow.python.data.experimental.ops.data_service_ops import distribute

~\Anaconda3\lib\site-packages\tensorflow\python\data\experimental\ops\data_service_ops.py in <module>
---> 26     from tensorflow.python.data.experimental.ops import compression_ops

~\Anaconda3\lib\site-packages\tensorflow\python\data\experimental\ops\compression_ops.py in <module>
---> 20     from tensorflow.python.data.util import structure

~\Anaconda3\lib\site-packages\tensorflow\python\data\util\structure.py in <module>
---> 26     from tensorflow.python.data.util import nest

~\Anaconda3\lib\site-packages\tensorflow\python\data\util\nest.py in <module>
---> 40     from tensorflow.python.framework import sparse_tensor as _sparse_tensor

~\Anaconda3\lib\site-packages\tensorflow\python\framework\sparse_tensor.py in <module>
---> 28     from tensorflow.python.framework import constant_op

~\Anaconda3\lib\site-packages\tensorflow\python\framework\constant_op.py in <module>
---> 29     from tensorflow.python.eager import execute

~\Anaconda3\lib\site-packages\tensorflow\python\eager\execute.py in <module>
---> 27     from tensorflow.python.framework import dtypes

~\Anaconda3\lib\site-packages\tensorflow\python\framework\dtypes.py in <module>
---> 585     np.object,

~\Anaconda3\lib\site-packages\numpy\__init__.py in __getattr__(attr)
---> 305     raise AttributeError(__former_attrs__[attr])

AttributeError: module 'numpy' has no attribute 'object'. `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe. The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

The above exception was the direct cause of the following exception:

RuntimeError                              Traceback (most recent call last)

in <module>
----> 1 from transformers import TrainingArguments
      2
      3 training_args = TrainingArguments(output_dir="test_trainer")

~\Anaconda3\lib\importlib\_bootstrap.py in _handle_fromlist(module, fromlist, import_, recursive)

~\Anaconda3\lib\site-packages\transformers\utils\import_utils.py in __getattr__(self, name)
---> 1116     module = self._get_module(self._class_to_module[name])

~\Anaconda3\lib\site-packages\transformers\utils\import_utils.py in _get_module(self, module_name)
---> 1128     raise RuntimeError(
    1129         f"Failed to import {self.__name__}.{module_name} because of the following error (look up to see its"
    1130         f" traceback):\n{e}"

RuntimeError: Failed to import transformers.training_args because of the following error (look up to see its traceback): module 'numpy' has no attribute 'object'. `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe. The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations

It's the same error.
ydshieh commented 1 year ago

The error still shows tensorflow is in your environment.

Could you show us the results of transformers-cli env, pip show tensorflow, and pip show tensorflow-cpu?

seema-AIML commented 1 year ago

Result of transformers-cli env:

(hface) (base) D:>pip show tensorflow
WARNING: Package(s) not found: tensorflow

(hface) (base) D:>pip show tensorflow-cpu
WARNING: Package(s) not found: tensorflow-cpu

ydshieh commented 1 year ago

Hmm. The TF detection logic is in the following block.

https://github.com/huggingface/transformers/blob/cd4584e3c809bb9e1392ccd3fe38b40daba5519a/src/transformers/utils/import_utils.py#L144-L183

Your environment might still have one of the packages listed in

https://github.com/huggingface/transformers/blob/cd4584e3c809bb9e1392ccd3fe38b40daba5519a/src/transformers/utils/import_utils.py#L155-L165

You can check each of them and uninstall any that appear. Otherwise, much easier, you can set the environment variable USE_TF to 0, either with set USE_TF=0 (Windows cmd) or export USE_TF=0 (Linux/macOS).
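
If you are in a notebook, the same flag can also be set from Python, as long as it happens before the first transformers import; a minimal sketch:

import os

# Must run before `import transformers`; the framework detection is done
# once at import time, so setting the flag later has no effect.
os.environ["USE_TF"] = "0"

from transformers import TrainingArguments

training_args = TrainingArguments(output_dir="test_trainer")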

seema-AIML commented 1 year ago

I have set USE_TF = 0

%env USE_TF=0

from transformers import AutoTokenizer, BartForConditionalGeneration, Trainer, TrainingArguments

model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=3,              # total number of training epochs
    per_device_train_batch_size=16,  # batch size per device during training
    per_device_eval_batch_size=64,   # batch size for evaluation
    warmup_steps=500,                # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=10,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
)
trainer.train()

Still the same error:

AttributeError                            Traceback (most recent call last)

in <module>
     17
     18
---> 19 trainer = Trainer(
     20     model=model,            # the instantiated Transformers model to be trained
     21     args=training_args,     # training arguments, defined above

~\Anaconda3\lib\site-packages\transformers\trainer.py in __init__(self, model, args, data_collator, train_dataset, eval_dataset, tokenizer, model_init, compute_metrics, callbacks, optimizers, preprocess_logits_for_metrics)
---> 519     self.callback_handler = CallbackHandler(
    520         callbacks, self.model, self.tokenizer, self.optimizer, self.lr_scheduler
    521     )

~\Anaconda3\lib\site-packages\transformers\trainer_callback.py in __init__(self, callbacks, model, tokenizer, optimizer, lr_scheduler)
---> 296     self.add_callback(cb)

~\Anaconda3\lib\site-packages\transformers\trainer_callback.py in add_callback(self, callback)
---> 313     cb = callback() if isinstance(callback, type) else callback

~\Anaconda3\lib\site-packages\transformers\integrations.py in __init__(self)
---> 928     import mlflow

~\Anaconda3\lib\site-packages\mlflow\__init__.py in <module>
---> 50     import mlflow.catboost as catboost  # noqa: E402

~\Anaconda3\lib\site-packages\mlflow\catboost.py in <module>
---> 24     from mlflow import pyfunc

~\Anaconda3\lib\site-packages\mlflow\pyfunc\__init__.py in <module>
---> 219     import mlflow.pyfunc.model

~\Anaconda3\lib\site-packages\mlflow\pyfunc\model.py in <module>
---> 17     from mlflow.models import Model

~\Anaconda3\lib\site-packages\mlflow\models\__init__.py in <module>
---> 26     from .signature import ModelSignature, infer_signature

~\Anaconda3\lib\site-packages\mlflow\models\signature.py in <module>
---> 12     from mlflow.types.schema import Schema

~\Anaconda3\lib\site-packages\mlflow\types\__init__.py in <module>
----> 6     from .schema import DataType, ColSpec, Schema, TensorSpec

~\Anaconda3\lib\site-packages\mlflow\types\schema.py in <module>
---> 20     class DataType(Enum):

~\Anaconda3\lib\site-packages\mlflow\types\schema.py in DataType()
     47     string = (6, np.dtype("str"), "StringType", _pandas_string_type())
     48     """Text data."""
---> 49     binary = (7, np.dtype("bytes"), "BinaryType", np.object)
     50     """Sequence of raw bytes."""
     51     datetime = (8, np.dtype("datetime64"), "TimestampType")

~\Anaconda3\lib\site-packages\numpy\__init__.py in __getattr__(attr)
---> 305     raise AttributeError(__former_attrs__[attr])

AttributeError: module 'numpy' has no attribute 'object'. `np.object` was a deprecated alias for the builtin `object`. To avoid this error in existing code, use `object` by itself. Doing this will not modify any behavior and is safe. The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ydshieh commented 1 year ago

Try setting report_to="none" in training_args = TrainingArguments(...). Your environment has mlflow installed, which might use some deprecated numpy code. Alternatively, you can upgrade your mlflow version.
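
For example, a minimal sketch reusing the arguments from above:

from transformers import TrainingArguments

# report_to="none" disables all reporting integrations (mlflow, wandb, ...),
# so Trainer never tries to import the broken mlflow installation.
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    report_to="none",
)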

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.