huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0
135.72k stars 27.17k forks

Can't create transformer pipeline because pytorch failed to be detected #31454

Closed dannikay closed 4 months ago

dannikay commented 5 months ago

System Info

Ubuntu 22.04 Python 3.12.3

Who can help?

@Narsil @zucchini-nlp

Information

Tasks

Reproduction

import transformers
from transformers import is_torch_available
import torch
print(torch.__version__)
print(is_torch_available())

# Define the task that we want to use (required for proper pipeline construction)
task = "text2text-generation"

# Define the pipeline, using the task and a model instance that is applicable for our task.
generation_pipeline = transformers.pipeline(
    task=task,
    model="declare-lab/flan-alpaca-large",
)

# Define a simple input example that will be recorded with the model in MLflow, giving
# users of the model an indication of the expected input format.
input_example = ["prompt 1", "prompt 2", "prompt 3"]

# Define the parameters (and their defaults) for optional overrides at inference time.
parameters = {"max_length": 512, "do_sample": True, "temperature": 0.4}

Output:

2.3.1+cu121
False

...

RuntimeError: At least one of TensorFlow 2.0 or PyTorch should be installed. To install TensorFlow 2.0, read the instructions at https://www.tensorflow.org/install/ To install PyTorch, read the instructions at https://pytorch.org/.

Expected behavior

I do not expect the above error since pytorch has been installed in my system.

amyeroberts commented 5 months ago

cc @ydshieh

dannikay commented 5 months ago

I can no longer reproduce this after restarting my notebook kernel. I suppose the PyTorch detection relies on some environment state that requires a notebook kernel restart to pick up. Feel free to close this.

dannikay commented 5 months ago

It seems that _torch_available is a global variable and set at initialization time: https://github.com/huggingface/transformers/blob/02300273e220932a449a47ebbe453e7789be454b/src/transformers/utils/import_utils.py#L180C1-L180C17

So it won't be re-evaluated in the same notebook kernel once it has been set (until I restart my kernel). One improvement I can think of is to re-evaluate this variable in is_torch_available(), but I don't know whether that breaks other things.
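The pattern described above can be sketched as a module-level flag computed once at import time (a simplified illustration of the idea, not the actual transformers source):

```python
import importlib.util

# Evaluated exactly once, when this module is first imported. A
# `pip install torch` later in the same session does not change it.
_torch_available = importlib.util.find_spec("torch") is not None


def is_torch_available() -> bool:
    # Returns the cached module-level value; the environment is never
    # probed again, which is why a kernel restart is needed.
    return _torch_available
```

This is why re-running the `import transformers` cell has no effect: the module is already in `sys.modules`, so its top-level code (and the flag assignment) does not run again.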

amyeroberts commented 5 months ago

So it won't be re-evaluated in the same notebook kernel once it has been set (until I restart my kernel).

Does this mean that torch is installed in the notebook after importing transformers?

ydshieh commented 5 months ago

In general, for notebooks (I usually use Google Colab), if there are in-session installations, it is recommended to restart the kernel.

reevaluate this variable in is_torch_available()

We make use of _torch_available (and similar ideas for many other is_xxx_available checks) because we try to avoid re-evaluation, which might slow things down a lot.

I am open to a PR that improves this (the mentioned issue) but keeps things fast (regarding what I mentioned above).

dannikay commented 5 months ago

@amyeroberts torch is installed in the notebook kernel ("!pip install ...") and I re-executed the block to import transformers multiple times, to no avail.

amyeroberts commented 5 months ago

torch is installed in the notebook kernel ("!pip install ...") and I re-executed the block to import transformers multiple times, to no avail.

@dannikay Just to be clear, this means that transformers was already imported, torch installed, then the cell to import transformers re-executed?

As @ydshieh notes above, we have a cache which means is_torch_available is executed once and its result stored within a python session. This helps speed things up within the transformers library - we have lots of is_xxx_available flags which enable us to safely guard for different framework and modality usage e.g. PyTorch vs TensorFlow.

If you or anyone else wants to submit a PR which would make this more dynamic whilst maintaining speed, we'd be very happy to review!
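One possible shape for such a PR (a hypothetical sketch, not the transformers API): cache the check with functools.lru_cache so repeated calls stay cheap, but expose a way to invalidate the cache after an in-session install. The function names here are assumptions for illustration only.

```python
import functools
import importlib.util


@functools.lru_cache(maxsize=None)
def is_torch_available() -> bool:
    # The first call probes the environment; every later call is a
    # dictionary lookup in the lru_cache, so the hot path stays fast.
    return importlib.util.find_spec("torch") is not None


def refresh_availability() -> None:
    # Hypothetical helper: call after `pip install torch` in the same
    # session to force re-detection on the next is_torch_available() call.
    importlib.invalidate_caches()  # pick up packages installed after startup
    is_torch_available.cache_clear()
```

The trade-off is the same one ydshieh raises above: the cached path remains fast, and the cost of re-probing is only paid when the user explicitly asks for it.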

dannikay commented 5 months ago

@amyeroberts correct.

github-actions[bot] commented 4 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.