Open dannikay opened 2 weeks ago
cc @ydshieh
I can no longer reproduce this after restarting my notebook kernel. I suppose the PyTorch detection relies on some system state that requires a notebook kernel restart to pick up. Feel free to close this.
It seems that _torch_available is a global variable set at initialization time: https://github.com/huggingface/transformers/blob/02300273e220932a449a47ebbe453e7789be454b/src/transformers/utils/import_utils.py#L180C1-L180C17
So it won't be re-evaluated in the same notebook kernel (until I restart the kernel). One improvement I can think of is to re-evaluate this variable in is_torch_available(), but I don't know whether that would break other things.
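The pattern described above can be sketched as follows (names and details are illustrative, not the exact transformers code):

```python
import importlib.util

# Illustrative sketch: the flag is computed once, when the module is
# first imported...
_torch_available = importlib.util.find_spec("torch") is not None

def is_torch_available():
    # ...so a "pip install torch" later in the same kernel session is
    # never seen until the interpreter (kernel) restarts.
    return _torch_available
```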
So it won't be re-evaluated in the same notebook kernel (until I restart my kernel).
Does this mean that torch is installed in the notebook after importing transformers?
In general, for notebooks (I usually use Google Colab), if something is installed mid-session, it is recommended to restart the kernel.
reevaluate this variable in is_torch_available()
We make use of _torch_available (and the same idea for many other is_xxx_available checks) because we try to avoid re-evaluation, which could slow things down a lot.
I am open to a PR that improves this (the mentioned issue) while keeping things fast (regarding what I mentioned above).
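One way to keep the single-evaluation speed while still allowing a refresh would be a clearable cache. This is only a sketch of that idea; functools.lru_cache is my assumption here, not what transformers actually uses:

```python
import functools
import importlib.util

@functools.lru_cache(maxsize=None)
def is_torch_available():
    # The body runs only on the first call; subsequent calls hit the
    # cache, so the check stays as cheap as reading a global flag.
    return importlib.util.find_spec("torch") is not None

# After an in-session "!pip install torch", the cached result could be
# discarded explicitly instead of restarting the kernel:
#   importlib.invalidate_caches()   # let path finders see new packages
#   is_torch_available.cache_clear()
```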
@amyeroberts torch is installed in the notebook kernel ("!pip install ...") and I re-executed the block to import transformers multiple times (to no avail).
torch is installed in the notebook kernel ("!pip install ...") and I re-executed the block to import transformers multiple times (to no avail).
@dannikay Just to be clear, this means that transformers was already imported, torch installed, then the cell to import transformers re-executed?
As @ydshieh notes above, we have a cache, which means is_torch_available is evaluated once and its result stored within a Python session. This helps speed things up within the transformers library - we have lots of is_xxx_available flags which enable us to safely guard different framework and modality usage, e.g. PyTorch vs TensorFlow.
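The guard pattern those flags enable looks roughly like this. The sketch below uses a local availability check rather than the actual transformers helpers, and the function names are illustrative:

```python
import importlib.util

def is_tf_available():
    # Illustrative stand-in for transformers' is_xxx_available flags.
    return importlib.util.find_spec("tensorflow") is not None

def load_backend():
    # Optional-backend code paths are guarded by the flag, so calling
    # this never raises ImportError when the framework is missing.
    if is_tf_available():
        import tensorflow as tf  # only imported when actually present
        return "tensorflow"
    return "no tensorflow"
```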
If you or anyone else wants to submit a PR which would make this more dynamic whilst maintaining speed, we'd be very happy to review!
@amyeroberts correct.
System Info
Ubuntu 22.04 Python 3.12.3
Who can help?
@Narsil @zucchini-nlp
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Expected behavior
I do not expect the above error since PyTorch is installed on my system.