pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch
https://pytorch.org/text
BSD 3-Clause "New" or "Revised" License

`pip install torchtext` is broken after torch 2.4.0 #2272

Open sagelywizard opened 2 weeks ago

sagelywizard commented 2 weeks ago

🐛 Bug

torchtext is in maintenance mode, but there is a problem with its current dependencies that I think warrants an update and a minor version bump: by default, `pip install torchtext` produces a broken installation.

Summary

The problem is that installing the most recent version of torchtext pulls in the latest version of torch (2.4.0), which it is incompatible with. It looks like a shared object file references a symbol that was removed. Using torchtext 0.18.0 with torch 2.4.0 causes `import torchtext` to fail with the following error:

OSError: /usr/local/lib/python3.10/dist-packages/torchtext/lib/libtorchtext.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKSs
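For readers unfamiliar with mangled names: the `_Z` prefix marks an Itanium C++ ABI symbol, and the trailing `RKSs` encodes a `const std::string&` parameter. The snippet below is a minimal sketch that decodes only the length-prefixed nested-name portion of such a symbol; `nested_name` is a hypothetical helper written for illustration, not a full demangler (a tool such as `c++filt` handles the general case).

```python
def nested_name(mangled: str) -> str:
    """Decode only the nested-name part (_ZN ... E) of an Itanium-mangled symbol.

    Illustrative sketch: ignores parameter types, templates and substitutions.
    """
    prefix = "_ZN"
    if not mangled.startswith(prefix):
        raise ValueError("not a nested Itanium-mangled name")
    i = len(prefix)
    parts = []
    # Each component is encoded as <decimal length><identifier>.
    while i < len(mangled) and mangled[i].isdigit():
        j = i
        while j < len(mangled) and mangled[j].isdigit():
            j += 1
        length = int(mangled[i:j])
        parts.append(mangled[j:j + length])
        i = j + length
    return "::".join(parts)


print(nested_name("_ZN5torch3jit17parseSchemaOrNameERKSs"))
# prints: torch::jit::parseSchemaOrName
```

So `libtorchtext.so` expects `torch::jit::parseSchemaOrName(const std::string&)` to be exported by the installed torch libraries, and the loader cannot find it.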

You can repro in a Colab notebook by:

  1. Uninstalling torch and torchtext: `!pip uninstall -y torch torchtext`
  2. Installing torchtext: `!pip install torchtext`
  3. Importing torchtext: `import torchtext`
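After step 2, the outcome of the repro can be captured in one cell with something like the sketch below. It uses only the standard library; `installed_version` and `try_import` are hypothetical helper names introduced here, and the script only reports what it finds rather than fixing anything.

```python
import importlib
from importlib import metadata


def installed_version(package: str):
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def try_import(module: str):
    """Attempt an import; return (ok, error_message)."""
    try:
        importlib.import_module(module)
        return True, ""
    except (ImportError, OSError) as exc:
        # The failure reported in this issue surfaces as an OSError
        # raised while loading libtorchtext.so.
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    for pkg in ("torch", "torchtext"):
        print(pkg, installed_version(pkg))
    ok, err = try_import("torchtext")
    print("import torchtext:", "ok" if ok else err)
```

Pasting the printed versions alongside the error makes it easier to confirm that a report matches the torchtext 0.18.0 / torch 2.4.0 combination described here.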

Solution

AFAICT, there are two potential solutions:

  1. Update torchtext to be compatible with torch 2.4.0. I suspect this isn't on the table, since torchtext is in maintenance mode.
  2. Update torchtext to specify that it requires torch<2.4.0.

Either solution would require a minor version bump and a new release on PyPI.
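To make option 2 concrete: the constraint `torch<2.4.0` amounts to the version check sketched below. This is an illustration in plain Python, not torchtext's actual packaging code; `torch_is_supported` and the `UPPER_BOUND` constant are hypothetical names, and the parsing is deliberately simplistic (it compares only the leading numeric components, so a local build tag such as `+cu121` is ignored and a pre-release of 2.4.0 is treated as 2.4.0).

```python
import re

# Assumption taken from option 2 above: torch 2.4.0 and later are excluded.
UPPER_BOUND = (2, 4, 0)


def parse_release(version: str):
    """Extract (major, minor, patch) from strings like '2.3.1' or '2.4.0+cu121'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def torch_is_supported(version: str) -> bool:
    """True if the given torch version satisfies torch<2.4.0."""
    return parse_release(version) < UPPER_BOUND


print(torch_is_supported("2.3.1"))        # True
print(torch_is_supported("2.4.0+cu121"))  # False
```

In the package metadata itself this would presumably be a single requirement string (`torch<2.4.0`). Until a release carrying such a pin exists, stating the same constraint explicitly at install time, for example `pip install "torch<2.4.0" torchtext`, would be the user-side equivalent.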

atalman commented 2 weeks ago

cc @kartikayk