Importing Batch TorchText.Legacy versus Torchtext Failures

Seretsi commented 1 month ago

🐛 Describe the bug

!pip install torch==2.0.0+cu117 torchvision==0.15.1+cu117 torchaudio==2.0.1+cu117 --index-url https://download.pytorch.org/whl/cu117
!pip uninstall -y torchtext
!pip install torchtext==0.15.1
!pip install pytorch-lightning==1.5.0

import torchtext as tt
print(tt.__version__)
import pytorch_lightning as pl
print(pl.__version__)
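
Because `import pytorch_lightning` itself fails in this environment, the installed versions can also be read from package metadata without importing the packages. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is illustrative and not part of torchtext or pytorch_lightning.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    result = {}
    for name in names:
        try:
            # Reads the installed distribution's metadata; does not import the package.
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


print(installed_versions(["torch", "torchtext", "pytorch-lightning"]))
```

This reports what pip installed even when one of the packages cannot be imported.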

I am getting the error below when I try to run code that needs pytorch_lightning. pytorch_lightning 1.5.0 should not be using torchtext.legacy.data, but it keeps trying to import it and fails. I found this logic in pytorch_lightning's apply_func.py:

if _TORCHTEXT_AVAILABLE:
    if _compare_version("torchtext", operator.ge, "0.9.0"):
        from torchtext.legacy.data import Batch
    else:
        from torchtext.data import Batch
else:
    Batch = type(None)

This doesn't seem right if my understanding is correct. Isn't legacy only available for versions 0.9.0 and below?
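
For reference, and to the best of my knowledge, `torchtext.legacy` shipped in torchtext 0.9.0 through 0.11.x and was removed in 0.12.0, so a check with only a `>= 0.9.0` lower bound would still pick the legacy path on 0.15.1. The sketch below encodes that assumed version window; the helper names `parse_release` and `batch_module_for` are hypothetical, and the boundaries should be verified against the torchtext release notes.

```python
def parse_release(version_string):
    """Turn '0.15.1' or '2.0.0+cu117' into a tuple of ints, ignoring local and pre-release tags."""
    public = version_string.split("+", 1)[0]
    parts = []
    for piece in public.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def batch_module_for(torchtext_version):
    """Return the module assumed to provide Batch, or None if no such module exists."""
    release = parse_release(torchtext_version)
    if release < (0, 9):
        return "torchtext.data"         # before the legacy namespace was introduced
    if release < (0, 12):
        return "torchtext.legacy.data"  # assumed legacy window: 0.9.x to 0.11.x
    return None                         # assumption: legacy removed from 0.12.0 onward


print(batch_module_for("0.15.1"))
```

Under these assumptions, torchtext 0.15.1 has no importable `Batch`, which matches the `ModuleNotFoundError` reported here.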

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-12-f83604beb67d> in <cell line: 2>()
      1 # !pip install torchtext==0.12.0
----> 2 import pytorch_lightning as pl
      3 print(pl.__version__)

4 frames
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/__init__.py in <module>
     18 _PROJECT_ROOT = os.path.dirname(_PACKAGE_ROOT)
     19 
---> 20 from pytorch_lightning.callbacks import Callback  # noqa: E402
     21 from pytorch_lightning.core import LightningDataModule, LightningModule  # noqa: E402
     22 from pytorch_lightning.trainer import Trainer  # noqa: E402

/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/__init__.py in <module>
     12 # See the License for the specific language governing permissions and
     13 # limitations under the License.
---> 14 from pytorch_lightning.callbacks.base import Callback
     15 from pytorch_lightning.callbacks.device_stats_monitor import DeviceStatsMonitor
     16 from pytorch_lightning.callbacks.early_stopping import EarlyStopping

/usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/base.py in <module>
     24 
     25 import pytorch_lightning as pl
---> 26 from pytorch_lightning.utilities.types import STEP_OUTPUT
     27 
     28 

/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/__init__.py in <module>
     16 import numpy
     17 
---> 18 from pytorch_lightning.utilities.apply_func import move_data_to_device  # noqa: F401
     19 from pytorch_lightning.utilities.distributed import AllGatherGrad, rank_zero_info, rank_zero_only  # noqa: F401
     20 from pytorch_lightning.utilities.enums import (  # noqa: F401

/usr/local/lib/python3.10/dist-packages/pytorch_lightning/utilities/apply_func.py in <module>
     28 if _TORCHTEXT_AVAILABLE:
     29     if _compare_version("torchtext", operator.ge, "0.9.0"):
---> 30         from torchtext.legacy.data import Batch
     31     else:
     32         from torchtext.data import Batch

ModuleNotFoundError: No module named 'torchtext.legacy'
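
The traceback shows the import failing because the version check selects `torchtext.legacy.data` unconditionally. A more defensive fallback could try each candidate module in turn and settle on `type(None)` when none provides `Batch`. This is only a sketch of that idea, not pytorch_lightning's actual code; `resolve_batch_type` is a hypothetical helper, and it catches only `ImportError`, so other load failures would still propagate.

```python
import importlib


def resolve_batch_type(candidates=("torchtext.legacy.data", "torchtext.data")):
    """Return the first importable Batch class, or NoneType if none is found."""
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError, so a missing
            # torchtext or a missing torchtext.legacy both land here.
            continue
        batch = getattr(module, "Batch", None)
        if isinstance(batch, type):
            return batch
    # Same sentinel the original snippet uses when torchtext is unavailable.
    return type(None)


Batch = resolve_batch_type()
print(Batch)
```

With this approach the import never raises for a missing module; code that does `isinstance(x, Batch)` simply never matches when `Batch` is `type(None)` and `x` is not `None`.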

Versions

Collecting environment information...
PyTorch version: 2.0.0+cu117
Is debug build: False
CUDA used to build PyTorch: 11.7
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.3 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: 14.0.0-1ubuntu1.1
CMake version: version 3.27.9
Libc version: glibc-2.35

Python version: 3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-6.1.85+-x86_64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: 12.2.140
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: Could not collect
Nvidia driver version: Could not collect
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.6
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.6
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 46 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Vendor ID: GenuineIntel
Model name: Intel(R) Xeon(R) CPU @ 2.20GHz
CPU family: 6
Model: 79
Thread(s) per core: 2
Core(s) per socket: 1
Socket(s): 1
Stepping: 0
BogoMIPS: 4399.99
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32 KiB (1 instance)
L1i cache: 32 KiB (1 instance)
L2 cache: 256 KiB (1 instance)
L3 cache: 55 MiB (1 instance)
NUMA node(s): 1
NUMA node0 CPU(s): 0,1
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Mitigation; PTE Inversion
Vulnerability Mds: Vulnerable; SMT Host state unknown
Vulnerability Meltdown: Vulnerable
Vulnerability Mmio stale data: Vulnerable
Vulnerability Reg file data sampling: Not affected
Vulnerability Retbleed: Vulnerable
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass: Vulnerable
Vulnerability Spectre v1: Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers
Vulnerability Spectre v2: Vulnerable; IBPB: disabled; STIBP: disabled; PBRSB-eIBRS: Not affected; BHI: Vulnerable (Syscall hardening enabled)
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Vulnerable

Versions of relevant libraries:
[pip3] numpy==1.25.2
[pip3] pytorch-lightning==1.5.0
[pip3] torch==2.0.0+cu117
[pip3] torchaudio==2.0.1+cu117
[pip3] torchdata==0.6.0
[pip3] torchmetrics==1.4.0.post0
[pip3] torchsummary==1.5.1
[pip3] torchtext==0.15.1
[pip3] torchvision==0.15.1+cu117
[pip3] triton==2.0.0
[conda] Could not collect

cpuhrsch commented 1 month ago

@Seretsi - Have you considered opening an issue on pytorch/text?

Seretsi commented 1 month ago

I had not considered that. It looks like it's already there, though I'm not sure I was the one who opened it.

pytorch / text

Importing Batch TorchText.Legacy versus Torchtext Failures #2266
