aws-samples / sagemaker-101-workshop

Hands-on demonstrations for data scientists exploring Amazon SageMaker

Can't upgrade torchtext with newer PyTorch #29

Open athewsey opened 2 years ago

athewsey commented 2 years ago

Per the torchtext README, our currently pinned torchtext version (0.6) is far out of sync with our PyTorch versions: PyTorch v1.8 pairs with torchtext v0.9, and PyTorch v1.10 pairs with torchtext v0.11.

I explored pinning PyTorch to the currently installed version and letting pip resolve a matching torchtext, with a statement like this:

!pip install torch==`pip show torch | grep 'Version:' | sed 's/Version: //'` torchtext

On the SMStudio PyTorch v1.10 CPU kernel, this installs the expected version of torchtext (0.11), but import torchtext then fails due to missing symbols. Perhaps something is missing from the CPU-optimized build of PyTorch?
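
For anyone who wants to reproduce this without the backtick/grep/sed pipeline, here is a rough Python sketch of the same idea: look up the installed version of one package and build a pip command that pins it while leaving the other package unpinned. The function name build_pinned_install_cmd and its parameters are hypothetical, not part of the workshop code, and it assumes Python 3.8+ (for importlib.metadata). It only builds the command, so it does not address the missing-symbols failure itself.

```python
import sys
from importlib import metadata
from typing import List, Optional, Sequence


def build_pinned_install_cmd(
    pinned: str = "torch",
    extras: Sequence[str] = ("torchtext",),
) -> Optional[List[str]]:
    """Build a pip command that pins `pinned` to its installed version.

    Hypothetical helper for illustration only. Returns None when
    `pinned` is not installed in the current environment.
    """
    try:
        # Reads the version from installed package metadata,
        # similar in spirit to parsing `pip show <pkg>`.
        version = metadata.version(pinned)
    except metadata.PackageNotFoundError:
        return None
    # Use the current interpreter's pip so the right environment is targeted.
    return [sys.executable, "-m", "pip", "install", f"{pinned}=={version}", *extras]


# Example usage (not executed here):
#   import subprocess
#   cmd = build_pinned_install_cmd()
#   if cmd is not None:
#       subprocess.check_call(cmd)
```

Returning the argument list instead of running pip directly makes it easy to print and inspect the exact pin before installing anything.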

So for now torchtext remains pinned at a fairly old version. We only use it for basic English text tokenization (in the tokenize_and_pad_docs() utility), so we could switch to some other solution if this can't be resolved.
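
As one possible "other solution", a dependency-free tokenizer might be enough for this use case. The sketch below is an assumption-laden stand-in, not the existing tokenize_and_pad_docs() implementation (whose signature is not shown in this issue): the names basic_english_tokenize and tokenize_and_pad, the regex rules, and the "<pad>" token are all hypothetical, and the output will not match torchtext's tokenizer exactly.

```python
import re
from typing import List, Sequence

# Lowercased words (with an optional apostrophe suffix such as "n't" -> "don't"),
# or any single non-space, non-alphanumeric character as its own token.
_TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z]+)?|[^\sa-z0-9]")


def basic_english_tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into word and punctuation tokens."""
    return _TOKEN_RE.findall(text.lower())


def tokenize_and_pad(
    docs: Sequence[str],
    max_len: int,
    pad_token: str = "<pad>",  # hypothetical padding symbol
) -> List[List[str]]:
    """Tokenize each document, then truncate or right-pad to `max_len` tokens."""
    padded = []
    for doc in docs:
        tokens = basic_english_tokenize(doc)[:max_len]
        padded.append(tokens + [pad_token] * (max_len - len(tokens)))
    return padded
```

Mapping tokens to integer IDs is left out here, since how the workshop builds its vocabulary is not described in this issue.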