https://drive.google.com/uc?export=download&id=0BwmD_VLjROrfTHk4NFg2SndKcjQ&confirm=t
internal error: headers don't contain content-disposition. This is
usually caused by using a sharing/viewing link instead of a download
link. Click 'Download' on the Google Drive page, which should
redirect you to a download page, and use the link of that page.
This exception is thrown by iter of
GDriveReaderDataPipe(skip_on_error=False,
source_datapipe=OnDiskCacheHolderIterDataPipe, timeout=None)
Expected behavior
Looking at other reports with similar error messages, it seems there may be a timeout issue retrieving from drive.google. So I downloaded cnn_stories.tgz and dailymail_stories.tgz and unpacked them.
How can I modify the calls to retrieve from my local cache?
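As a possible stopgap, here is a minimal sketch that reads the unpacked stories directly with the standard library, bypassing the Google Drive download entirely. It does not use or modify the torchtext `CNNDM` datapipe. It assumes each `.story` file holds the article text followed by `@highlight` markers, each introducing one summary sentence. The names `parse_story` and `iter_local_stories`, and the directory path in the usage note, are hypothetical.

```python
from pathlib import Path
from typing import Iterator, List, Tuple

HIGHLIGHT_MARKER = "@highlight"  # assumed marker separating article from summary lines


def parse_story(text: str) -> Tuple[str, List[str]]:
    """Split the text of one .story file into (article, highlights)."""
    article_lines: List[str] = []
    highlights: List[str] = []
    in_highlights = False
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip the blank lines between paragraphs
        if line == HIGHLIGHT_MARKER:
            in_highlights = True  # everything after the first marker is summary text
        elif in_highlights:
            highlights.append(line)
        else:
            article_lines.append(line)
    return " ".join(article_lines), highlights


def iter_local_stories(stories_dir: str) -> Iterator[Tuple[str, List[str]]]:
    """Yield (article, highlights) for every .story file in a local directory."""
    for path in sorted(Path(stories_dir).glob("*.story")):
        yield parse_story(path.read_text(encoding="utf-8"))
```

For example, `for article, highlights in iter_local_stories("cnn/stories"): ...` would walk the CNN stories, assuming the archive was unpacked into a `cnn/stories` directory. This yields every story rather than only the test split that the tutorial uses, so results will not match the tutorial exactly.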
Environment
% python collect_env.py
Collecting environment information...
PyTorch version: 2.1.0.post100
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: macOS 14.4.1 (arm64)
GCC version: Could not collect
Clang version: 15.0.0 (clang-1500.1.0.2.5)
CMake version: Could not collect
Libc version: N/A
Python version: 3.11.7 | packaged by conda-forge | (main, Dec 23 2023, 14:38:07) [Clang 16.0.6 ] (64-bit runtime)
Python platform: macOS-14.4.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
🐛 Bug
Describe the bug
I am following the t5_demo tutorial, but it fails when it tries to access the CNN data at
https://drive.google.com/uc?export=download&id=0BwmD_VLjROrfTHk4NFg2SndKcjQ
To Reproduce
Steps to reproduce the behavior:
Get the notebook at t5_demo.
Try to run it. It gets as far as
batch = next(iter(cnndm_dataloader))
(https://pytorch.org/text/stable/tutorials/t5_demo.html#generate-summaries), where
cnndm_datapipe = CNNDM(split="test")
(https://pytorch.org/text/stable/tutorials/t5_demo.html#datasets).
Get an error like: