StanfordVL / taskonomy

Taskonomy: Disentangling Task Transfer Learning [Best Paper, CVPR2018]
https://taskonomy.vision
MIT License

Download fails due to "unexpected end of data" #53

Closed (ArashAkbarinia closed this issue 2 years ago)

ArashAkbarinia commented 2 years ago

Thank you in advance for your help :)

My attempt to download the dataset fails after 7%.

Executing these commands:

    sudo apt-get install aria2
    pip install omnidata-tools
    omnitools.download all --components taskonomy --subset fullplus \
      --dest ./taskonomy_dataset/ \
      --connections_total 40 --agree

The error line: [FAILURE] Failure when processing model https://datasets.epfl.ch/taskonomy/ballou_class_scene.tar

Full stack trace:

multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/shared/venvs/py3.8-torch1.7.1/lib/python3.8/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/shared/venvs/py3.8-torch1.7.1/lib/python3.8/site-packages/omnidata_tools/dataset/download.py", line 257, in process_model
    raise e
  File "/shared/venvs/py3.8-torch1.7.1/lib/python3.8/site-packages/omnidata_tools/dataset/download.py", line 253, in process_model
    untar(tar_fpath, dest=dest, model=model, ignore_existing=ignore_existing, dryrun=dryrun, output_structure=output_structure)
  File "/shared/venvs/py3.8-torch1.7.1/lib/python3.8/site-packages/omnidata_tools/dataset/download.py", line 177, in untar
    tar.extractall(path=tmpdirname)
  File "/usr/lib/python3.8/tarfile.py", line 2026, in extractall
    self.extract(tarinfo, path, set_attrs=not tarinfo.isdir(),
  File "/usr/lib/python3.8/tarfile.py", line 2067, in extract
    self._extract_member(tarinfo, os.path.join(path, tarinfo.name),
  File "/usr/lib/python3.8/tarfile.py", line 2139, in _extract_member
    self.makefile(tarinfo, targetpath)
  File "/usr/lib/python3.8/tarfile.py", line 2188, in makefile
    copyfileobj(source, target, tarinfo.size, ReadError, bufsize)
  File "/usr/lib/python3.8/tarfile.py", line 255, in copyfileobj
    raise exception("unexpected end of data")
tarfile.ReadError: unexpected end of data
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/shared/venvs/py3.8-torch1.7.1/bin/omnitools.download", line 8, in <module>
    sys.exit(download())
  File "/shared/venvs/py3.8-torch1.7.1/lib/python3.8/site-packages/fastcore/script.py", line 112, in _f
    tfunc(merge(args, args_from_prog(func, xtra)))
  File "/shared/venvs/py3.8-torch1.7.1/lib/python3.8/site-packages/omnidata_tools/dataset/download.py", line 263, in download
    r = list(tqdm.tqdm(p.imap(process_model, models), total=len(models)))
  File "/shared/venvs/py3.8-torch1.7.1/lib/python3.8/site-packages/tqdm/std.py", line 1178, in __iter__
    for obj in iterable:
  File "/shared/venvs/py3.8-torch1.7.1/lib/python3.8/site-packages/multiprocess/pool.py", line 868, in next
    raise value
tarfile.ReadError: unexpected end of data

alexsax commented 2 years ago

Thanks for flagging this and posting the stack trace. It looks like that particular file is corrupted (on both the Stanford and EPFL servers). I suggest just dropping that building's class_scene labels. class_scene is just the output of a classification model anyway, so those labels are neither important nor hard to reproduce.
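For anyone who wants to confirm locally that a downloaded archive is truncated before trying to extract it, here is a minimal sketch that uses only Python's standard tarfile module. The function name find_tar_corruption is made up for illustration and is not part of omnidata-tools. It reads every regular file in the archive to the end, which is where an "unexpected end of data" error would surface.

```python
import tarfile


def find_tar_corruption(tar_path):
    """Return None if every member reads in full, else a short error string.

    Hypothetical helper for illustration; not part of omnidata-tools.
    """
    try:
        with tarfile.open(tar_path, "r:*") as tar:
            for member in tar:
                if not member.isfile():
                    continue  # directories, links etc. carry no data to read
                fileobj = tar.extractfile(member)
                # Read in 1 MiB chunks so large members do not fill memory.
                while fileobj.read(1 << 20):
                    pass
    except (tarfile.TarError, EOFError, OSError) as err:
        return f"{type(err).__name__}: {err}"
    return None
```

Calling it on the locally downloaded copy of ballou_class_scene.tar should return an error string if the file really is cut short, and None for an intact archive.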

I've pushed a new version of the download tool that will flag but not halt on these types of errors. I suggest upgrading the tool and trying again :)
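To illustrate what "flag but not halt" can look like, here is a rough sketch of the general pattern. It is not the actual omnidata-tools code: the function name extract_all and its arguments are assumptions made up for this example. Each archive is extracted inside its own try/except, so one corrupted file is reported and recorded while the rest continue.

```python
import tarfile


def extract_all(tar_paths, dest):
    """Extract each archive into dest, collecting failures instead of raising.

    Hypothetical sketch of the pattern; not the omnidata-tools implementation.
    Only use extractall on archives from a source you trust.
    """
    failures = []
    for tar_path in tar_paths:
        try:
            with tarfile.open(tar_path) as tar:
                tar.extractall(path=dest)
        except (tarfile.TarError, EOFError, OSError) as err:
            # Flag the problem and move on to the next archive.
            print(f"[FAILURE] Failure when processing {tar_path}: {err}")
            failures.append((tar_path, err))
    return failures
```

The returned list can be printed at the end of the run, so the corrupted archives are easy to retry or skip later. Note that a failed archive may leave partially extracted files behind in dest.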