Azure / blobxfer

Azure Storage transfer tool and data movement library
MIT License

Download broken under single object concurrency control #89

Closed (veonua closed 5 years ago)

veonua commented 5 years ago

Problem Description

2018-12-23 21:19:27.105 INFO - blobxfer start time: 2018-12-23 21:19:27.105204+00:00
2018-12-23 21:19:27.138 DEBUG - dest is_dir=False for 1 specs
2018-12-23 21:19:27.139 INFO - downloading blobs/files to local path: /mnt/batch/tasks/workitems/ocrjcdssadww/job-1/task-00012/wd/4690210_1.jpe
2018-12-23 21:19:27.139 DEBUG - spawning 3 transfer threads
2018-12-23 21:19:27.167 DEBUG - spawning 4 disk threads
2018-12-23 21:19:27.439 INFO - MD5: SKIPPED, test/DN/invoices/985/4690002_1.tif None <L..R> None
2018-12-23 21:19:27.571 INFO - MD5: SKIPPED, test/DN/invoices/985/4690002_1.tif.txt None <L..R> None
2018-12-23 21:19:27.583 INFO - MD5: SKIPPED, test/DN/invoices/985/4690003_1.tif None <L..R> None
2018-12-23 21:19:27.737 DEBUG - 0 files 0.0000 MiB filesize and/or lmt_ge skipped
2018-12-23 21:19:27.737 DEBUG - 35 remote files processed, waiting for download completion of approx. 2.1577 MiB
2018-12-23 21:19:27.738 ERROR - exceptions encountered while downloading
2018-12-23 21:19:27.738 ERROR - PosixPath('/mnt/batch/tasks/workitems/ocrjcdssadww/job-1/task-00012/wd/4690210_1.jpe')
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/blobxfer-1.5.5-py3.6.egg/blobxfer/operations/download.py", line 873, in start
    self._run()
  File "/usr/lib/python3.6/site-packages/blobxfer-1.5.5-py3.6.egg/blobxfer/operations/download.py", line 833, in _run
    raise self._exceptions[0]
  File "/usr/lib/python3.6/site-packages/blobxfer-1.5.5-py3.6.egg/blobxfer/operations/download.py", line 494, in _worker_thread_transfer
    self._process_download_descriptor(dd)
  File "/usr/lib/python3.6/site-packages/blobxfer-1.5.5-py3.6.egg/blobxfer/operations/download.py", line 584, in _process_download_descriptor
    self._transfer_cc[dd.final_path] -= 1
KeyError: PosixPath('/mnt/batch/tasks/workitems/ocrjcdssadww/job-1/task-00012/wd/4690210_1.jpe')
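
The failing statement in the traceback decrements a dictionary entry keyed by the local path, which raises KeyError whenever that key is absent. The sketch below is only an illustration of that failure pattern. It assumes (this is not confirmed anywhere in the thread) that several remote files resolve to the same local path and that the counter entry is removed while another descriptor for that path is still in flight. `TransferCounter` and its methods are hypothetical names, not blobxfer's API.

```python
from pathlib import PurePosixPath


class TransferCounter:
    """Hypothetical per-path concurrency counter (NOT blobxfer's actual class)."""

    def __init__(self):
        # Maps a local destination path to the number of in-flight transfers.
        self._transfer_cc = {}

    def begin(self, final_path):
        # Increment the counter, creating the entry if it does not exist yet.
        self._transfer_cc[final_path] = self._transfer_cc.get(final_path, 0) + 1

    def cleanup(self, final_path):
        # Drop the entry entirely, e.g. once a download is considered complete.
        self._transfer_cc.pop(final_path, None)

    def finish_unsafe(self, final_path):
        # Same shape as the statement in the traceback:
        # raises KeyError if the entry has already been removed.
        self._transfer_cc[final_path] -= 1

    def finish_safe(self, final_path):
        # Defensive variant: tolerate a missing entry instead of raising.
        remaining = self._transfer_cc.get(final_path)
        if remaining is None:
            return False
        if remaining <= 1:
            del self._transfer_cc[final_path]
        else:
            self._transfer_cc[final_path] = remaining - 1
        return True


if __name__ == "__main__":
    # Illustrative path only; two descriptors are assumed to share it.
    path = PurePosixPath("wd/4690210_1.jpe")
    cc = TransferCounter()

    cc.begin(path)          # first descriptor starts
    cc.finish_unsafe(path)  # first descriptor finishes
    cc.cleanup(path)        # entry removed for this path

    try:
        cc.finish_unsafe(path)  # second descriptor for the same path
    except KeyError as exc:
        print("KeyError:", exc)
```

Whether a guard like `finish_safe` is the right fix depends on why the entry is missing; if multiple remote files really do map to one local file, the underlying problem is the path collision rather than the missing key.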

Steps to Reproduce

azure_storage:
  storage_account_settings: mystorageaccount
  remote_path: test
  is_file_share: true
  include:
  - '*_1.jpe'

alfpark commented 5 years ago

Cross-link to issue https://github.com/Azure/batch-shipyard/issues/255