CartoDB / bigmetadata

Improve repository error handling #562

Closed: juanignaciosl closed this issue 6 years ago

juanignaciosl commented 6 years ago

Steps to reproduce:

  1. Remove a physical file from the repo path
  2. Run the related task.
  3. It will fail with an error similar to this:
2018-08-23 12:16:34,030 [ERROR]: [pid 34] Worker Worker(salt=550603168, workers=1, host=0f959ae14b3c, username=root, pid=34) failed    tasks.uk.cdrc.DownloadOutputAreas()
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/luigi/worker.py", line 203, in run
    new_deps = self._run_get_new_deps()
  File "/usr/local/lib/python3.5/dist-packages/luigi/worker.py", line 140, in _run_get_new_deps
    task_gen = self.task.run()
  File "/bigmetadata/tasks/base_tasks.py", line 435, in run
    self.download()
  File "/bigmetadata/tasks/uk/cdrc.py", line 63, in download
    copyfile(self.input().path, '{output}.zip'.format(output=self.output().path))
  File "/bigmetadata/tasks/util.py", line 439, in copyfile
    shutil.copyfile(src, dst)
  File "/usr/lib/python3.5/shutil.py", line 114, in copyfile
    with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: 'repository/tasks.uk.cdrc.DownloadOutputAreas__99914b932b/1/4d38b5b5-ab89-4b41-9d5a-60f7b3ee0ad1'

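This happens because the DB still holds a reference to the file even though the physical file is gone from the repository path, so the task tries to copy a file that no longer exists. It should fail with a more meaningful message (or just delete the stale entry and download the file again).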
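
To make the two options concrete, here is a rough sketch of a guarded copy. It is not the project's actual code: `MissingRepositoryFileError`, `copy_from_repository` and the `on_stale` callback are hypothetical names, and the callback stands in for whatever logic would delete the stale DB entry and download the file again. Only `os.path.isfile` and `shutil.copyfile` are real library calls.

```python
import os
import shutil


class MissingRepositoryFileError(FileNotFoundError):
    """Hypothetical error: the DB references a repository file missing on disk."""


def copy_from_repository(src, dst, on_stale=None):
    """Copy src to dst, handling a missing source file explicitly.

    on_stale is an assumed, optional callback. It receives the missing path,
    is expected to drop the stale DB entry and download the file again, and
    returns the path of the fresh file.
    """
    if not os.path.isfile(src):
        if on_stale is None:
            # Option 1: fail, but say what is actually wrong.
            raise MissingRepositoryFileError(
                "Repository file '{}' is referenced in the DB but is missing "
                "on disk; delete the stale entry and run the task again"
                .format(src))
        # Option 2: let the caller recover and continue with the fresh file.
        src = on_stale(src)
    shutil.copyfile(src, dst)
    return dst
```

With no callback, the task would stop with a message that names the stale reference instead of a bare `FileNotFoundError` from `shutil`. With a callback, it would recover transparently. Which behaviour is preferable probably depends on whether a silent re-download is acceptable for the task in question.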