This issue is caused by missing data. It is not a bug, though the error message could be clearer. When using the birds dataset, download the data first and then run pretraining:
mlflow run . -e download -o dataset=birds
mlflow run . -e pretrain_damsm -o dataset=birds
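Since the failure only surfaces deep inside pandas, a clearer message could come from checking for the file before loading it. The following is a minimal sketch, not the project's actual code: `require_dataset_file` is a hypothetical helper, and the idea of calling it from `load_bbox` in `src/datasets.py` is an assumption based on the trace below.

```python
from pathlib import Path


def require_dataset_file(data_dir, relative_path, dataset="birds"):
    """Return the path to a required dataset file or raise a descriptive error.

    Hypothetical helper: the name and signature are illustrative and are
    not part of the AttnGAN code base.
    """
    path = Path(data_dir) / relative_path
    if not path.is_file():
        # Point the user at the download entry point instead of letting
        # pandas fail later with a bare "does not exist" message.
        raise FileNotFoundError(
            f"Required dataset file '{path}' was not found. "
            f"Download the data first, e.g.: "
            f"mlflow run . -e download -o dataset={dataset}"
        )
    return path
```

A call such as `require_dataset_file(cfg.DATA_DIR, "CUB_200_2011/bounding_boxes.txt")` placed before the `read_csv` call would then fail early with a hint about the download step.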
Full Trace
Executing mlflow run . -e pretrain_damsm results in the following error:
2019/07/22 00:42:21 INFO mlflow.projects: === Creating conda environment mlflow-27cdc5cb4fc5102311c02a771638ea390626c7b3 ===
Collecting package metadata: done
Solving environment: done
==> WARNING: A newer version of conda exists. <==
current version: 4.6.14
latest version: 4.7.10
Please update conda by running
$ conda update -n base -c defaults conda
Downloading and Extracting Packages
scikit-learn-0.21.2 | 6.7 MB | : 100% 1.0/1 [00:02<00:00, 2.87s/it]
pip-19.1.1 | 1.8 MB | : 100% 1.0/1 [00:00<00:00, 1.07it/s]
wheel-0.33.4 | 34 KB | : 100% 1.0/1 [00:00<00:00, 13.47it/s]
pyparsing-2.4.0 | 55 KB | : 100% 1.0/1 [00:00<00:00, 10.42it/s]
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
Ran pip subprocess with arguments:
['/root/miniconda3/envs/mlflow-27cdc5cb4fc5102311c02a771638ea390626c7b3/bin/python', '-m', 'pip', 'install', '-U', '-r', '/content/AttnGAN/condaenv.s6_12qfz.requirements.txt']
Pip subprocess output:
Collecting googledrivedownloader==0.4 (from -r /content/AttnGAN/condaenv.s6_12qfz.requirements.txt (line 1))
Downloading https://files.pythonhosted.org/packages/3a/5c/485e8724383b482cc6c739f3359991b8a93fb9316637af0ac954729545c9/googledrivedownloader-0.4-py2.py3-none-any.whl
Installing collected packages: googledrivedownloader
Successfully installed googledrivedownloader-0.4
#
# To activate this environment, use:
# > conda activate mlflow-27cdc5cb4fc5102311c02a771638ea390626c7b3
#
# To deactivate an active environment, use:
# > conda deactivate
#
2019/07/22 00:43:02 INFO mlflow.projects: === Created directory /tmp/tmp6j8i2d91 for downloading remote URIs passed to arguments of type 'path' ===
2019/07/22 00:43:02 INFO mlflow.projects: === Running command 'source /root/miniconda3/bin/activate mlflow-27cdc5cb4fc5102311c02a771638ea390626c7b3 && python src/pretrain_DAMSM.py --cfg cfg/birds/DAMSM/bird.yml --gpu 0' in run with ID '3f0ac52f50dd43a7bfdf8b06978e7958' ===
/content/AttnGAN/src/miscc/config.py:106: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
yaml_cfg = edict(yaml.load(f))
Using config:
{'B_VALIDATION': False,
'CONFIG_NAME': 'DAMSM',
'CUDA': True,
'DATASET_NAME': 'birds',
'DATA_DIR': 'data/birds',
'GAN': {'B_ATTENTION': True,
'B_DCGAN': False,
'CONDITION_DIM': 100,
'DF_DIM': 64,
'GF_DIM': 128,
'R_NUM': 2,
'Z_DIM': 100},
'GPU_ID': 0,
'RNN_TYPE': 'LSTM',
'TEXT': {'CAPTIONS_PER_IMAGE': 10, 'EMBEDDING_DIM': 256, 'WORDS_NUM': 18},
'TRAIN': {'BATCH_SIZE': 48,
'B_NET_D': True,
'DISCRIMINATOR_LR': 0.0002,
'ENCODER_LR': 0.002,
'FLAG': True,
'GENERATOR_LR': 0.0002,
'MAX_EPOCH': 600,
'NET_E': '',
'NET_G': '',
'RNN_GRAD_CLIP': 0.25,
'SMOOTH': {'GAMMA1': 4.0,
'GAMMA2': 5.0,
'GAMMA3': 10.0,
'LAMBDA': 1.0},
'SNAPSHOT_INTERVAL': 50},
'TREE': {'BASE_SIZE': 299, 'BRANCH_NUM': 1},
'WORKERS': 1}
Traceback (most recent call last):
File "src/pretrain_DAMSM.py", line 262, in <module>
cfg.DATA_DIR, "train", base_size=cfg.TREE.BASE_SIZE, transform=image_transform
File "/content/AttnGAN/src/datasets.py", line 116, in __init__
self.bbox = self.load_bbox()
File "/content/AttnGAN/src/datasets.py", line 132, in load_bbox
bbox_path, delim_whitespace=True, header=None
File "/root/miniconda3/envs/mlflow-27cdc5cb4fc5102311c02a771638ea390626c7b3/lib/python3.6/site-packages/pandas/io/parsers.py", line 685, in parser_f
return _read(filepath_or_buffer, kwds)
File "/root/miniconda3/envs/mlflow-27cdc5cb4fc5102311c02a771638ea390626c7b3/lib/python3.6/site-packages/pandas/io/parsers.py", line 457, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/root/miniconda3/envs/mlflow-27cdc5cb4fc5102311c02a771638ea390626c7b3/lib/python3.6/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "/root/miniconda3/envs/mlflow-27cdc5cb4fc5102311c02a771638ea390626c7b3/lib/python3.6/site-packages/pandas/io/parsers.py", line 1135, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/root/miniconda3/envs/mlflow-27cdc5cb4fc5102311c02a771638ea390626c7b3/lib/python3.6/site-packages/pandas/io/parsers.py", line 1906, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 380, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 687, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'data/birds/CUB_200_2011/bounding_boxes.txt' does not exist: b'data/birds/CUB_200_2011/bounding_boxes.txt'
2019/07/22 00:43:07 ERROR mlflow.cli: === Run (ID '3f0ac52f50dd43a7bfdf8b06978e7958') failed ===