Hi,
Sorry for the problem. There is an issue with the parsing of the settings at the moment. It will be solved within this week.
Hi,
The issues are most likely resolved.
Please, re-run the whole pipeline (i.e. starting from dataset creation) and let me know if there are any problems that I did not detect in my tests.
Hello,
In the first case the error was resolved, but now I have been waiting for approximately an hour at this phase:
INFO | [23:13:21] processes.dataset -- Checking the development split
Is it ok?
In the second case (python processes/dataset.py) I get the same error:
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/dataset.py in <module>
173
174 if __name__ == '__main__':
--> 175 main()
176
177 # EOF
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/dataset.py in main()
151
152 init_loggers(verbose=verbose,
--> 153 settings=settings['dirs_and_files']['logging'])
154
155 logger_main = logger.bind(is_caption=False, indent=1)
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/tools/printing.py in init_loggers(verbose, settings)
59 record['extra']['indent'] == i and not record['extra']['is_caption'])
60
---> 61 logging_path = Path(settings['root_dir'],
62 settings['logger_dir'])
63
KeyError: 'root_dir'
About the first case: I've waited a while and got this error:
INFO | [00:42:27] processes.method -- Getting training data
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/main.py in <module>
56
57 if __name__ == '__main__':
---> 58 main()
59
60 # EOF
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/main.py in main()
52 if settings['workflow']['dnn_training'] or \
53 settings['workflow']['dnn_evaluation']:
---> 54 method.method(settings)
55
56
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/method.py in method(settings)
426 model_file_name=model_file_name,
427 model_dir=model_dir,
--> 428 indices_list=indices_list)
429 logger_inner.info('Training done')
430
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/method.py in _do_training(model, settings_training, settings_data, settings_io, model_file_name, model_dir, indices_list)
226 is_training=True,
227 settings_data=settings_training,
--> 228 settings_io=settings_io)
229
230 logger_main.info('Done')
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/data_handlers/clotho_loader.py in get_clotho_loader(split, is_training, settings_data, settings_io)
87 data_dir=data_dir,
88 split=split,
---> 89 input_field_name=settings_data['input_field_name'],
90 output_field_name=settings_data['output_field_name'],
91 load_into_memory=settings_data['load_into_memory'])
KeyError: 'input_field_name'
Hi,
Waiting that long is typical if you choose to validate the creation of the dataset. In the file settings/dataset_creation.yaml (line 9) you can choose whether or not to validate the created examples. This process takes quite a long time. I guess I can make it so that it is not performed by default.
I will push a fix for that today, so by default you will not have to wait that long (edit: the change has already been pushed).
Regarding the problem(s) that you got, they are in the process of training the DNN. So, there is no need for you to re-create the dataset. You can indicate whether or not you want to create the dataset in the file settings/main_settings.yaml (line 9).
So, in your case, you can safely set
workflow:
  dataset_creation: No
I will check the issues that you encountered today and push a fix. Thank you for your time.
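For orientation, here is a minimal sketch of how these workflow flags could be inspected before running main.py. It assumes that settings/main_settings.yaml is plain YAML readable with PyYAML; the key names (workflow, dataset_creation, dnn_training, dnn_evaluation) are taken from this thread and the tracebacks above, and everything else is an assumption rather than part of the baseline code.
```python
# Sketch only: print which workflow stages are enabled in the main settings
# file. Assumes plain YAML and PyYAML; not part of the baseline code itself.
from pathlib import Path

import yaml  # PyYAML

settings_path = Path('settings', 'main_settings.yaml')

with settings_path.open() as f:
    settings = yaml.safe_load(f)

for stage in ('dataset_creation', 'dnn_training', 'dnn_evaluation'):
    print(f"{stage}: {settings['workflow'].get(stage)}")
```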
Hi,
Now the above are resolved.
Please let me know if you encounter any other bugs.
I will leave this issue open in case there is something more that is unresolved.
Hi,
Thank you for the fixes! Everything works well before the text_output_every_nb_epochs-th epoch in the training loop. After the text_output_every_nb_epochs-th epoch I get this error:
(I've changed the text_output_every_nb_epochs setting in method_baseline.yaml to !!int 1 to reproduce the error faster, but I also get the error if text_output_every_nb_epochs == !!int 10.)
INFO | [20:10:24] processes.method -- Epoch: 00000 -- Training loss: 4.1841 | Time: 467.728
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/main.py in <module>
56
57 if __name__ == '__main__':
---> 58 main()
59
60 # EOF
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/main.py in main()
52 if settings['workflow']['dnn_training'] or \
53 settings['workflow']['dnn_evaluation']:
---> 54 method.method(settings)
55
56
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/method.py in method(settings)
429 model_file_name=model_file_name,
430 model_dir=model_dir,
--> 431 indices_list=indices_list)
432 logger_inner.info('Training done')
433
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/method.py in _do_training(model, settings_training, settings_data, settings_io, model_file_name, model_dir, indices_list)
279 file_names=[f_names[i_f_name]
280 for i_f_name in sampling_indices],
--> 281 eos_token=settings_data['eos_token'],
282 print_to_console=False)
283
KeyError: 'eos_token'
Ah! I had forgotten the query to the settings file!
I will push the fix today.
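As a hedged illustration of the kind of lookup that was missing (not necessarily the actual patch; the later traceback shows the pushed fix simply hard-codes '<eos>'), the token could be read from the data settings with a fallback. Here settings_data is assumed to be the plain dict that _do_training receives:
```python
# Sketch only: fetch the end-of-sentence token from the data settings,
# falling back to '<eos>' when the key is missing. `settings_data` here is
# just an ordinary dict, as in _do_training of processes/method.py.
def get_eos_token(settings_data: dict) -> str:
    return settings_data.get('eos_token', '<eos>')


print(get_eos_token({}))                      # '<eos>'
print(get_eos_token({'eos_token': '</s>'}))   # '</s>'
```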
Fixed.
Please, let me know if it worked.
Hi,
Unfortunately, I got a new error:
INFO | [14:51:42] processes.method -- Epoch: 00000 -- Training loss: 4.1357 | Time: 263.318
INFO | [14:51:42] processes.method -- Starting decoding of captions
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/main.py in <module>
56
57 if __name__ == '__main__':
---> 58 main()
59
60 # EOF
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/main.py in main()
52 if settings['workflow']['dnn_training'] or \
53 settings['workflow']['dnn_evaluation']:
---> 54 method.method(settings)
55
56
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/method.py in method(settings)
429 model_file_name=model_file_name,
430 model_dir=model_dir,
--> 431 indices_list=indices_list)
432 logger_inner.info('Training done')
433
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/method.py in _do_training(model, settings_training, settings_data, settings_io, model_file_name, model_dir, indices_list)
280 for i_f_name in sampling_indices],
281 eos_token='<eos>',
--> 282 print_to_console=False)
283
284 # Check improvement of loss
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/method.py in _decode_outputs(predicted_outputs, ground_truth_outputs, indices_object, file_names, eos_token, print_to_console)
84 gt_caption = ' '.join(gt_caption)
85
---> 86 f_n = f_name.stem.split('.')[0]
87
88 if f_n not in f_names:
AttributeError: 'numpy.str_' object has no attribute 'stem'
Hm.
I did not have any of these errors, and I have run the process multiple times. Can you please let me know what exactly you did?
One thing that I can imagine is that your directories under data are not correct.
Please try re-creating the dataset (without validating it).
The f_name should not be of type numpy.str_ but a pathlib.Path.
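For illustration only (this may not be exactly the pushed fix), the type mismatch could also be absorbed by wrapping the file name in pathlib.Path before taking its stem, so that both numpy.str_ and Path inputs work with the logic from processes/method.py line 86:
```python
# Sketch of the idea behind the fix (assumption: f_name may arrive either as
# a pathlib.Path or as a plain/numpy string). Converting to Path first makes
# the .stem access work in both cases.
from pathlib import Path


def file_id(f_name) -> str:
    return Path(str(f_name)).stem.split('.')[0]


print(file_id('clip_001.wav'))                  # 'clip_001'
print(file_id(Path('data/clip_001.wav.npy')))   # 'clip_001'
```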
Found. Fixed. Thanks!
Let me know if you find any other bug.
I will leave this issue open for a couple more days.
Thank you for the fixes! Everything works well in the training loop, but at the evaluation stage I got this error:
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/main.py in <module>
56
57 if __name__ == '__main__':
---> 58 main()
59
60 # EOF
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/main.py in main()
52 if settings['workflow']['dnn_training'] or \
53 settings['workflow']['dnn_evaluation']:
---> 54 method.method(settings)
55
56
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/method.py in method(settings)
451 settings_data=settings['dnn_training_settings']['data'],
452 settings_io=settings['dirs_and_files'],
--> 453 indices_list=indices_list)
454 logger_inner.info('Evaluation done')
455
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/processes/method.py in _do_evaluation(model, settings_data, settings_io, indices_list)
177 logger_main.info('Evaluation done')
178
--> 179 metrics = evaluate_metrics(captions_pred, captions_gt)
180
181 for metric, values in metrics.items():
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/eval_metrics.py in evaluate_metrics(prediction_file, reference_file, nb_reference_captions)
287 ground_truths.append([reference_dict[file_name][cap] for cap in cap_names])
288
--> 289 metrics, per_file_metrics = evaluate_metrics_from_lists(predictions, ground_truths)
290
291 total_metrics = combine_single_and_per_file_metrics(
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/eval_metrics.py in evaluate_metrics_from_lists(predictions, ground_truths, ids)
159 write_json(pred, pred_file)
160
--> 161 metrics, per_file_metrics = evaluate_metrics_from_files(pred_file, ref_file)
162
163 # Delete temporary files
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/eval_metrics.py in evaluate_metrics_from_files(pred_file, ref_file)
108 cocoEval = COCOEvalCap(coco, cocoRes)
109 cocoEval.params['audio_id'] = cocoRes.getAudioIds()
--> 110 cocoEval.evaluate()
111
112 # Make dict from metrics
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/coco_caption/pycocoevalcap/eval.py in evaluate(self, verbose)
38 print('tokenization...')
39 tokenizer = PTBTokenizer()
---> 40 gts = tokenizer.tokenize(gts)
41 res = tokenizer.tokenize(res)
42
~/Documents/Automated_Audio_Captioning_DCASE2020/dcase-2020-baseline/coco_caption/pycocoevalcap/tokenizer/ptbtokenizer.py in tokenize(self, captions_for_audio)
56 cmd.append(os.path.basename(tmp_file.name))
57 p_tokenizer = subprocess.Popen(cmd, cwd=path_to_jar_dirname, \
---> 58 stdout=subprocess.PIPE)
59 token_lines = p_tokenizer.communicate(input=sentences.rstrip())[0]
60 token_lines = token_lines.decode()
~/anaconda3/envs/audio-captioning-baseline/lib/python3.7/subprocess.py in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, encoding, errors, text)
754 c2pread, c2pwrite,
755 errread, errwrite,
--> 756 restore_signals, start_new_session)
757 except:
758 # Cleanup if the child failed starting.
~/anaconda3/envs/audio-captioning-baseline/lib/python3.7/subprocess.py in _execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, start_new_session)
1497 if errno_num == errno.ENOENT:
1498 err_msg += ': ' + repr(err_filename)
-> 1499 raise child_exception_type(errno_num, err_msg, err_filename)
1500 raise child_exception_type(err_msg)
1501
FileNotFoundError: [Errno 2] No such file or directory: 'java': 'java'
Hi,
Thank you for your time.
Unfortunately, this error is not associated with the code of the baseline system. As you can observe, the problem originates from the coco-caption tools. Basically, it relates to the prerequisites of the metrics calculation: the tokenizer (and the SPICE metric) needs Java, and your traceback shows that no java executable could be found. Still, I would definitely try to help you.
So, would you be OK if I close this issue and you open another one about this last problem?
That way it would be easier for others to find solutions if they encounter similar problems.
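As a small, hedged sketch (not part of the baseline), one way to confirm this prerequisite before running the evaluation stage is to check that a java executable is on the PATH, since the PTBTokenizer in the traceback launches Java through subprocess:
```python
# Sketch only: verify that the Java runtime required by the coco-caption
# tokenizer (and SPICE) can be found before starting the evaluation stage.
import shutil
import subprocess

java = shutil.which('java')
if java is None:
    raise RuntimeError("'java' was not found on PATH; install a JRE/JDK "
                       "before running the caption metrics.")
# `java -version` prints its report to stderr.
print(subprocess.run([java, '-version'],
                     capture_output=True, text=True).stderr.strip())
```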
Hi,
Thank you for your help. Yes, you can close this issue and open another one.
I'm closing this. Please open a second issue about the last problem :)
Feel free to post again here (or open a new one) if you encounter similar problems!
When I try to run main.py from the root directory with
python main.py
I get this error: (traceback not shown)
Also, when I try to run
python processes/dataset.py
I get this error: (traceback not shown)
Could you tell me, please, how to solve this problem? Thanks!