Closed lisaliu1997 closed 2 years ago
Hi, I'm trying to run Oscar on Google Colab, with the dataset and the YAML files stored on Google Drive. I have everything set up and ran the following command:
%%bash
python Oscar/oscar/run_captioning.py \
    --model_name_or_path Oscar/pretrained_models/base-vg-labels/ep_67_588997 \
    --do_train \
    --do_lower_case \
    --evaluate_during_training \
    --add_od_labels \
    --learning_rate 0.00003 \
    --per_gpu_train_batch_size 64 \
    --num_train_epochs 30 \
    --save_steps 5000 \
    --output_dir Oscar/output/ \
    --data_dir drive/MyDrive/path/to/dataset/ \
    --train_yaml drive/MyDrive/path/to/train.yaml
It fails while loading train.yaml: PyYAML reports an unacceptable character (#x0000) at position 0 of the file.
2021-11-23 20:13:35,342 vlpretrain WARNING: Device: cuda, n_gpu: 1
2021-11-23 20:13:42,312 vlpretrain INFO: Training/evaluation parameters Namespace(adam_epsilon=1e-08, add_od_labels=True, cider_cached_tokens='coco-train-words.p', config_name='', data_dir='drive/MyDrive/597F_Project/', device=device(type='cuda'), distributed=False, do_eval=False, do_lower_case=True, do_test=False, do_train=True, drop_out=0.1, drop_worst_after=0, drop_worst_ratio=0, eval_model_dir='', evaluate_during_training=True, freeze_embedding=False, gradient_accumulation_steps=1, img_feature_dim=2054, img_feature_type='frcnn', label_smoothing=0, learning_rate=3e-05, length_penalty=1, local_rank=0, logging_steps=20, loss_type='sfmx', mask_prob=0.15, max_gen_length=20, max_grad_norm=1.0, max_img_seq_length=50, max_masked_tokens=3, max_seq_a_length=40, max_seq_length=70, max_steps=-1, min_constraints_to_satisfy=2, model_name_or_path='Oscar/pretrained_models/base-vg-labels/ep_67_588997', no_cuda=False, num_beams=1, num_gpus=1, num_keep_best=1, num_labels=2, num_return_sequences=1, num_train_epochs=30, num_workers=4, output_dir='Oscar/output/', output_hidden_states=False, output_mode='classification', per_gpu_eval_batch_size=64, per_gpu_train_batch_size=64, repetition_penalty=1, save_steps=5000, sc_baseline_type='greedy', sc_beam_size=1, sc_train_sample_n=5, scheduler='linear', scst=False, seed=88, temperature=1, test_yaml='test.yaml', tie_weights=False, tokenizer_name='', top_k=0, top_p=1, train_yaml='drive/MyDrive/597F_Project/coco_caption/train.yaml', use_cbs=False, val_yaml='val.yaml', warmup_steps=0, weight_decay=0.05)
/content/Oscar/oscar/utils/misc.py:34: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  return yaml.load(fp)
Traceback (most recent call last):
  File "Oscar/oscar/run_captioning.py", line 1009, in <module>
    main()
  File "Oscar/oscar/run_captioning.py", line 979, in main
    args.distributed, is_train=True)
  File "Oscar/oscar/run_captioning.py", line 360, in make_data_loader
    is_train=(is_train and not args.scst))
  File "Oscar/oscar/run_captioning.py", line 336, in build_dataset
    is_train=True, mask_prob=args.mask_prob, max_masked_tokens=args.max_masked_tokens)
  File "Oscar/oscar/run_captioning.py", line 47, in __init__
    self.cfg = load_from_yaml_file(yaml_file)
  File "/content/Oscar/oscar/utils/misc.py", line 34, in load_from_yaml_file
    return yaml.load(fp)
  File "/usr/local/lib/python3.7/site-packages/yaml/__init__.py", line 112, in load
    loader = Loader(stream)
  File "/usr/local/lib/python3.7/site-packages/yaml/loader.py", line 24, in __init__
    Reader.__init__(self, stream)
  File "/usr/local/lib/python3.7/site-packages/yaml/reader.py", line 85, in __init__
    self.determine_encoding()
  File "/usr/local/lib/python3.7/site-packages/yaml/reader.py", line 135, in determine_encoding
    self.update(1)
  File "/usr/local/lib/python3.7/site-packages/yaml/reader.py", line 169, in update
    self.check_printable(data)
  File "/usr/local/lib/python3.7/site-packages/yaml/reader.py", line 144, in check_printable
    'unicode', "special characters are not allowed")
yaml.reader.ReaderError: unacceptable character #x0000: special characters are not allowed
  in "drive/MyDrive/597F_Project/coco_caption/train.yaml", position 0
Can anyone help with this please? Thanks!
@lisaliu1997 Could you please let us know what the fix for this issue was?
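No fix was posted in this thread, so the following is only a diagnostic sketch, not a confirmed solution. The error says the first character PyYAML saw in train.yaml is NUL (#x0000), which means the file does not start with plain YAML text. Possible causes, none of them confirmed here, include a file that was not fully synced or was corrupted on Drive, or a file saved in a UTF-16/UTF-32 encoding. The helper below uses only the standard library to look at the raw bytes; the name `inspect_yaml_bytes` and the keys of the returned dict are made up for illustration.

```python
import codecs
from pathlib import Path

# Byte-order marks that reveal a non-UTF-8 encoding. UTF-32 is checked before
# UTF-16 because the UTF-32-LE BOM begins with the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def inspect_yaml_bytes(path, head=64):
    """Summarise the raw bytes of a file that the YAML reader rejects.

    Hypothetical helper: it only reports what is on disk and does not
    try to parse or repair the file.
    """
    data = Path(path).read_bytes()
    nul_count = data.count(b"\x00")
    bom_encoding = next(
        (enc for bom, enc in _BOMS if data.startswith(bom)), None
    )
    return {
        "size": len(data),                 # 0 means the file is empty
        "head": data[:head],               # first bytes, for eyeballing
        "nul_count": nul_count,            # > 0 is never valid in UTF-8 YAML
        "all_nul": len(data) > 0 and nul_count == len(data),
        "bom_encoding": bom_encoding,      # None if no BOM was found
    }


# Example call (path taken from the log above):
# print(inspect_yaml_bytes("drive/MyDrive/597F_Project/coco_caption/train.yaml"))
```

If `all_nul` is true or `head` looks like binary data, the copy on Drive is probably damaged and re-uploading or re-downloading the original train.yaml is the first thing to try. If `bom_encoding` reports UTF-16 or UTF-32, re-saving the file as UTF-8 may be enough. Separately, the YAMLLoadWarning in the log is unrelated to this error; it would go away if `load_from_yaml_file` called `yaml.safe_load(fp)` instead of `yaml.load(fp)`.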