microsoft / Oscar

Oscar and VinVL
MIT License

yaml file unacceptable character #170

Closed lisaliu1997 closed 2 years ago

lisaliu1997 commented 2 years ago

Hi, I'm trying to run Oscar on Google Colab, with the dataset and the YAML files stored on Google Drive. I have everything set up and ran the following command:

%%bash
python Oscar/oscar/run_captioning.py \
    --model_name_or_path Oscar/pretrained_models/base-vg-labels/ep_67_588997 \
    --do_train \
    --do_lower_case \
    --evaluate_during_training \
    --add_od_labels \
    --learning_rate 0.00003 \
    --per_gpu_train_batch_size 64 \
    --num_train_epochs 30 \
    --save_steps 5000 \
    --output_dir Oscar/output/ \
    --data_dir drive/MyDrive/path/to/dataset/ \
    --train_yaml drive/MyDrive/path/to/train.yaml

It fails while loading the training YAML file: PyYAML reports an unacceptable character (#x0000) at position 0.

2021-11-23 20:13:35,342 vlpretrain WARNING: Device: cuda, n_gpu: 1
2021-11-23 20:13:42,312 vlpretrain INFO: Training/evaluation parameters Namespace(adam_epsilon=1e-08, add_od_labels=True, cider_cached_tokens='coco-train-words.p', config_name='', data_dir='drive/MyDrive/597F_Project/', device=device(type='cuda'), distributed=False, do_eval=False, do_lower_case=True, do_test=False, do_train=True, drop_out=0.1, drop_worst_after=0, drop_worst_ratio=0, eval_model_dir='', evaluate_during_training=True, freeze_embedding=False, gradient_accumulation_steps=1, img_feature_dim=2054, img_feature_type='frcnn', label_smoothing=0, learning_rate=3e-05, length_penalty=1, local_rank=0, logging_steps=20, loss_type='sfmx', mask_prob=0.15, max_gen_length=20, max_grad_norm=1.0, max_img_seq_length=50, max_masked_tokens=3, max_seq_a_length=40, max_seq_length=70, max_steps=-1, min_constraints_to_satisfy=2, model_name_or_path='Oscar/pretrained_models/base-vg-labels/ep_67_588997', no_cuda=False, num_beams=1, num_gpus=1, num_keep_best=1, num_labels=2, num_return_sequences=1, num_train_epochs=30, num_workers=4, output_dir='Oscar/output/', output_hidden_states=False, output_mode='classification', per_gpu_eval_batch_size=64, per_gpu_train_batch_size=64, repetition_penalty=1, save_steps=5000, sc_baseline_type='greedy', sc_beam_size=1, sc_train_sample_n=5, scheduler='linear', scst=False, seed=88, temperature=1, test_yaml='test.yaml', tie_weights=False, tokenizer_name='', top_k=0, top_p=1, train_yaml='drive/MyDrive/597F_Project/coco_caption/train.yaml', use_cbs=False, val_yaml='val.yaml', warmup_steps=0, weight_decay=0.05)
/content/Oscar/oscar/utils/misc.py:34: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  return yaml.load(fp)
Traceback (most recent call last):
  File "Oscar/oscar/run_captioning.py", line 1009, in <module>
    main()
  File "Oscar/oscar/run_captioning.py", line 979, in main
    args.distributed, is_train=True)
  File "Oscar/oscar/run_captioning.py", line 360, in make_data_loader
    is_train=(is_train and not args.scst))
  File "Oscar/oscar/run_captioning.py", line 336, in build_dataset
    is_train=True, mask_prob=args.mask_prob, max_masked_tokens=args.max_masked_tokens)
  File "Oscar/oscar/run_captioning.py", line 47, in __init__
    self.cfg = load_from_yaml_file(yaml_file)
  File "/content/Oscar/oscar/utils/misc.py", line 34, in load_from_yaml_file
    return yaml.load(fp)
  File "/usr/local/lib/python3.7/site-packages/yaml/__init__.py", line 112, in load
    loader = Loader(stream)
  File "/usr/local/lib/python3.7/site-packages/yaml/loader.py", line 24, in __init__
    Reader.__init__(self, stream)
  File "/usr/local/lib/python3.7/site-packages/yaml/reader.py", line 85, in __init__
    self.determine_encoding()
  File "/usr/local/lib/python3.7/site-packages/yaml/reader.py", line 135, in determine_encoding
    self.update(1)
  File "/usr/local/lib/python3.7/site-packages/yaml/reader.py", line 169, in update
    self.check_printable(data)
  File "/usr/local/lib/python3.7/site-packages/yaml/reader.py", line 144, in check_printable
    'unicode', "special characters are not allowed")
yaml.reader.ReaderError: unacceptable character #x0000: special characters are not allowed
  in "drive/MyDrive/597F_Project/coco_caption/train.yaml", position 0

Can anyone help with this please? Thanks!

navba-MSFT commented 2 years ago

@lisaliu1997 Could you please let us know what the fix for this issue was?
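
The thread does not record the eventual fix. For anyone hitting the same `ReaderError`, a reasonable first step is to look at the raw bytes at the start of the YAML file, since the error says the very first character is `#x0000`. The sketch below does only that. The helper name `describe_head` and its hint strings are illustrative assumptions, not part of Oscar or PyYAML, and the causes it suggests (a file saved in a non-UTF-8 encoding, or a file that contains only NUL bytes) are guesses consistent with the error message rather than a confirmed diagnosis of this issue.

```python
def describe_head(path, n=16):
    """Return (first n raw bytes, a short hint about what they suggest).

    Hypothetical diagnostic helper: it only inspects bytes and does not
    parse YAML or modify the file.
    """
    with open(path, "rb") as fp:
        head = fp.read(n)

    if not head:
        hint = "file is empty"
    elif head.startswith((b"\xff\xfe", b"\xfe\xff")):
        # A byte order mark means the file is not plain UTF-8 text.
        hint = "UTF-16/UTF-32 byte order mark: try re-saving the file as UTF-8"
    elif set(head) == {0}:
        # Nothing but NUL bytes at the start of the file.
        hint = "only NUL bytes: the file may be corrupt or incompletely copied"
    elif b"\x00" in head:
        hint = "NUL bytes mixed with text: possibly UTF-16 without a BOM"
    else:
        hint = "no NUL bytes in the first bytes"
    return head, hint


# Example call, using the path from the traceback above:
# print(describe_head("drive/MyDrive/597F_Project/coco_caption/train.yaml"))
```

If the hint points to an encoding problem, re-saving the file as UTF-8 is the obvious thing to try. If the file turns out to be empty or all NUL bytes, copying it to Drive again would be the first thing to check.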