clinicalml / TabLLM

dev_scores.json is not found when num_shot is 0 #19

Closed · EdgedSquirrels closed this issue 7 months ago

EdgedSquirrels commented 7 months ago

Hi,

I tried to reproduce the results for the zero-shot scenario. However, when num_shot is set to 0 in few-shot-pretrained-100k.sh, I cannot see dev_scores.json in exp_out, whereas I can see dev_scores.json when num_shot is 4. Did I misconfigure something or misunderstand the usage?

  # For zero-shot set to '0', for all to 'all'
  for num_shot in 4 8 16 32 64 128 256 512
  do
    ...
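
For completeness, this is how I ran the zero-shot case, simply replacing the list of shot counts with 0 (assuming that is the intended way to select it, as the comment in the script suggests):

  # For zero-shot set to '0', for all to 'all'
  for num_shot in 0
  do
    ...
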
stefanhgm commented 7 months ago

Hello @EdgedSquirrels ,

thanks for using our code and for reaching out with this issue.

Did you get any error message or additional output when running it with 0 shots? That would help me understand the error better.

Thank you!

EdgedSquirrels commented 7 months ago

Hi @stefanhgm ,

Thanks for taking the time to help out.

It complained Missing logger folder: exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k/log and warned about loading cached data. However, the same messages also pop up when running num_shot=4. It seems there is no additional output when running num_shot=0.

Here is my output when running num_shot=0:

Start experiment t03b_income_numshot0_seed42_ia3_pretrained100k
{
    "exp_dir": "exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k",
    "exp_name": "t03b_income_numshot0_seed42_ia3_pretrained100k",
    "allow_skip_exp": true,
    "seed": 42,
    "model": "EncDec",
    "max_seq_len": 1024,
    "origin_model": "bigscience/T0_3B",
    "load_weight": "pretrained_checkpoints/t03b_ia3_finish.pt",
    "dataset": "income",
    "few_shot": true,
    "num_shot": 0,
    "few_shot_random_seed": 42,
    "train_template_idx": -1,
    "eval_template_idx": -1,
    "batch_size": 4,
    "eval_batch_size": 16,
    "num_workers": 8,
    "change_hswag_templates": false,
    "raft_cross_validation": true,
    "raft_validation_start": 0,
    "raft_labels_in_input_string": "comma",
    "cleaned_answer_choices_b77": false,
    "compute_precision": "bf16",
    "compute_strategy": "none",
    "num_steps": 0,
    "eval_epoch_interval": 30,
    "eval_before_training": false,
    "save_model": true,
    "save_step_interval": 20000,
    "mc_loss": 1,
    "unlikely_loss": 1,
    "length_norm": 1,
    "grad_accum_factor": 1,
    "split_option_at_inference": false,
    "optimizer": "adafactor",
    "lr": 0.003,
    "trainable_param_names": ".*lora_b.*",
    "scheduler": "linear_decay_with_warmup",
    "warmup_ratio": 0.06,
    "weight_decay": 0,
    "scale_parameter": true,
    "grad_clip_norm": 1,
    "model_modifier": "lora",
    "prompt_tuning_num_prefix_emb": 100,
    "prompt_tuning_encoder": true,
    "prompt_tuning_decoder": true,
    "lora_rank": 0,
    "lora_scaling_rank": 1,
    "lora_init_scale": 0.0,
    "lora_modules": ".*SelfAttention|.*EncDecAttention|.*DenseReluDense",
    "lora_layers": "k|v|wi_1.*",
    "bitfit_modules": ".*",
    "bitfit_layers": "q|k|v|o|wi_[01]|w_o",
    "adapter_type": "normal",
    "adapter_non_linearity": "relu",
    "adapter_reduction_factor": 4,
    "normal_adapter_residual": true,
    "lowrank_adapter_w_init": "glorot-uniform",
    "lowrank_adapter_rank": 1,
    "compacter_hypercomplex_division": 8,
    "compacter_learn_phm": true,
    "compacter_hypercomplex_nonlinearity": "glorot-uniform",
    "compacter_shared_phm_rule": false,
    "compacter_factorized_phm": false,
    "compacter_shared_W_phm": false,
    "compacter_factorized_phm_rule": false,
    "compacter_phm_c_init": "normal",
    "compacter_phm_rank": 1,
    "compacter_phm_init_range": 0.01,
    "compacter_kronecker_prod": false,
    "compacter_add_compacter_in_self_attention": false,
    "compacter_add_compacter_in_cross_attention": false,
    "intrinsic_projection": "fastfood",
    "intrinsic_said": true,
    "intrinsic_dim": 2000,
    "intrinsic_device": "cpu",
    "fishmask_mode": null,
    "fishmask_path": null,
    "fishmask_keep_ratio": 0.05,
    "prefix_tuning_num_input_tokens": 10,
    "prefix_tuning_num_target_tokens": 10,
    "prefix_tuning_init_path": null,
    "prefix_tuning_init_text": null,
    "prefix_tuning_parameterization": "mlp-512",
    "train_pred_file": "exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k/train_pred.txt",
    "dev_pred_file": "exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k/dev_pred.txt",
    "dev_score_file": "exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k/dev_scores.json",
    "test_pred_file": "exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k/test_pred.txt",
    "test_score_file": "exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k/test_scores.json",
    "finish_flag_file": "exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k/exp_completed.txt"
}
Mark experiment t03b_income_numshot0_seed42_ia3_pretrained100k as claimed
WARNING:root:Tried instantiating `DatasetTemplates` for income 50000_dollars, but no prompts found. Please ignore this warning if you are creating new prompts for this dataset.
Using bfloat16 Automatic Mixed Precision (AMP)
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
WARNING:datasets.arrow_dataset:Loading cached split indices for dataset at /data1/home/TabLLM/datasets_serialized/income/cache-7bc2ada03d2d63b7.arrow and /data1/home/TabLLM/datasets_serialized/income/cache-cad0987f77091155.arrow
WARNING:datasets.arrow_dataset:Loading cached split indices for dataset at /data1/home/TabLLM/datasets_serialized/income/cache-187e44e1ae55a233.arrow and /data1/home/TabLLM/datasets_serialized/income/cache-4ab888c823307cb7.arrow
WARNING:datasets.arrow_dataset:Loading cached processed dataset at /data1/home/TabLLM/datasets_serialized/income/cache-3b2e6d08878f1239.arrow
Train size 0
Eval size 103
WARNING:datasets.arrow_dataset:Loading cached split indices for dataset at /data1/home/TabLLM/datasets_serialized/income/cache-7bc2ada03d2d63b7.arrow and /data1/home/TabLLM/datasets_serialized/income/cache-cad0987f77091155.arrow
WARNING:datasets.arrow_dataset:Loading cached split indices for dataset at /data1/home/TabLLM/datasets_serialized/income/cache-187e44e1ae55a233.arrow and /data1/home/TabLLM/datasets_serialized/income/cache-4ab888c823307cb7.arrow
Test size 0
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [4]
Missing logger folder: exp_out/t03b_income_numshot0_seed42_ia3_pretrained100k/log

  | Name  | Type                       | Params
-----------------------------------------------------
0 | model | T5ForConditionalGeneration | 2.9 B 
-----------------------------------------------------
540 K     Trainable params
2.9 B     Non-trainable params
2.9 B     Total params
11,402.764 Total estimated model params size (MB)
stefanhgm commented 7 months ago

Hi @EdgedSquirrels ,

sorry for the late reply. It seems the missing /log folder is not an issue (see for instance this t-few issue where it was also reported: https://github.com/r-three/t-few/issues/15), and the cache warnings should not cause a problem either.

Does the program just stop after this output? I cannot really make out a definite error at this point.

Does the output for 4 shots look the same up to the total estimated model params size?

Best, Stefan

EdgedSquirrels commented 7 months ago

Hi @stefanhgm ,

The program just stops after this output when running 0 shots. The output looks the same for 4 shots, but in that case it successfully continues with inference afterwards.

Here is the additional output for 4 shots:

/data1/home/miniconda3/envs/tfew/lib/python3.7/site-packages/pytorch_lightning/trainer/data_loading.py:429: UserWarning: The number of training samples (1) is smaller than the logging interval Trainer(log_every_n_steps=4). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.
  f"The number of training samples ({self.num_training_batches}) is smaller than the logging interval"
Epoch 29:  88%|███████████████████████████████████████████████████████████████         | 7/8 [00:02<00:00,  2.79it/s, loss=2.14, v_num=0]
{"AUC": 0.4852150537634409, "PR": 0.30854919126142144, "micro_f1": 0.5436893203883495, "macro_f1": 0.5013904624575136, "accuracy": 0.5436893203883495, "num": 103, "num_steps": -1, "score_gt": 0.48910640975804004, "score_cand": 0.5233835757357402}

Stored new best metric ['AUC'] with values [0.4852150537634409] at step 29.
Epoch 29: 100%|████████████████████████████████████████████████████████████████████████| 8/8 [00:03<00:00,  2.66it/s, loss=2.14, v_num=0]
stefanhgm commented 7 months ago

Hi @EdgedSquirrels ,

sorry for the late reply. It took me some more time to look into this.

Basically, there is another option you have to set in the run script that I did not specify properly. In few-shot-pretrained-100k.sh you have to uncomment the Zero-shot settings and comment out the Few-shot portion. Sorry for not making this clear in the readme; it should look like this:

      # Zero-shot
      eval_before_training=True
      num_steps=0
      # Few-shot
      # eval_before_training=False
      # num_steps=$(( 30 * ($num_shot / $train_batch_size)))
      # eval_epoch_interval=30
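
My reading of why this matters (an assumption based on the flag names and your 4-shot output, not a guarantee): with num_steps=0 no training step ever runs, and with eval_before_training=False no evaluation runs either, so dev_scores.json is never written; eval_before_training=True forces one evaluation pass on the dev split, which is what produces dev_scores.json. Trimmed to the relevant lines, a zero-shot run of few-shot-pretrained-100k.sh should then roughly look like this (a sketch; the exact placement inside the script may differ):

      # For zero-shot set to '0', for all to 'all'
      for num_shot in 0
      do
          # Zero-shot
          eval_before_training=True   # evaluate once up front -> writes dev_scores.json
          num_steps=0                 # no gradient steps
          ...
      done
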

Please let me know if this works for you!

EdgedSquirrels commented 7 months ago

Hi @stefanhgm ,

It works for me. Thanks for your help!