yiren-jian / BLIText

[NeurIPS 2023] Bootstrapping Vision-Language Learning with Decoupled Language Pre-training
BSD 3-Clause "New" or "Revised" License

Stage 0 scripts and config #5

Open Sreyan88 opened 8 months ago

Sreyan88 commented 8 months ago

Hi there,

Great work! Could you please provide pretrain_stage0.sh or the stage-0 config file (besides the log file)? We would like to reproduce some experiments. Thank you!

yiren-jian commented 8 months ago

I used something similar to the config below (if you find anything in it inconsistent with the log, please feel free to replace it). Stage 0 was trained on another server at Northwestern with 3x RTX A6000 GPUs, from which I kept only the log and the pre-trained weights.

model:
  arch: pformer_opt
  model_type: pformer_opt2.7b
  load_pretrained: False
  # initialize stage 2 pretraining from stage 1 pretrained model
  # pretrained: "https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/blip2_pretrained.pth"
  freeze_vit: True

datasets:
  sentence_dataset:
    text_processor:
        train:
          name: "blip_caption"

run:
  task: image_text_pretrain   ### no need to change
  # runner: runner_iter
  # optimizer
  lr_sched: "linear_warmup_cosine_lr"
  init_lr: 1e-4
  min_lr: 1e-5
  warmup_lr: 1e-6

  weight_decay: 0.05
  max_epoch: 5
  # max_iters: 60000
  # iters_per_inner_epoch: 6000
  batch_size_train: 128
  batch_size_eval: 64
  num_workers: 4
  warmup_steps: 2000

  seed: 42
  output_dir: "output/BLIP-T/Pretrain_stage0"

  amp: True
  resume_ckpt_path: null

  evaluate: False
  train_splits: ["train"]

  device: "cuda"
  world_size: 3
  dist_url: "env://"
  distributed: True
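
For readers checking the optimizer settings against the log, here is a minimal sketch of what a schedule named linear_warmup_cosine_lr typically does with the values above. It assumes a linear ramp from warmup_lr to init_lr over the first warmup_steps iterations, followed by cosine decay evaluated once per epoch; the function name lr_at is hypothetical, and the scheduler in the repository may differ in detail.

```python
import math


def lr_at(epoch, step, *, init_lr=1e-4, min_lr=1e-5, warmup_lr=1e-6,
          warmup_steps=2000, max_epoch=5):
    """Learning rate at a given epoch and step within that epoch.

    Assumed behaviour (not taken from the repository): linear warmup
    during the first warmup_steps iterations of epoch 0, then cosine
    decay from init_lr towards min_lr, updated once per epoch.
    """
    if epoch == 0 and step < warmup_steps:
        # Linear ramp from warmup_lr up to init_lr.
        ramp = warmup_lr + (init_lr - warmup_lr) * step / max(warmup_steps, 1)
        return min(init_lr, ramp)
    # Cosine decay; equals init_lr at epoch 0 and min_lr at epoch == max_epoch.
    cosine = 0.5 * (1.0 + math.cos(math.pi * epoch / max_epoch))
    return (init_lr - min_lr) * cosine + min_lr


if __name__ == "__main__":
    for epoch, step in [(0, 0), (0, 1000), (0, 2000), (2, 0), (4, 0)]:
        print(f"epoch={epoch} step={step} lr={lr_at(epoch, step):.3e}")
```

Under these assumptions the rate at the start of the last epoch (epoch 4 of 5) is still above min_lr, because the cosine term only reaches min_lr when epoch equals max_epoch.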

jyizheng commented 3 weeks ago

I got the error below when using the yaml file you provided above.

(less4) root@ddab717805f9:/home/tiger/maas/BLIText# bash run_scripts/blip-T/train/pretrain_stage0.sh
WARNING:__main__:
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

builder_cls=None, dataset_name=sentence_dataset
Traceback (most recent call last):
  File "/home/tiger/maas/BLIText/train.py", line 103, in <module>
    main()
  File "/home/tiger/maas/BLIText/train.py", line 81, in main
    cfg = Config(parse_args())
  File "/home/tiger/maas/BLIText/lavis/common/config.py", line 31, in __init__
    dataset_config = self.build_dataset_config(config)
  File "/home/tiger/maas/BLIText/lavis/common/config.py", line 102, in build_dataset_config
    dataset_config_path = builder_cls.default_config_path(
AttributeError: 'NoneType' object has no attribute 'default_config_path'

yiren-jian commented 3 weeks ago

The error indicates that builder_cls (the dataset builder class) is None, which means no dataset builder was found under the name sentence_dataset. In the config, sentence_dataset should be changed to laion_sentence_115m so that it matches the sentence dataset config defined here. You can also define your own sentence dataset in the same way as laion_sentence_115m.
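
To illustrate why an unknown dataset name ends in this AttributeError, here is a minimal sketch of a name-to-builder lookup. It is a toy stand-in, not the repository's code: BUILDERS, register_builder, get_builder_class, check_dataset_names, SentenceBuilder, and the returned config path are all hypothetical names chosen for the example. The assumption is only that the lookup returns None for a name that was never registered.

```python
# Toy registry mapping dataset names to builder classes (hypothetical,
# written for illustration; not the repository's implementation).
BUILDERS = {}


def register_builder(name):
    """Class decorator that records a builder under the given name."""
    def wrap(cls):
        BUILDERS[name] = cls
        return cls
    return wrap


def get_builder_class(name):
    """Return the registered builder class, or None if the name is unknown."""
    return BUILDERS.get(name)


@register_builder("laion_sentence_115m")
class SentenceBuilder:
    @classmethod
    def default_config_path(cls):
        # Hypothetical path, used only so the example has something to return.
        return "configs/datasets/laion_sentence_115m.yaml"


def check_dataset_names(names):
    """Fail early with a readable message instead of an AttributeError."""
    for name in names:
        if get_builder_class(name) is None:
            known = ", ".join(sorted(BUILDERS))
            raise KeyError(f"unknown dataset '{name}'; registered: {known}")


if __name__ == "__main__":
    print(get_builder_class("sentence_dataset"))     # unregistered name
    print(get_builder_class("laion_sentence_115m"))  # registered name
    check_dataset_names(["laion_sentence_115m"])     # passes silently
```

Calling default_config_path on the result of the first lookup would raise the same AttributeError as in the traceback, since the result is None. A check like check_dataset_names, run before the config is built, would report the misspelled key directly.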