Closed by monkeycc 3 years ago
Hi @monkeycc ,
You need to set up the corresponding datasets, following the datasets instructions, before training the UPDOWN model.
For the MSCOCO dataset, please download the bottom-up features from https://github.com/peteanderson80/bottom-up-attention and convert them into npz format with tools/create_feats.py.
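Once the features are converted, a quick check like the one below can confirm that the files are where the data loader looks for them. This is only a minimal sketch: the helper name `find_missing_features` is hypothetical (it is not part of the repository), and the directory and the one-file-per-image `<image_id>.npz` layout are assumptions taken from the path in the reported error.

```python
from pathlib import Path


def find_missing_features(feats_dir, image_ids):
    """Return the image IDs that have no `<image_id>.npz` file in feats_dir.

    Hypothetical helper: assumes one npz file per image, named after the
    image ID, as suggested by the path in the error message.
    """
    feats_dir = Path(feats_dir)
    return [i for i in image_ids if not (feats_dir / f"{i}.npz").is_file()]


if __name__ == "__main__":
    # Assumed location, copied from the FileNotFoundError in the log.
    feats_dir = "../open_source_dataset/mscoco_dataset/features/up_down"
    # 369199 is the ID from the reported error; use the IDs from your own
    # annotation files to check the whole dataset.
    missing = find_missing_features(feats_dir, [369199])
    print(f"{len(missing)} feature file(s) missing: {missing}")
```

If the list is not empty, the features have probably not been converted yet, or they were written to a different directory than the one the config points to.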
Best, Jianjie
python train_net.py --num-gpus 1 --config-file configs/image_caption/updown/updown.yaml

SCORER:
  CIDER_CACHED: ../open_source_dataset/mscoco_dataset/mscoco_train_cider.pkl
  EOS_ID: 0
  GT_PATH: ../open_source_dataset/mscoco_dataset/mscoco_train_gts.pkl
  NAME: BaseScorer
  TYPES: ['Cider']
  WEIGHTS: [1.0]
SEED: -1
SOLVER:
  ALPHA: 0.99
  AMSGRAD: False
  BASE_LR: 0.0005
  BETAS: [0.9, 0.999]
  BIAS_LR_FACTOR: 1.0
  CENTERED: False
  CHECKPOINT_PERIOD: 1
  DAMPENING: 0.0
  EPOCH: 30
  EPS: 1e-08
  EVAL_PERIOD: 1
  GRAD_CLIP: 0.1
  GRAD_CLIP_TYPE: value
  INITIAL_ACCUMULATOR_VALUE: 0.0
  LR_DECAY: 0.0
  MOMENTUM: 0.9
  NAME: Adam
  NESTEROV: 0.0
  NORM_TYPE: 2.0
  WEIGHT_DECAY: 0.0
  WEIGHT_DECAY_BIAS: 0.0
  WEIGHT_DECAY_NORM: 0.0
  WRITE_PERIOD: 20
VERSION: 1
[09/27 19:21:02 xmodaler]: Full config saved to ./output\config.yaml
[09/27 19:21:02 xl.utils.env]: Using a generated random seed 2862719
[09/27 19:21:04 xl.engine.defaults]: Model:
RnnAttEncoderDecoder(
  (token_embed): TokenBaseEmbedding(
    (embeddings): Embedding(10200, 1024)
    (embeddings_act): ReLU()
    (embeddings_dropout): Dropout(p=0.5, inplace=False)
  )
  (visual_embed): VisualBaseEmbedding(
    (embeddings): Linear(in_features=2048, out_features=1024, bias=True)
    (embeddings_act): ReLU()
    (embeddings_dropout): Dropout(p=0.5, inplace=False)
  )
  (encoder): UpDownEncoder()
  (decoder): UpDownDecoder(
    (lstm1): LSTMCell(3072, 1024)
    (lstm2): LSTMCell(2048, 1024)
    (att): BaseAttention(
      (w_h): Linear(in_features=1024, out_features=512, bias=False)
      (act): Tanh()
      (w_alpha): Linear(in_features=512, out_features=1, bias=False)
      (softmax): Softmax(dim=-1)
    )
    (p_att_feats): Linear(in_features=1024, out_features=512, bias=True)
  )
  (predictor): BasePredictor(
    (logits): Linear(in_features=1024, out_features=10200, bias=True)
    (dropout): Dropout(p=0.5, inplace=False)
  )
  (greedy_decoder): GreedyDecoder()
  (beam_searcher): BeamSearcher()
)
[09/27 19:21:05 xl.datasets.common]: Serializing 113287 elements to byte tensors and concatenating them all ...
[09/27 19:21:06 xl.datasets.common]: Serialized dataset takes 115.74 MiB
[09/27 19:21:06 xl.datasets.common]: Serializing 5000 elements to byte tensors and concatenating them all ...
[09/27 19:21:06 xl.datasets.common]: Serialized dataset takes 0.17 MiB
[09/27 19:21:06 xl.datasets.common]: Serializing 5000 elements to byte tensors and concatenating them all ...
[09/27 19:21:06 xl.datasets.common]: Serialized dataset takes 0.17 MiB
loading annotations into memory...
Done (t=0.06s)
creating index...
index created!
loading annotations into memory...
Done (t=0.07s)
creating index...
index created!
[09/27 19:21:16 fvcore.common.checkpoint]: No checkpoint found. Initializing model from scratch
[09/27 19:21:16 xl.engine.train_loop]: Starting training from iteration 0
ERROR [09/27 19:21:16 xl.engine.train_loop]: Exception during training:
Traceback (most recent call last):
  File "D:\xmodaler\xmodaler\engine\train_loop.py", line 151, in train
    self.run_step()
  File "D:\xmodaler\xmodaler\engine\defaults.py", line 496, in run_step
    data = next(self._train_data_loader_iter)
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
    data = self._next_data()
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data
    data.reraise()
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\_utils.py", line 429, in reraise
    raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\_utils\worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\xmodaler\xmodaler\datasets\common.py", line 42, in __getitem__
    data = self._map_func(self._dataset[cur_idx])
  File "D:\xmodaler\xmodaler\datasets\images\mscoco.py", line 103, in __call__
    content = read_np(feat_path)
  File "D:\xmodaler\xmodaler\functional\func_io.py", line 22, in read_np
    content = np.load(path)
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\numpy\lib\npyio.py", line 416, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: '../open_source_dataset/mscoco_dataset/features/up_down\\369199.npz'
[09/27 19:21:16 xl.engine.hooks]: Total training time: 0:00:00 (0:00:00 on hooks)
[09/27 19:21:16 xl.utils.events]: iter: 0 lr: N/A max_mem: 204M
Traceback (most recent call last):
  File "train_net.py", line 68, in <module>
    args=(args,),
  File "D:\xmodaler\xmodaler\engine\launch.py", line 86, in launch
    main_func(*args)
  File "train_net.py", line 56, in main
    return trainer.train()
  File "D:\xmodaler\xmodaler\engine\defaults.py", line 365, in train
    super().train(self.start_iter, self.max_iter)
  File "D:\xmodaler\xmodaler\engine\train_loop.py", line 151, in train
    self.run_step()
  File "D:\xmodaler\xmodaler\engine\defaults.py", line 496, in run_step
    data = next(self._train_data_loader_iter)
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\dataloader.py", line 517, in __next__
    data = self._next_data()
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\dataloader.py", line 1199, in _next_data
    return self._process_data(data)
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\dataloader.py", line 1225, in _process_data
    data.reraise()
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\_utils.py", line 429, in reraise
    raise self.exc_type(msg)
FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\_utils\worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "D:\xmodaler\xmodaler\datasets\common.py", line 42, in __getitem__
    data = self._map_func(self._dataset[cur_idx])
  File "D:\xmodaler\xmodaler\datasets\images\mscoco.py", line 103, in __call__
    content = read_np(feat_path)
  File "D:\xmodaler\xmodaler\functional\func_io.py", line 22, in read_np
    content = np.load(path)
  File "C:\ProgramData\Anaconda3\envs\xmodaler\lib\site-packages\numpy\lib\npyio.py", line 416, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: '../open_source_dataset/mscoco_dataset/features/up_down\\369199.npz'
Hello, may I ask whether you have solved your problem? I have the same problem. Could you tell me how you solved it?