ruotianluo / ImageCaptioning.pytorch

I decided to sync up this repo and self-critical.pytorch. (The old master is archived in the old master branch.)

how to change models #133

Open · ydyrx-ldm opened this issue 2 years ago

ydyrx-ldm commented 2 years ago

Hello, sorry for asking my question in Chinese. I would like to modify your model, for example the updown model. Which .py file should I edit? My guess is this part of ImageCaptioning.pytorch-master\captioning\models\AttModel.py:

```python
class UpDownCore(nn.Module):
    def __init__(self, opt, use_maxout=False):
        super(UpDownCore, self).__init__()
        self.drop_prob_lm = opt.drop_prob_lm

        self.att_lstm = nn.LSTMCell(opt.input_encoding_size + opt.rnn_size * 2, opt.rnn_size) # we, fc, h^2_t-1
        self.lang_lstm = nn.LSTMCell(opt.rnn_size * 2, opt.rnn_size) # h^1_t, \hat v
        self.attention = Attention(opt)

    def forward(self, xt, fc_feats, att_feats, p_att_feats, state, att_masks=None):
        prev_h = state[0][-1]
        att_lstm_input = torch.cat([prev_h, fc_feats, xt], 1)

        h_att, c_att = self.att_lstm(att_lstm_input, (state[0][0], state[1][0]))

        att = self.attention(h_att, att_feats, p_att_feats, att_masks)

        lang_lstm_input = torch.cat([att, h_att], 1)
        # lang_lstm_input = torch.cat([att, F.dropout(h_att, self.drop_prob_lm, self.training)], 1) ?????

        h_lang, c_lang = self.lang_lstm(lang_lstm_input, (state[0][1], state[1][1]))

        output = F.dropout(h_lang, self.drop_prob_lm, self.training)
        state = (torch.stack([h_att, h_lang]), torch.stack([c_att, c_lang]))

        return output, state
```

Is modifying just this part of the code enough to change the updown model? I have another question: I tried making arbitrary changes to the code above, and it still seemed to run. What is going on? Looking forward to your reply, thank you.

ruotianluo commented 2 years ago

It will run as long as the core's input/output format stays the same.
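
For example, one interface-preserving change would be enabling the dropout variant that is already commented out in forward (a minimal sketch; the concatenated width stays opt.rnn_size * 2, so nothing else needs to change):

```python
# inside UpDownCore.forward, replacing the existing concatenation;
# self.drop_prob_lm is already set in __init__
lang_lstm_input = torch.cat([att, F.dropout(h_att, self.drop_prob_lm, self.training)], 1)
```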

ydyrx-ldm commented 2 years ago

I remember that the day before yesterday I changed `self.lang_lstm = nn.LSTMCell(opt.rnn_size * 2, opt.rnn_size) # h^1_t, \hat v` to `self.lang_lstm = nn.LSTMCell(opt.rnn_size * 3, opt.rnn_size) # h^1_t, \hat v` with everything else unchanged, and it still ran. Shouldn't that cause a dimension problem?
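
For reference, a standalone sketch of that mismatch (rnn_size = 512 is assumed here purely for illustration) should fail as soon as the cell is called:

```python
import torch
import torch.nn as nn

rnn_size = 512  # assumed value, for illustration only
cell = nn.LSTMCell(rnn_size * 3, rnn_size)  # declared input size: 1536
x = torch.zeros(10, rnn_size * 2)           # actual input (att + h_att): 1024
h = torch.zeros(10, rnn_size)
c = torch.zeros(10, rnn_size)
cell(x, (h, c))  # RuntimeError: input has inconsistent input_size
```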

ydyrx-ldm commented 2 years ago

If I do modify the contents of `class UpDownCore(nn.Module):`, how can I make sure my code is actually being run? If there is no error, does that mean all of it executed? For example, I modify the updown model and then run updown training. And it seems that as long as the core's input/output format stays the same, changing some of the code inside doesn't raise errors either. This confuses me; looking forward to your reply.

ruotianluo commented 2 years ago

I'm a bit puzzled too.... I'm fairly busy at the moment. Remind me again in a couple of days.

ydyrx-ldm commented 2 years ago

OK. Thanks!

ydyrx-ldm commented 2 years ago

Hello, here I am reminding you to take a look at this issue.

ruotianluo commented 2 years ago

Can you send me your training script?

ydyrx-ldm commented 2 years ago

Sure, my training script is:

```
python tools/train.py --id updown --caption_model updown --input_json data/cocotalk.json --input_fc_dir data/cocotalk_fc --input_att_dir data/cocotalk_att --input_label_h5 data/cocotalk_label.h5 --batch_size 128 --learning_rate 5e-4 --learning_rate_decay_start 0 --scheduled_sampling_start 0 --checkpoint_path log_updown --save_checkpoint_every 6000 --val_images_use 5000 --max_epochs 30
```

Note that the image features I use were extracted with ResNet-101, not Faster R-CNN.

ydyrx-ldm commented 2 years ago

Hello, is there any progress? Which part is the problem?

ruotianluo commented 2 years ago

After I changed that line, I got an error...

ruotianluo commented 2 years ago
  File "/share/data/vision-greg/rluo/caption/tmp/captioning/models/AttModel.py", line 636, in forward
    h_lang, c_lang = self.lang_lstm(lang_lstm_input, (state[0][1], state[1][1]))
  File "/share/data/vision-greg/rluo/local/anaconda3/envs/virtex2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/share/data/vision-greg/rluo/local/anaconda3/envs/virtex2/lib/python3.7/site-packages/torch/nn/modules/rnn.py", line 1058, in forward
    self.bias_ih, self.bias_hh,
RuntimeError: input has inconsistent input_size: got 1024 expected 1536
ruotianluo commented 2 years ago

To confirm the core actually ran, just print something inside forward.

ydyrx-ldm commented 2 years ago

> To confirm the core actually ran, just print something inside forward.

Yes, I did that (printed something inside forward), but nothing was output at all. Could the contents of my AttModel.py be different? Or is there a problem with:

```
python tools/train.py --id updown --caption_model updown --input_json data/cocotalk.json --input_fc_dir data/cocotalk_fc --input_att_dir data/cocotalk_att --input_label_h5 data/cocotalk_label.h5 --batch_size 128 --learning_rate 5e-4 --learning_rate_decay_start 0 --scheduled_sampling_start 0 --checkpoint_path log_updown --save_checkpoint_every 6000 --val_images_use 5000 --max_epochs 30
```

ydyrx-ldm commented 2 years ago

Or does my problem come from using features extracted with ResNet-101 rather than Faster R-CNN? My code is all from your latest release, and my training script also follows the README.

ruotianluo commented 2 years ago

I re-cloned master from scratch and ran your command.

ruotianluo commented 2 years ago

Can you try `pip uninstall captioning`?

ydyrx-ldm commented 2 years ago

OK, I'll try.

ydyrx-ldm commented 2 years ago

I still have the problem. I too re-cloned the master ImageCaptioning.pytorch project, cloned cider, cloned coco-caption, then prepared the data: first `python scripts/prepro_labels.py --input_json data/dataset_coco.json --output_json data/cocotalk.json --output_h5 data/cocotalk`, then extracted the image features with `python scripts/prepro_feats.py --input_json data/dataset_coco.json --output_dir data/cocotalk --images_root $IMAGE_ROOT`. I again changed `self.lang_lstm = nn.LSTMCell(opt.rnn_size * 2, opt.rnn_size)` to `self.lang_lstm = nn.LSTMCell(opt.rnn_size * 3, opt.rnn_size)` with everything else unchanged, and it still runs.

My training script is:

```
CUDA_VISIBLE_DEVICES=3 python tools/train.py --id updown --caption_model updown --input_json data/cocotalk.json --input_fc_dir data/cocotalk_fc --input_att_dir data/cocotalk_att --input_label_h5 data/cocotalk_label.h5 --batch_size 10 --learning_rate 5e-4 --learning_rate_decay_start 0 --scheduled_sampling_start 0 --checkpoint_path log_updown --save_checkpoint_every 6000 --val_images_use 5000 --max_epochs 30
```

ruotianluo commented 2 years ago

Can you try downloading my preprocessed data directly? Although I don't think that's the problem. Did you run `pip uninstall captioning`?

ydyrx-ldm commented 2 years ago

```
$ pip uninstall captioning
WARNING: Skipping captioning as it is not installed.
```

Actually, I created a folder elsewhere and set up the project from scratch: re-cloned the master ImageCaptioning.pytorch project, cloned cider, cloned coco-caption. So captioning should also be the latest, right?

I just can't tell where the problem is. This is the output after I run it:

```
Hugginface transformers not installed; please visit https://github.com/huggingface/transformers
meshed-memory-transformer not installed; please run `pip install git+https://github.com/ruotianluo/meshed-memory-transformer.git`
DataLoader loading json file:  data/cocotalk.json
vocab size is  9487
DataLoader loading h5 file:  data/cocotalk_fc data/cocotalk_att data/cocotalk_box data/cocotalk_label.h5
max sequence length in data is 16
read 123287 image features
assigned 113287 images to split train
assigned 5000 images to split val
assigned 5000 images to split test
2021-09-13 21:03:24.513746: W tensorflow/stream_executor/platform/default/dso_loader.cc:60] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib:
2021-09-13 21:03:24.513818: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Read data: 0.0002644062042236328
iter 13403 (epoch 1), train_loss = 2.422, time/batch = 0.580
Read data: 0.00020313262939453125
iter 13404 (epoch 1), train_loss = 2.629, time/batch = 0.093
Read data: 0.0001647472381591797
iter 13405 (epoch 1), train_loss = 2.330, time/batch = 0.089
Read data: 0.00012564659118652344
```
ruotianluo commented 2 years ago

The reason I suggested uninstall is that you said print didn't do anything, so I suspect code from somewhere else is being used. Otherwise, once you changed the file, the print should show up.
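
A quick way to check which copy of the package is actually being imported (a minimal sketch; run it from the directory you launch training from):

```python
import captioning
print(captioning.__file__)
# this should point inside your cloned repo; if it points to
# site-packages or some other folder, that copy is shadowing your edits
```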

ydyrx-ldm commented 2 years ago

I created a brand-new folder and ran everything inside it; it shouldn't be possible for code from somewhere else to run, right?

ruotianluo commented 2 years ago

Does print still show nothing now?

ydyrx-ldm commented 2 years ago

No. What I changed is:

```python
class UpDownCore(nn.Module):
    def __init__(self, opt, use_maxout=False):
        super(UpDownCore, self).__init__()
        self.drop_prob_lm = opt.drop_prob_lm

        self.att_lstm = nn.LSTMCell(opt.input_encoding_size + opt.rnn_size * 2, opt.rnn_size) # we, fc, h^2_t-1
        self.lang_lstm = nn.LSTMCell(opt.rnn_size * 3, opt.rnn_size) # h^1_t, \hat v
        self.attention = Attention(opt)

    def forward(self, xt, fc_feats, att_feats, p_att_feats, state, att_masks=None):
        prev_h = state[0][-1]
        att_lstm_input = torch.cat([prev_h, fc_feats, xt], 1)

        h_att, c_att = self.att_lstm(att_lstm_input, (state[0][0], state[1][0]))

        print("-------------------------------------------------")

        att = self.attention(h_att, att_feats, p_att_feats, att_masks)

        lang_lstm_input = torch.cat([att, h_att], 1)
        # lang_lstm_input = torch.cat([att, F.dropout(h_att, self.drop_prob_lm, self.training)], 1) ?????

        h_lang, c_lang = self.lang_lstm(lang_lstm_input, (state[0][1], state[1][1]))

        output = F.dropout(h_lang, self.drop_prob_lm, self.training)
        state = (torch.stack([h_att, h_lang]), torch.stack([c_att, c_lang]))

        return output, state
```

ydyrx-ldm commented 2 years ago

Should I try another computer, or another server? I'm on a Linux server, and it seems I can't debug there.

ruotianluo commented 2 years ago

Why can't you debug on Linux???

ruotianluo commented 2 years ago

You can install pudb, which lets you debug from the command line.

ruotianluo commented 2 years ago

A plain pip install will do.
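
For example (a minimal sketch of typical pudb usage; put the breakpoint wherever you want to stop, e.g. inside UpDownCore.forward):

```python
import pudb; pudb.set_trace()  # opens a full-screen terminal debugger at this line
```

You can also launch the whole script under the debugger with `python -m pudb tools/train.py ...`.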

ruotianluo commented 2 years ago

But first, try another machine.

ydyrx-ldm commented 2 years ago

It's just that my Linux system has no GUI, nothing like desktop software, and I'm not very good at debugging from the command line. OK, I'll try another computer tomorrow.

ydyrx-ldm commented 2 years ago

I just tried deleting the entire /captioning/model folder, and it still runs. So it must be running code from somewhere else?

ydyrx-ldm commented 2 years ago

Could it be that I'm not using multiple GPUs? I only used a single GPU (CUDA_VISIBLE_DEVICES=3). I'll try the experiment again tomorrow morning. Thank you, good night.

ydyrx-ldm commented 2 years ago

Hello, I think I've solved it. The cause was indeed that a captioning package from another location (another folder under the same path) was being imported. After I deleted all the master projects and reinstalled from scratch, the print now shows up, and I can modify the model too. Thank you.

One more question: what do you usually set --learning_rate and --max_epochs to? 5e-4 and 30 epochs?