AIstudentSH opened this issue 4 years ago
Hi,
This happens because the model is not correctly initialized here in lxrt::entry.py. More precisely, the BERT cache file is not successfully created here in lxrt::modeling.py.
Could you please check whether the cache dir here is accessible, and whether this url https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz is available?
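For anyone debugging this locally, here is a minimal sketch that checks both failure points. It assumes the default cache location used by pytorch-pretrained-bert-style code (~/.pytorch_pretrained_bert); adjust the path if your setup differs.

```python
# Sketch: check the cache dir and the BERT download URL named above.
import os
import urllib.request

cache_dir = os.path.expanduser("~/.pytorch_pretrained_bert")  # assumed default
if os.path.isdir(cache_dir):
    print("cache dir exists, writable:", os.access(cache_dir, os.W_OK))
else:
    print("cache dir does not exist yet:", cache_dir)

url = "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz"
try:
    with urllib.request.urlopen(url, timeout=10) as resp:
        print("url reachable, HTTP status:", resp.getcode())
except Exception as e:
    print("url not reachable:", e)
```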
Why is BERT still needed? In the fine-tuning stage, it seems the BERT weights are reloaded from the pre-trained model (in entry.py:126).
Thank you for answering. The more direct reason I found is that two programs were running on the same GPU and there was not enough memory.
The BERT weights are loaded to allow reproducing the numbers "BERT + X CrossAtt", "Train + BERT", and "Pre-train + BERT" in Table 3 of our paper. They are overwritten when the LXMERT weights are loaded, as you pointed out.
Glad that you found this reason; it will definitely be useful to other users with the same issue!
I also met this problem, and it occurs irregularly during the process. I've checked that the GPU is empty. I don't know how to solve it yet.
Thanks to Qizhou Shuai for helping test. I found that this error is due to the low bandwidth to AWS, where the BERT configs and weights are originally hosted, so urllib treats the connection as timed out and disconnects.
To solve it, I made a copy on my server. Please replace the following line here in lxrt::modeling.py:
'bert-base-uncased': "https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased.tar.gz"
with
'bert-base-uncased': "https://nlp.cs.unc.edu/data/bert/bert-base-uncased.tar.gz"
Please let me know if the speed to my server is still slow.
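If the mirror is ever slow as well, a possible offline workaround is to fetch and unpack the tarball once by hand. This is only a sketch, and it assumes the from_pretrained here behaves like pytorch-pretrained-bert and can also load from a local directory containing bert_config.json and pytorch_model.bin:

```python
# Sketch of a one-time manual download; local file names are arbitrary.
import tarfile
import urllib.request

url = "https://nlp.cs.unc.edu/data/bert/bert-base-uncased.tar.gz"
urllib.request.urlretrieve(url, "bert-base-uncased.tar.gz")

with tarfile.open("bert-base-uncased.tar.gz", "r:gz") as tar:
    # Unpacks bert_config.json and pytorch_model.bin into ./bert-base-uncased
    tar.extractall("bert-base-uncased")

# Afterwards, point the BERT-loading code at the local "bert-base-uncased/"
# directory instead of the model name, so no network access is needed.
```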
Thanks for your answer! I tried to use the new download link, and it really helped!
I get a 404 Not Found error for the url "https://nlp.cs.unc.edu/data/bert/bert-base-uncased.tar.gz". It is not working now.
The link was just fixed. Could you help check it again?
The link under nlp.cs.unc.edu is working now, many thanks! (Though earlier I tried "https://nlp1.cs.unc.edu/data/bert/bert-base-uncased.tar.gz" and that worked too.)
Hello, thanks for your sharing. I updated the link to "https://nlp.cs.unc.edu/data/bert/bert-base-uncased.tar.gz", but it still did not work.
Here is the log output:
Model name 'bert-base-uncased' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese). We assumed 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-base-uncased-vocab.txt' was a path or url but couldn't find any file associated to this path or url.
Load 632117 data from split(s) train,nominival.
Start to load Faster-RCNN detected objects from data/mscoco_imgfeat/train2014_obj36.tsv
Loaded 512 images in file data/mscoco_imgfeat/train2014_obj36.tsv in 2 seconds.
Start to load Faster-RCNN detected objects from data/mscoco_imgfeat/val2014_obj36.tsv
Loaded 512 images in file data/mscoco_imgfeat/val2014_obj36.tsv in 2 seconds.
Use 2888 data in torch dataset
Load 25994 data from split(s) minival.
Start to load Faster-RCNN detected objects from data/mscoco_imgfeat/val2014_obj36.tsv
Loaded 512 images in file data/mscoco_imgfeat/val2014_obj36.tsv in 2 seconds.
Use 2618 data in torch dataset
The BERT-weight-downloading query to AWS was time-out;trying to download from UNC servers
The weight-downloading still crashed with link: https://nlp.cs.unc.edu/data/bert/bert-base-uncased.tar.gz, please check your network connection
Can you help me? Thanks.
Traceback (most recent call last):
  File "src/tasks/vqa.py", line 178, in <module>
    vqa = VQA()
  File "src/tasks/vqa.py", line 48, in __init__
    self.model = VQAModel(self.train_tuple.dataset.num_answers)
  File "/home/shaohuan/lxmert/src/tasks/vqa_model.py", line 32, in __init__
    self.logit_fc.apply(self.lxrt_encoder.model.init_bert_weights)
AttributeError: 'NoneType' object has no attribute 'init_bert_weights'
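For readers who hit the same trace: when the download fails, from_pretrained returns None, so self.lxrt_encoder.model is None by the time vqa_model.py calls init_bert_weights on it. A minimal fail-fast sketch (attribute names taken from the traceback above) that surfaces the real cause:

```python
# Sketch: fail fast with an actionable message instead of the opaque
# NoneType error; attribute names follow the traceback above.
if self.lxrt_encoder.model is None:
    raise RuntimeError(
        "BERT weights were not loaded (the download likely failed); "
        "check the cache dir and the download URL discussed above."
    )
self.logit_fc.apply(self.lxrt_encoder.model.init_bert_weights)
```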