-
That sounds massively interesting, and while we try to run inference and read the paper, should we expect the release of the finetuning code?
-
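Hello, may I ask whether the complete code has not been released yet, or whether it has already been removed? It does not quite match what the README describes. Thanks.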
-
We can add a few examples:
- Token Classification with BERT
**Dataset:** CoNLL 2003
**What's different?** Here, we have to classify every word into its NER type. However, since BERT tokenises tex…
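
A common complication when word-level NER labels meet a subword tokeniser is that one word can become several tokens, so the labels need to be aligned to the tokens. Below is a minimal sketch of one widely used convention: label the first piece of each word and mask the remaining pieces. `toy_subword_split` is a hypothetical stand-in for a real WordPiece tokeniser, and the label ids are made up for illustration.

```python
from typing import List, Tuple

# -100 is the default ignore_index of PyTorch's CrossEntropyLoss,
# so positions carrying it are skipped when the loss is computed.
IGNORE_INDEX = -100


def toy_subword_split(word: str, max_len: int = 4) -> List[str]:
    """Hypothetical tokeniser: chop a non-empty word into chunks of at most
    max_len characters and mark continuation chunks with '##'."""
    pieces = [word[i:i + max_len] for i in range(0, len(word), max_len)]
    return [pieces[0]] + ["##" + p for p in pieces[1:]]


def align_labels(words: List[str], labels: List[int]) -> Tuple[List[str], List[int]]:
    """Give each word's label to its first piece and IGNORE_INDEX to the rest."""
    tokens: List[str] = []
    aligned: List[int] = []
    for word, label in zip(words, labels):
        pieces = toy_subword_split(word)
        tokens.extend(pieces)
        aligned.extend([label] + [IGNORE_INDEX] * (len(pieces) - 1))
    return tokens, aligned


words = ["Johanson", "visited", "Paris"]
labels = [1, 0, 5]  # made-up label ids, one per word
tokens, aligned = align_labels(words, labels)
print(list(zip(tokens, aligned)))
```

The alternative of copying the word's label onto every piece is also used in practice; which one works better depends on the task and the evaluation script.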
-
Hi, could you please share some caption examples for pretraining on AudioSet? I'm a little confused about the [mask] token setting for the CLIP text encoder.
-
Related to **Bert/Pytorch**
**Describe the bug**
After running for a long period, for example after 200,000 iterations, some steps are skipped. Such skipped steps are counted into the tota…
-
"one of my other major bottlenecks is pretraining weights – I’ve been training MATCH from random weight initializations every time, whereas with models like GPT-2 people just take the pretrained weigh…
-
Hi all,
I have recently done an implementation of BYOL to do the pretraining of ResNet50 for a semantic segmentation task, but was not too satisfied with the results, and then I bumped into your pape…
-
http://aclweb.org/anthology/D17-1039
-
Hi,
Very simple issue, this error:
"ValueError: loaded state dict contains a parameter group that doesn't match the size of optimizer's group"
Is displayed when I'm trying to load a pre-trained m…
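
One way to narrow this down is to compare the parameter groups of the saved optimizer state with those of the freshly built optimizer before loading. The sketch below assumes both dictionaries follow the layout returned by a PyTorch optimizer's `state_dict()` (a `param_groups` list whose entries each hold a `params` list). The helper name and the example dictionaries are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def param_group_size_mismatches(
    saved: Dict[str, Any], current: Dict[str, Any]
) -> List[Tuple[int, int, int]]:
    """Return (group_index, saved_size, current_size) for every parameter
    group whose number of parameters differs between the two state dicts.
    Only groups present in both dicts are compared."""
    mismatches = []
    pairs = zip(saved["param_groups"], current["param_groups"])
    for idx, (saved_group, current_group) in enumerate(pairs):
        n_saved = len(saved_group["params"])
        n_current = len(current_group["params"])
        if n_saved != n_current:
            mismatches.append((idx, n_saved, n_current))
    return mismatches


# Hand-written stand-ins for optimizer.state_dict() outputs.
saved_state = {"state": {}, "param_groups": [{"lr": 1e-4, "params": [0, 1, 2, 3]}]}
current_state = {"state": {}, "param_groups": [{"lr": 1e-4, "params": [0, 1]}]}

print(param_group_size_mismatches(saved_state, current_state))
```

A non-empty result usually means the model the optimizer was built from has a different set of trainable parameters than the one used when the checkpoint was written, for example because layers were added, removed, or frozen.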
-
self.context_mlm_trans and self.context_order_trans are expecting a different key-structure
RuntimeError: Error(s) in loading state_dict for BertPredictionHeadTransform:
Missing key(s) in stat…
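
A quick way to see how the two key structures differ is to diff the key names the module expects against those in the checkpoint, then try a rename. The sketch below works on plain dictionaries, so it does not need the model itself. The key names and the `transform.` prefix are hypothetical examples, not taken from this repository.

```python
from typing import Any, Dict, Iterable, List, Tuple


def diff_keys(expected: Iterable[str], loaded: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (missing, unexpected): keys the module wants but the checkpoint
    lacks, and keys the checkpoint has but the module does not want."""
    expected_set, loaded_set = set(expected), set(loaded)
    return sorted(expected_set - loaded_set), sorted(loaded_set - expected_set)


def strip_prefix(state: Dict[str, Any], prefix: str) -> Dict[str, Any]:
    """Copy the state dict, dropping `prefix` from every key that starts with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state.items()
    }


# Hypothetical key layouts for illustration only.
expected_keys = ["dense.weight", "dense.bias", "LayerNorm.weight", "LayerNorm.bias"]
checkpoint = {
    "transform.dense.weight": "w",
    "transform.dense.bias": "b",
    "transform.LayerNorm.weight": "g",
    "transform.LayerNorm.bias": "beta",
}

missing, unexpected = diff_keys(expected_keys, checkpoint)
fixed = strip_prefix(checkpoint, "transform.")
missing_after, unexpected_after = diff_keys(expected_keys, fixed)
print(missing, unexpected, missing_after, unexpected_after)
```

If the diff is empty after the rename, the remapped dictionary can be passed to `load_state_dict` in place of the original checkpoint.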