-
I found `seqs *= self.item_emb.embedding_dim ** 0.5` in the function `log2feats(self, log_seqs)`. Is there any reason for scaling `seqs` after the embedding lookup?
`seqs = self.item_emb(torch.LongTensor(log_seqs)…
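For reference, this looks like the embedding scaling from "Attention Is All You Need" (Section 3.4), where token embeddings are multiplied by sqrt(d_model) so their magnitude is comparable to the positional encodings added right after. A minimal sketch of what the line does (sizes are arbitrary placeholders):

```python
import torch

# The embedding output is multiplied by sqrt(embedding_dim) before the
# positional embeddings are added, following the Transformer convention.
item_emb = torch.nn.Embedding(num_embeddings=1000, embedding_dim=64)
log_seqs = torch.randint(1, 1000, (2, 10))  # a batch of item-id sequences

seqs = item_emb(log_seqs)
seqs *= item_emb.embedding_dim ** 0.5  # 64 ** 0.5 = 8
```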
-
Hi Tianhong, thank you for your inspiring work! While reading the paper, I had some questions regarding the term “MAR.” Aside from the difference mentioned in the paper—where the next set of tokens in…
-
Hi, I ran the 10-fold cross-validation test as shown in #60 and #62 of this repo, but got lower results:
twitter:
learning rate 2e-5, mean_test_acc: 0.7095, mean_test_f1: 0.6925
learning rate 5e-5, mean_test_acc: 0.62…
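In case my setup differs from yours, here is a minimal sketch of the protocol I followed (`train_and_eval` is a placeholder for the repo's actual fine-tuning loop, not its real API; data here is a stand-in):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def train_and_eval(train_idx, test_idx, lr):
    # placeholder: plug in BERT fine-tuning with learning rate `lr` here
    return 0.0, 0.0  # (test_acc, test_f1)

X = np.arange(100)                      # stand-in sample indices
y = np.random.randint(0, 3, size=100)   # stand-in labels

accs, f1s = [], []
for train_idx, test_idx in StratifiedKFold(n_splits=10, shuffle=True).split(X, y):
    acc, f1 = train_and_eval(train_idx, test_idx, lr=2e-5)
    accs.append(acc)
    f1s.append(f1)

print("mean_test_acc:", np.mean(accs), "mean_test_f1:", np.mean(f1s))
```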
-
### 🐛 Describe the bug
The `fuse_attention` pattern 16 makes BERT on CUDA go into efficient attention instead of SDPA math, causing an accuracy issue. Found in https://github.com/pytorch/pytorch/pull/113…
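A repro sketch under my assumptions (shapes and dtype are arbitrary; requires a CUDA device), comparing the mem-efficient backend against the math backend directly:

```python
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 128, 64, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Force the math backend.
with torch.backends.cuda.sdp_kernel(enable_flash=False, enable_math=True, enable_mem_efficient=False):
    out_math = F.scaled_dot_product_attention(q, k, v)

# Force the mem-efficient backend that pattern 16 selects.
with torch.backends.cuda.sdp_kernel(enable_flash=False, enable_math=False, enable_mem_efficient=True):
    out_eff = F.scaled_dot_product_attention(q, k, v)

print((out_math - out_eff).abs().max())
```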
-
I found Contriever quite interesting based on Table 3 of the paper (few-shot retrieval), as Contriever-MSMarco achieves a score of 38.1 when fine-tuned on FiQA, which is much higher than the BERT-MS…
-
### Some issues found when loading huggingface weights into LiBai's Bert and aligning the outputs; after the fixes below, the outputs match huggingface
#### Parameter structure comparison; first see the `Bert` parameter structures of the two libraries at the bottom of this issue:
- The `embedding` part of **LiBai** matches **huggingface**'s with no problem.
- Next, looking at the `LayerNorm` layer: **LiBai** places its `LayerNorm` layer at…
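To do the comparison above, I dumped the parameter names and shapes on the huggingface side like this (the LiBai dump comes from that repo's own `Bert` class and is omitted here):

```python
from transformers import BertModel

# List every parameter name and shape so the two structures can be lined up.
hf_bert = BertModel.from_pretrained("bert-base-uncased")
for name, param in hf_bert.state_dict().items():
    print(name, tuple(param.shape))
```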
-
https://github.com/facebookresearch/TaBERT/blob/cf5351c697773573a4fd857e3dde7f66cc6e6dd9/table_bert/vertical/input_formatter.py#L82-L83
Thanks for open-sourcing the project. In the paper, you descr…
-
I am trying to do multi-class sequence classification using the BERT uncased base model and tensorflow/keras. However, I have an issue when it comes to labeling my data following the BERT wordpiece to…
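Since my question is about the alignment step, here is a minimal sketch of what I mean, assuming word-level labels (the words and labels are made up; the tokenizer call itself is framework-agnostic, so it works alongside tensorflow/keras):

```python
from transformers import BertTokenizerFast

# Propagate each word's label to its wordpieces; -100 marks special tokens
# so the loss can ignore those positions.
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
words = ["Transformers", "are", "great"]
word_labels = [1, 0, 2]  # hypothetical per-word class labels

enc = tokenizer(words, is_split_into_words=True)
aligned = [-100 if wid is None else word_labels[wid] for wid in enc.word_ids()]

print(tokenizer.convert_ids_to_tokens(enc["input_ids"]))
print(aligned)
```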
-
# 1. Model Overview
![image](https://user-images.githubusercontent.com/45033215/135703411-fe901bac-ed56-411f-932f-7ac89f4c3cd1.png)
* Training batch size: 32
* Validation batch size: 128
* Learning rate: 5e-5
* warmup_steps: 500
* …
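A sketch of the settings listed above as huggingface `TrainingArguments` (assuming the transformers `Trainer` is used; `output_dir` is a placeholder):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=32,   # training batch size
    per_device_eval_batch_size=128,   # validation batch size
    learning_rate=5e-5,
    warmup_steps=500,
)
```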
-
Hello, I'm an undergraduate trying to run the code from your paper "Attentional Encoder Network for Targeted Sentiment Classification". Thank you so much for your work, but I am having trouble getting…