-
Thanks for this amazing code base!
I am new to this codebase, especially the "pretrain from scratch" part.
1.
What kind of public dataset can I use? Am I supposed to use wiki and …
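For a concrete starting point, here is a minimal sketch of pulling one commonly used public corpus (WikiText-103) through the Hugging Face `datasets` library; the dataset choice and field name are illustrative assumptions, not something this repo prescribes:

```python
# Minimal sketch: fetch a public pretraining corpus (WikiText-103 as one example).
from datasets import load_dataset

wiki = load_dataset("wikitext", "wikitext-103-raw-v1", split="train")
print(wiki[0]["text"])  # each row holds one raw text line/paragraph
```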
-
## 🚀 Feature
The current design in torchtext presents the user with two APIs for dataset construction:
- the "raw" API, which returns the raw text data from the dataset, and
- the one-liner build…
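
As a rough illustration of the first route, here is a minimal sketch that consumes the raw iterator and builds the vocabulary by hand, assuming the AG_NEWS dataset and the `get_tokenizer` / `build_vocab_from_iterator` utilities of the newer torchtext API (this is roughly the work the one-liner would otherwise do for the user):

```python
# Sketch: use the "raw" dataset API and build the vocab manually.
from torchtext.datasets import AG_NEWS
from torchtext.data.utils import get_tokenizer
from torchtext.vocab import build_vocab_from_iterator

tokenizer = get_tokenizer("basic_english")
train_iter = AG_NEWS(split="train")  # yields (label, raw_text) pairs

vocab = build_vocab_from_iterator(
    (tokenizer(text) for _, text in train_iter),
    specials=["<unk>"],
)
vocab.set_default_index(vocab["<unk>"])
```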
-
### Feature request
Extract the spiking nature of the LLM and port that set of features over for training/inference.
https://github.com/ridgerchu/SpikeGPT
### Motivation
The benefits would r…
-
I'm checking and converting the wiki data for pre-training, just like below:
![11_06_17__01_09_2019](https://user-images.githubusercontent.com/5104916/50874643-d1441900-13ff-11e9-8193-8bd73e7960c9…
-
How can I train a BERT model from scratch?
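
There is more than one way to do this; below is a minimal sketch of one common route with Hugging Face `transformers`, where the tokenizer path `./tokenizer`, the corpus file `corpus.txt`, and all hyperparameters are placeholders rather than recommended values:

```python
# Minimal sketch: pretrain a BERT-style masked language model from scratch.
from datasets import load_dataset
from transformers import (
    BertConfig,
    BertForMaskedLM,
    BertTokenizerFast,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = BertTokenizerFast.from_pretrained("./tokenizer")  # previously trained tokenizer
model = BertForMaskedLM(BertConfig(vocab_size=tokenizer.vocab_size))  # random init, no pretrained weights

# Tokenize a plain-text corpus (one passage per line).
dataset = load_dataset("text", data_files={"train": "corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="./bert-from-scratch",
        num_train_epochs=1,
        per_device_train_batch_size=16,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
)
trainer.train()
```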
-
I tried to pretrain Longformer using transformers and datasets. But I got OOM issues with loading a large text file. My script is almost like this:
```python
from datasets import load_dataset
@…
```
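
One common way around the OOM, hedged since the full script is cut off above, is to stream the file rather than loading it in one shot; a minimal sketch, assuming the corpus is a plain-text file named `large_corpus.txt`:

```python
from datasets import load_dataset

# Streaming avoids materializing the whole file in memory:
# examples are read lazily from disk as they are consumed.
dataset = load_dataset(
    "text",
    data_files={"train": "large_corpus.txt"},
    streaming=True,
)

# The result is an IterableDataset; peek at a few rows.
for example in dataset["train"].take(3):
    print(example["text"])
```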
-
The RoBERTa corpus is a combination of multiple sources; was any form of filtering performed?
The bookcorpus dataset alone has 74M rows, but I saw that your Roberta folder is named 20M. May I ask what …
-
## In a nutshell
Since BERT, the trend has been to improve accuracy by making models larger; in contrast to that trend, this paper proposes a new model aimed at reducing the parameter count (ALBERT stands for A Lite BERT). With the same model configuration, accuracy drops, but the smaller parameter count makes it possible to scale the model up, and as a result a model with roughly the same performance as BERT-large is achieved with about 1/5 the parameters…
-
Some testing using a VPN connection to a number of points of presence (PoPs) around the world: the limit is now 50 books per IP per day, making the effort to recompile the Toronto Corpus a painful, …
-
## Paper link
https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
## Publication date (yyyy/mm/dd)
2018/06/11
## Overview
The so-called GPT-1 paper.
Unsupervised pre-training of a language model, followed by supervised fine-tuning…