-
* Intro
* motivation, why important
* Related work
* serengeti and others that did similar things
* Dataset
* What is it, difficulties and so on
* Pretrai…
-
It seems that building a pre-training corpus with whole word masking is not supported for Chinese yet.
Even when passing --do_whole_word_mask=True to create_pretraining_data.py, nothing happens.
Does anyone know ho…
-
How many epochs do we need to train for before the model generates reasonable outputs?
I am currently training the MAE model from scratch, with 4000 images (the dataset is very small though). Bu…
-
Hi, I have been trying to run run_evaluation.sh with the provided checkpoints downloaded and unzipped to the checkpoints directory. I am running into this error:
evaluate.py: error: argument -…
-
We need a table to summarize all the code LMs that we test in this project.
An example of this (from [this paper](https://arxiv.org/pdf/2303.18223.pdf)):
Some more code-related features we wo…
-
https://arxiv.org/abs/2207.03208
-
Hello, I am using the blip2_t5 model (model_type="pretrain_flant5xxl") to predict answers for a given input. I provide a list of answer candidates to the model, but the model still predicts answers th…
-
**Replace:**
Pretraining data consists of thousands, or even millions, of individual documents, often web scraped. Model knowledge and behavior will likely reflect a compression of this information…
-
### System Info
4x NVIDIA H100, TensorRT-LLM backend 0.9.0
### Who can help?
@Tracin
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
-…
-
TensorFlow: 1.12
Python 3.6
When I run run_pretraining.py, I get an error. With max_predictions_per_seq=5 there is no error, but with max_predictions_per_seq=10 the error occurs.
yw411 updated
3 years ago