-
During pretraining, the error below occurs after saving a checkpoint.
```
I0712 06:47:22.892611 140596004366080 tf_logging.py:115] [99000] | gnorm 0.71 lr 0.000001 | loss 7.25 | pplx 1408.25, bpc 10.45…
```
-
Hi Siqi,
Thanks for releasing the great code.
I cannot find the pretraining code in this repository implementing the in-batch negative examples. Could you point it out?
Also, it seem…
-
Hi,
Thank you very much for the great work, and for making your code publicly available.
I am trying to run the code to reproduce the results; however, the pre-training datasets are missing from …
-
Hey,
I am trying to train Funnel Transformer with the following hparams. The CPU usage for my TPUv3-8 has not gone above 4% in the 90 hours the code has been running, and it seems to be very slow, to…
-
I was following [DATA.md](https://github.com/OpenGVLab/Ask-Anything/blob/main/video_chat2/DATA.md) to download the pretraining dataset.
However, I cannot find `webvid_10m_train.json`, `cc12m_train.json…
-
Hi, thanks for the nice work.
I'm trying to reproduce the paper's results, but I notice that the hyperparameters you provide in this repository (in the [pretraining script](https://github.com/Hannibal046/PlugLM…
-
Hello. Thank you for your amazing work.
I ran into a problem when trying to fine-tune a pretrained model on my personal dataset (Dataset020), following the steps in documentation/pretraining_and_finetuning.…
-
The pretraining example with
```
litgpt pretrain \
--model_name pythia-14m \
--config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/pretrain/debug.yaml
```
is do…
-
**Describe the bug**
Running the pretraining of *BERT* encountered two issues:
1. "TransformerEngine only supports softmax compute in FP32". `--attention-softmax-in-fp32` needs to be added to the model ar…
-
Hi @jhclark-google and @dhgarrette,
I would like to know if there's any chance of getting the pretraining code for CANINE.
It's been a long time since the README was updated, and I'm highly intereste…