-
# My question
Why does catastrophic forgetting occur when I perform continued pre-training on Llama 3? I used open source data from BookCorpus, iterated 100,000 steps, and then after testing the trai…
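For reference, a minimal sketch of the kind of continued pre-training setup described here, using Hugging Face `transformers` and `datasets`; the model id, step count, and hyperparameters below are placeholders rather than the exact configuration of the original run.

```python
# Minimal sketch of continued pre-training on BookCorpus.
# Hyperparameters and the base checkpoint are illustrative assumptions.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "meta-llama/Meta-Llama-3-8B"   # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Newer versions of `datasets` may require the namespaced id "bookcorpus/bookcorpus".
dataset = load_dataset("bookcorpus", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])

args = TrainingArguments(
    output_dir="llama3-bookcorpus-cpt",
    max_steps=100_000,               # matches the step count mentioned above
    per_device_train_batch_size=4,
    learning_rate=2e-5,
    save_steps=10_000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```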
-
I tried running the code on a small dataset and found that pred_loss decreases quickly while avg_acc stays at 50%. This is strange to me, since a decrease in pred_loss should indicate an increase in accuracy.
![im…
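One way this can happen (a toy illustration with made-up numbers, not the poster's code): binary cross-entropy can keep falling while the thresholded predictions never change, so `avg_acc` sits at chance level even though `pred_loss` improves.

```python
# Toy example: cross-entropy loss falls while thresholded accuracy stays
# at 50%, because the probabilities never cross the 0.5 decision boundary
# for the misclassified examples. Names and numbers are illustrative only.
import numpy as np

labels = np.array([1, 0, 1, 0])

def metrics(probs, labels):
    pred_loss = -np.mean(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))
    avg_acc = np.mean((probs > 0.5).astype(int) == labels)
    return pred_loss, avg_acc

probs_early = np.array([0.45, 0.30, 0.60, 0.70])
probs_late  = np.array([0.49, 0.05, 0.95, 0.51])

print(metrics(probs_early, labels))  # ~ (0.72, 0.5)
print(metrics(probs_late, labels))   # ~ (0.38, 0.5) -- lower loss, same accuracy
```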
-
I think there is a bug in the standard sentence tokenizer `sent_tokenize`. The problem is that it does not split text into sentences in a certain case. Here is the case where the tokenizer fails to …
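The failing input is cut off above, so as a hedge, here is a sketch of one well-known class of cases where NLTK's `sent_tokenize` does not split: a sentence boundary with no whitespace after the period (the text below is made up).

```python
# Sketch reproducing one known class of sent_tokenize failures
# (missing whitespace after the period). The text is made up; the
# original issue's failing input is truncated above.
import nltk
from nltk.tokenize import sent_tokenize

nltk.download("punkt", quiet=True)

text = "The model finished training.It was then evaluated on the test set."
print(sent_tokenize(text))
# Typically prints a single sentence instead of two, because Punkt does
# not treat a period with no following whitespace as a sentence boundary.
```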
-
I am trying to train the ssd_inception_v2 model.
The training breaks with the following error:
ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1917,1] …
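This is a GPU out-of-memory error rather than a model bug; the usual first step is to lower `batch_size` in the TF Object Detection API `pipeline.config`. A hypothetical excerpt (the value shown is illustrative, not the original configuration):

```
# Hypothetical excerpt of pipeline.config for ssd_inception_v2.
# The batch_size value is illustrative; reduce it until the OOM disappears.
train_config {
  batch_size: 8
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
}
```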
-
How do I download the dolly or RoBERTa Corpus dataset?
Please give the URL. Thanks~
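A minimal sketch of how one might fetch these with the Hugging Face `datasets` library; the Hub id `databricks/databricks-dolly-15k` is the published Dolly instruction dataset, while the full RoBERTa pre-training corpus was never released as a single dataset, so only public components such as BookCorpus are loadable.

```python
# Sketch of downloading these corpora with the `datasets` library.
# Hub ids are assumptions: Dolly is published by Databricks, and the
# RoBERTa pre-training corpus exists only as its public components
# (e.g. BookCorpus, Wikipedia), not as one downloadable dataset.
from datasets import load_dataset

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")
bookcorpus = load_dataset("bookcorpus", split="train")

print(dolly[0]["instruction"])
print(bookcorpus[0]["text"])
```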
-
### Describe the bug
This bug is triggered under the following conditions:
- datasets repo ids without organization names trigger errors, such as `bookcorpus`, `gsm8k`, `wikipedia`, rather than …
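For illustration (the full error output is truncated above), a sketch contrasting the bare, organization-less ids from the report with namespaced ids; the namespaced ids below are assumptions about where these canonical datasets currently live on the Hub.

```python
# Sketch of the two id styles; the issue's full error output is truncated
# above. Namespaced ids are assumptions, not taken from the report.
from datasets import load_dataset

# Bare, organization-less ids as used in the report:
# load_dataset("bookcorpus")
# load_dataset("gsm8k", "main")

# A namespaced equivalent:
gsm8k = load_dataset("openai/gsm8k", "main", split="test")
print(gsm8k[0]["question"])
```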
-
I'm learning how to train a language model from scratch, and I was training a 120M tinyLlama model with bookcorpus. I wonder how I can evaluate the checkpoints using GLUE. I have read EVAL.md which s…
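EVAL.md is specific to that repo, but as a generic sketch of scoring a checkpoint on one GLUE task: attach a sequence-classification head, fine-tune briefly on SST-2, and call `Trainer.evaluate()`. Paths and hyperparameters below are placeholders, not the repo's recipe.

```python
# Generic sketch (not the repo's EVAL.md recipe) for scoring a causal-LM
# checkpoint on one GLUE task. The checkpoint path and hyperparameters
# are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

ckpt = "path/to/tinyllama-120m-checkpoint"   # placeholder
tokenizer = AutoTokenizer.from_pretrained(ckpt)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForSequenceClassification.from_pretrained(ckpt, num_labels=2)
model.config.pad_token_id = tokenizer.pad_token_id

sst2 = load_dataset("glue", "sst2")
encoded = sst2.map(lambda b: tokenizer(b["sentence"], truncation=True), batched=True)

args = TrainingArguments(output_dir="glue-sst2-eval",
                         per_device_train_batch_size=16,
                         per_device_eval_batch_size=32,
                         num_train_epochs=1)
trainer = Trainer(model=model, args=args,
                  train_dataset=encoded["train"],
                  eval_dataset=encoded["validation"],
                  tokenizer=tokenizer)
trainer.train()
print(trainer.evaluate())   # reports eval_loss; add a compute_metrics fn for accuracy
```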
-
Hi, I'm running into the following error when attempting to train bert with ds_train_bert_bsz64k_seq128_m.sh. I printed out all tensor shapes in the batch and it looks fine since I used train_micro_ba…
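Without the full traceback this is a guess, but batch-shape problems in the DeepSpeed BERT scripts are often a batch-size bookkeeping issue: DeepSpeed requires `train_batch_size` to equal the per-GPU micro batch size times `gradient_accumulation_steps` times the number of GPUs. A hypothetical config excerpt illustrating the relation (values are examples, not the script's defaults):

```python
# Hypothetical DeepSpeed config dict illustrating the required relation
# train_batch_size == train_micro_batch_size_per_gpu * gradient_accumulation_steps * n_gpus.
# Values are examples only, not the defaults of ds_train_bert_bsz64k_seq128_m.sh.
n_gpus = 128                 # assumed world size
micro_batch_per_gpu = 64
grad_accum_steps = 8

ds_config = {
    "train_micro_batch_size_per_gpu": micro_batch_per_gpu,
    "gradient_accumulation_steps": grad_accum_steps,
    "train_batch_size": micro_batch_per_gpu * grad_accum_steps * n_gpus,  # 65536, the "bsz64k" in the script name
}
print(ds_config["train_batch_size"])
```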
-
The Pile and its BookCorpus subset are not available. I am also unable to download this dataset for pre-training GPT. Is there any other dataset to replace it, or is there a backup of the previous dataset…
-
The link (https://the-eye.eu/public/AI/pile_neox/data/BookCorpusDataset_text_document.bin) has expired.
![image](https://github.com/microsoft/Megatron-DeepSpeed/assets/41630003/916ca98b-0324-44de-928…