-
Hi, DeepSpeed team! I am trying to run BERT pretraining with DeepSpeed. After preprocessing the wikipedia_en and bookscorpus datasets, I specified the path in bert_large_lamb_nvidia_data.json, …
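For reference, a sketch of how the dataset paths might be wired into that config; the key names and paths below are illustrative assumptions, not the actual schema of the DeepSpeed example:
```
import json

# Hypothetical fragment of bert_large_lamb_nvidia_data.json -- the real
# key names come from the DeepSpeed BERT example, not from this sketch.
data_config = {
    "datasets": {
        "wiki_pretrain_dataset": "/data/wikipedia_en/hdf5_shards",  # assumed path
        "bc_pretrain_dataset": "/data/bookscorpus/hdf5_shards",     # assumed path
    }
}

with open("bert_large_lamb_nvidia_data.json", "w") as f:
    json.dump(data_config, f, indent=2)
```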
-
Running this on my Mac (CPU only) gives:
```
Dataset bookcorpus downloaded and prepared to /Users/arhamkhan/.cache/huggingface/datasets/bookcorpus/plain_text/1.0.0/eddee3cae1cc263a431aa98207d4d27fd8…
```
-
Consider shuffling bookcorpus:
```
import nlp  # predecessor of the `datasets` library

dataset = nlp.load_dataset('bookcorpus', split='train')
dataset = dataset.shuffle()  # shuffle() returns a new dataset; reassign it
```
According to tqdm, this will take around 2.5 hours on my machine to complete (ev…
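If the slow part is writing the shuffled-indices cache to disk, one thing worth trying (a sketch, assuming your nlp version's shuffle supports the keep_in_memory flag):
```
import nlp

dataset = nlp.load_dataset('bookcorpus', split='train')
# Keep the shuffled indices mapping in memory instead of writing a large
# cache file to disk, which is often where most of the time goes.
shuffled = dataset.shuffle(seed=42, keep_in_memory=True)
```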
-
### System Info
transformers==4.29.0
### Who can help?
_No response_
### Information
- [ ] The official example scripts
- [ ] My own modified scripts
### Tasks
- [ ] An officially supported tas…
-
Hi, gpauloski. Thanks for helping.
We have recently run some experiments, and the following questions still need to be resolved.
1. (**Without K-FAC, e.g., using LAMB in phase 1 and phase 2**): In p…
-
Hi, I am trying to use LLM-Pruner on the Baichuan-13B model (https://github.com/baichuan-inc/Baichuan-13B). It is also LLaMA-structured, so I thought it would work out of the box, but I got some errors... I am…
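One detail that may be relevant: Baichuan-13B ships its architecture as custom modeling code on the Hub rather than as the stock LLaMA classes, so it loads through trust_remote_code. A minimal loading sketch (the checkpoint name comes from the Baichuan repo; whether LLM-Pruner's layer matching then works on the resulting class is exactly the open question):
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Baichuan-13B uses a custom modeling file, so trust_remote_code=True is
# required; it is not instantiated as transformers' LlamaForCausalLM even
# though the block structure is LLaMA-like.
tokenizer = AutoTokenizer.from_pretrained(
    "baichuan-inc/Baichuan-13B-Base", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "baichuan-inc/Baichuan-13B-Base",
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
print(type(model).__name__)  # a Baichuan-specific class, not LlamaForCausalLM
```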
-
Hmm, this may seem a bit excessive, but I'm confused and don't know how to preprocess the data and train a RoBERTa model. Could you provide a basic step-by-step tutorial? A sketch of one common recipe follows below.
Looks like I'm also l…
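In the meantime, here is a minimal sketch of one common recipe (train a byte-level BPE tokenizer, then pretrain RobertaForMaskedLM with the masked-LM collator); the file paths and hyperparameters are placeholders, not recommendations:
```
import os
from datasets import load_dataset
from tokenizers import ByteLevelBPETokenizer
from transformers import (
    DataCollatorForLanguageModeling, RobertaConfig, RobertaForMaskedLM,
    RobertaTokenizerFast, Trainer, TrainingArguments,
)

# 1. Train a byte-level BPE tokenizer on the raw corpus (placeholder path).
os.makedirs("tokenizer_dir", exist_ok=True)
bpe = ByteLevelBPETokenizer()
bpe.train(files=["corpus.txt"], vocab_size=30_000, min_frequency=2,
          special_tokens=["<s>", "<pad>", "</s>", "<unk>", "<mask>"])
bpe.save_model("tokenizer_dir")
tokenizer = RobertaTokenizerFast.from_pretrained("tokenizer_dir")

# 2. Tokenize the corpus.
raw = load_dataset("text", data_files={"train": "corpus.txt"})["train"]
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"],
)

# 3. Pretrain a RoBERTa-style masked LM with dynamic masking.
model = RobertaForMaskedLM(RobertaConfig(vocab_size=30_000))
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True,
                                           mlm_probability=0.15)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="roberta_out",
                           per_device_train_batch_size=8,
                           num_train_epochs=1),
    data_collator=collator,
    train_dataset=tokenized,
)
trainer.train()
```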
-
Wonderful work, and thanks very much for your contribution!
I'm running step 3.1 of the corpus processing with the following command:
```
bash scripts/tools/process_full_doc_data_gpt2.sh ${BASE…
```
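For intuition, full-document preprocessing for GPT-2 typically means tokenizing each document and appending the end-of-text token before packing into training shards. A minimal sketch of that idea (an assumption about what this step does, not the actual contents of process_full_doc_data_gpt2.sh):
```
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

def encode_document(doc: str) -> list[int]:
    # Tokenize a full document and terminate it with GPT-2's end-of-text
    # token so document boundaries survive concatenation into shards.
    return tokenizer.encode(doc) + [tokenizer.eos_token_id]

token_ids = encode_document("An example document.")
```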
-
For many of the configs in https://huggingface.co/datasets/sil-ai/bloom-speech, we get `PreviousStepFormatError`.
-