-
Hello author. I have a few questions and hope you can kindly advise.
1. The pre-train section opens with a pile of hyperparameters (which file are these hyperparameters defined in?), and its last line appears to be the training command with a bunch of arguments. What command exactly should I enter to proceed with the run?
2. My server has only one GPU. To run your code, do I need to change some configuration? If so, which parameters exactly?
3. Which file contains the dataset path? I couldn't find it.
4. "we use the English Wiki…
-
```
  File "scripts/training/run_pretraining.py", line 465, in preprocess_images
    examples["pixel_values"] = [transforms(image) for image in examples[image_column_name]]  # bytes, path
  File "s…
```
-
We are trying to evaluate the pretrained model on data from Gutenberg (around 3,000 books), but we are not able to get anywhere close to the result from the paper, where you achieved 97-98% for the Next S…
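For reference, here is a minimal sketch of scoring a sentence pair with the Hugging Face NSP head, assuming the truncated metric is Next Sentence Prediction accuracy; the checkpoint name and example sentences are placeholders:

```python
import torch
from transformers import BertForNextSentencePrediction, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")
model.eval()

# Made-up sentence pair standing in for two consecutive Gutenberg sentences.
sent_a = "The old man walked slowly to the harbor."
sent_b = "He watched the fishing boats come in at dusk."

inputs = tokenizer(sent_a, sent_b, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# In the Hugging Face head, index 0 means "B follows A", index 1 means random.
prob_is_next = torch.softmax(logits, dim=-1)[0, 0].item()
print(f"P(is next sentence) = {prob_is_next:.3f}")
```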
-
I'm training on large datasets such as Wikipedia and BookCorpus. Following the instructions in [the tutorial notebook](https://colab.research.google.com/github/huggingface/nlp/blob/master/notebooks/Ov…
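As a rough sketch of combining the two corpora with the `datasets` library (the successor to `nlp` mentioned in the notebook link); the Wikipedia config name is an example and depends on the installed version:

```python
from datasets import concatenate_datasets, load_dataset

wiki = load_dataset("wikipedia", "20220301.en", split="train")  # example config
books = load_dataset("bookcorpus", split="train")

# Keep only the shared "text" column so the two schemas line up before merging.
wiki = wiki.remove_columns([c for c in wiki.column_names if c != "text"])
combined = concatenate_datasets([wiki, books])
print(combined)
```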
-
Hello,
I would like to confirm how the number of training steps, and hence the number of epochs, used for pretraining BERT in the paper was calculated.
From the paper, I deduced (kindly correct me …
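For what it's worth, a back-of-the-envelope check using the figures quoted in the BERT paper (batch size 256 sequences of 512 tokens, 1M steps, a ~3.3B-word corpus) lands on the roughly 40 epochs the paper reports:

```python
# Figures quoted in the BERT paper (Devlin et al., 2019).
batch_size = 256            # sequences per training step
seq_len = 512               # tokens per sequence
train_steps = 1_000_000
corpus_size = 3.3e9         # words: BooksCorpus (~0.8B) + English Wikipedia (~2.5B)

tokens_seen = batch_size * seq_len * train_steps    # ~1.31e11 tokens
epochs = tokens_seen / corpus_size                  # ~40, as the paper states
print(f"approx. {epochs:.1f} epochs")
# Note: wordpiece tokens and words are counted differently, so this is rough.
```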
-
**Is your feature request related to a problem? Please describe.**
It would be great, if possible, to further improve read performance of raw encoded datasets and their subsequent conversion to torch…
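On the torch-conversion side, one existing option in `datasets` is `set_format`, which makes indexing return tensors backed by the Arrow cache instead of Python lists converted on every access. A small sketch, with placeholder dataset and tokenizer names:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")            # placeholder model
ds = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")   # placeholder data

ds = ds.map(
    lambda ex: tok(ex["text"], truncation=True, padding="max_length", max_length=128),
    batched=True,
    remove_columns=["text"],
)

# After set_format, __getitem__ hands back torch tensors directly.
ds.set_format(type="torch", columns=["input_ids", "attention_mask"])
batch = ds[:32]  # dict of torch.Tensor
```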
-
I have generated pretraining data using [kamalkraj/ALBERT-TF2.0](https://github.com/kamalkraj/ALBERT-TF2.0), because it supports multi-GPU training. I am doing this for the Hindi language with 22 GB of data. Gener…
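For context, multi-GPU data parallelism in TF2 generally goes through `tf.distribute.MirroredStrategy`; the generic sketch below is the standard pattern, not the ALBERT-TF2.0 repo's actual training loop:

```python
import tensorflow as tf

# Standard TF2 data-parallel setup; replicas split each batch across GPUs.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    # Toy model standing in for ALBERT; build and compile inside the scope.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(2),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
```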
-
When trying to use the wikipedia dataset, I get this bug:
```
Traceback (most recent call last):
  File "C:\repos\pmi_masking\create_pmi_masking_vocab.py", line 188, in <module>
    main()
  File "C:\repos\pmi_…
```
-
I'm checking and converting the wiki data for pre-training, just like below:
![11_06_17__01_09_2019](https://user-images.githubusercontent.com/5104916/50874643-d1441900-13ff-11e9-8193-8bd73e7960c9…