-
It seems I cannot get Docker to correctly access the DNS system and resolve IP addresses.
Thus I have had to run the data downloads manually.
However, I cannot find the download_files.py script neede…
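When a container cannot resolve hostnames, a common workaround is to point it at an explicit DNS server (e.g. `docker run --dns 8.8.8.8 …`, or a `"dns"` entry in `/etc/docker/daemon.json`). A minimal sketch for diagnosing this from inside the container — `can_resolve` is a hypothetical helper, not part of the repository:

```python
# Sketch: check whether DNS resolution works inside the container.
# Run this in the container before starting the downloads; if it
# returns False for a known-good host, the container's DNS is broken
# and the --dns workaround above is worth trying.
import socket

def can_resolve(host: str) -> bool:
    """Return True if `host` resolves to at least one address."""
    try:
        socket.getaddrinfo(host, 443)
        return True
    except socket.gaierror:
        return False
```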
-
Could this official repository https://github.com/tensorflow/tensor2tensor support BERT?
daiwk updated 5 years ago
-
Is it possible to sort the downloaded files author-wise here?
Thanks!
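If the download list (e.g. a `url_list.jsonl`) carries author metadata, grouping files by author is straightforward. A sketch under that assumption — the `"author"` and `"file_name"` keys are hypothetical and should be adjusted to the actual record fields:

```python
# Sketch: group downloaded files by author, assuming each line of the
# JSONL download list is an object with (hypothetical) "author" and
# "file_name" fields. Records missing an author go under "unknown".
import json
from collections import defaultdict

def group_by_author(jsonl_lines):
    """Map author name -> list of file names, from JSONL records."""
    by_author = defaultdict(list)
    for line in jsonl_lines:
        record = json.loads(line)
        by_author[record.get("author", "unknown")].append(record.get("file_name"))
    return dict(by_author)
```

The resulting mapping can then drive moving each file into a per-author directory.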
-
Here are the files I see after going through data download instructions in https://github.com/NVIDIA/DeepLearningExamples/tree/master/TensorFlow/LanguageModeling/BERT
The wikipedia directory seems …
-
Hi all,
I am trying to generate the pretraining corpus for BERT with pregenerate_training_data.py. The BERT paper reports about 6M+ instances (segment A + segment B, fewer than 512 tokens), but I get 18M…
-
example:
python3.6 download_files.py --list url_list.jsonl --out out_txts --trash-bad-count
0 files had already been saved in out_txts.
File is not a zip file …
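"File is not a zip file" is the message of Python's `zipfile.BadZipFile`, typically raised when a download was truncated or the server returned an HTML error page instead of the archive. A minimal sketch of skipping such files rather than aborting the whole run (`extract_or_skip` is a hypothetical helper, not the script's actual API):

```python
# Sketch: extract a downloaded archive, skipping corrupt ones instead
# of letting zipfile.BadZipFile abort the batch. A False return marks
# the file as a candidate for re-download (cf. --trash-bad-count).
import zipfile

def extract_or_skip(path, out_dir):
    """Extract a zip archive; return False (and skip) if it is corrupt."""
    try:
        with zipfile.ZipFile(path) as zf:
            zf.extractall(out_dir)
        return True
    except zipfile.BadZipFile:
        return False
```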
-
BookCorpus (http://yknzhu.wixsite.com/mbweb) no longer provides the dataset, and I have not found it online. Do you still have a backup of the dataset? Could you send me a copy of the data?
funqc updated 5 years ago
-
Hi, I have some questions about the details of the Chinese BERT-Base model.
1. Is the model trained on the entire Chinese Wikipedia raw text?
2. Are there additional pre-processing steps for the raw corp…
-
Hi, thanks for your code, it's really useful for most NLP researchers, and thank you again.
When I run this code, it is often interrupted by a network error after downloading a few files. I thought …
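Transient network errors mid-run can be absorbed with a retry-with-backoff wrapper around the per-file download call. A sketch — `fetch` here is a stand-in for whatever download function the script actually uses, not the repository's real API:

```python
# Sketch: retry a per-file download with exponential backoff so one
# transient network error does not abort the whole batch. `fetch` is
# a hypothetical callable that downloads one URL and may raise OSError.
import time

def download_with_retry(fetch, url, retries=3, backoff=2.0):
    """Call fetch(url), retrying with exponential backoff on OSError."""
    for attempt in range(retries):
        try:
            return fetch(url)
        except OSError:
            if attempt == retries - 1:
                raise  # out of retries: surface the error
            time.sleep(backoff * (2 ** attempt))
```

Pairing this with a check for files already present in the output directory makes the run resumable after an interruption.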
-
Hi, thank you very much for the implementation!
I'm trying to compare your implementation with the official TF BERT head-to-head with the Gutenberg dataset (since the BookCorpus dataset is no longe…