Open hhou435 opened 2 years ago
Word-based pretraining with sentencepiece
python3 preprocess.py --corpus_path corpora/book_review.txt \
                      --spm_model_path models/cluecorpussmall_spm.model \
                      --dataset_path book_review_word_sentencepiece_dataset.pt \
                      --processes_num 8 --seq_length 128 --dynamic_masking \
                      --data_processor mlm

python3 pretrain.py --dataset_path book_review_word_sentencepiece_dataset.pt \
                    --spm_model_path models/cluecorpussmall_spm.model \
                    --output_model_path models/book_review_word_sentencepiece_model.bin \
                    --world_size 8 --gpu_ranks 0 1 2 3 4 5 6 7 \
                    --total_steps 5000 --save_checkpoint_steps 2500 --report_steps 500 \
                    --learning_rate 1e-4 --batch_size 64 \
                    --tie_weights
Running the commands above reports the following error:
Segmentation fault