microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
https://aka.ms/GeneralAI
MIT License

LayoutLM on Google Colab #234

Open lindeberg25 opened 4 years ago

lindeberg25 commented 4 years ago

I'm trying to run LayoutLM on Google Colab. I set up the environment with:

%%bash
source /usr/local/etc/profile.d/conda.sh
conda activate layoutlm
conda install pytorch==1.4.0 cudatoolkit=10.1 -c pytorch --yes

I can confirm that the dependencies have been installed:

!conda list -n layoutlm

pytorch        1.4.0
cudatoolkit    10.1.243

I also installed torch with pip, since apex requires it (!pip install torch===1.4.0).

However, when I run:

%cd "/content/drive/My Drive/Colab Notebooks/unilm/layoutlm/examples/seq_labeling"

!python run_seq_labeling.py --data_dir ../data \
    --model_type layoutlm \
    --model_name_or_path ../layoutlm-base-uncased \
    --do_lower_case \
    --max_seq_length 512 \
    --do_train \
    --num_train_epochs 100.0 \
    --logging_steps 10 \
    --save_steps -1 \
    --output_dir output13 \
    --labels ../data/labels.txt \
    --per_gpu_train_batch_size 16 \
    --per_gpu_eval_batch_size 16 \
    --fp16

I got this error:

RuntimeError: version_ <= kMaxSupportedFileFormatVersion INTERNAL ASSERT FAILED at /pytorch/caffe2/serialize/inline_container.cc:132, please report a bug to PyTorch. Attempted to read a PyTorch file with version 3, but the maximum supported version for reading is 2. Your PyTorch installation may be too old. (init at /pytorch/caffe2/serialize/inline_container.cc:132)
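The message says the file was written in format version 3, while the installed PyTorch reads at most version 2. For anyone hitting the same assertion, here is a minimal sketch for checking which format version a checkpoint declares without importing torch. It assumes the checkpoint is a zip archive containing a small text record named `version` (for example `archive/version`); that layout is an assumption about how PyTorch writes these files, and the function name `checkpoint_format_version` is made up for illustration.

```python
import sys
import zipfile


def checkpoint_format_version(path):
    """Return the integer stored in the archive's 'version' record, or None.

    Assumption: zip-based checkpoints keep a text record called 'version'
    either at the root or one folder down (e.g. 'archive/version').
    """
    if not zipfile.is_zipfile(path):
        # Not a zip archive, so there is no version record to read.
        return None
    with zipfile.ZipFile(path) as zf:
        for name in zf.namelist():
            if name == "version" or name.endswith("/version"):
                try:
                    return int(zf.read(name).decode("utf-8").strip())
                except ValueError:
                    # Record exists but does not hold a plain integer.
                    continue
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_version.py <path to the weights file>
    print(checkpoint_format_version(sys.argv[1]))
```

Pointing it at the weights file inside ../layoutlm-base-uncased should show whether the file itself declares a newer format than the installed PyTorch accepts.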

wolfshow commented 4 years ago

@lindeberg25, you may refer to a starter page from Kaggle (https://www.kaggle.com/jpmiller/layoutlm)

kbrajwani commented 4 years ago

@wolfshow @ranpox the Kaggle starter notebook is not working. Could you please provide a Colab notebook? It would be much appreciated.

lindeberg25 commented 4 years ago

Thanks for replying, guys!

I've solved the issue by installing torch==1.5.0+cu101 (!pip install torch==1.5.0+cu101) instead of pytorch==1.4.0.
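Since the fix came down to which torch version the script actually imports, a quick sanity check before launching training can save a run. The sketch below compares the installed version against a minimum; the helper names `parse_version` and `is_new_enough` are hypothetical, and the 1.5.0 default is taken only from the fix reported in this thread, not from any documented requirement.

```python
import re


def parse_version(version):
    """Turn a string like '1.5.0+cu101' into (1, 5, 0).

    The local build tag after '+' (e.g. 'cu101') is ignored.
    """
    public = version.split("+", 1)[0]
    return tuple(int(part) for part in re.findall(r"\d+", public)[:3])


def is_new_enough(installed, minimum="1.5.0"):
    """Return True if `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this interpreter")
    else:
        # Report the version that this interpreter really picks up.
        print(torch.__version__, is_new_enough(torch.__version__))
```

Run it with the same interpreter that launches run_seq_labeling.py (here, `!python`), because the conda environment and the pip install may not point at the same torch.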