Hi,
When I run inference/prediction for the LayoutLM sequence labelling task on custom test data, I receive an assertion error. The same error occurs when I run evaluation on my custom test data. With the default training and test sets, everything runs smoothly. Could anyone please advise me on the correct steps, as I am currently stuck? I have tried several methods, all of which result in the same error.
Please note that I have made some minor edits to run_seq_labeling.py, based on the following issue: https://github.com/microsoft/unilm/issues/152
Method 1a)
Step 1) Train. Please note that the folder "data" contains the training data and the original testing data.
python run_seq_labeling.py --data_dir data \
    --model_type layoutlm \
    --model_name_or_path path/to/pretrained/model/directory \
    --do_lower_case \
    --max_seq_length 512 \
    --do_train \
    --num_train_epochs 100.0 \
    --logging_steps 10 \
    --save_steps -1 \
    --output_dir path/to/output/directory \
    --labels data/labels.txt \
    --per_gpu_train_batch_size 16 \
    --per_gpu_eval_batch_size 16 \
    --fp16
Step 2) Run inference/prediction. Please note that the folder "data1" contains custom testing data.
python run_seq_labeling.py --do_predict \
    --data_dir data1 \
    --model_type layoutlm \
    --model_name_or_path output \
    --do_lower_case \
    --output_dir predictions1 \
    --labels data1/labels.txt
Result: Assertion Error
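Since the training command above reads data/labels.txt while the prediction command reads data1/labels.txt, one thing that may be worth ruling out is a mismatch between the two label files. Below is a minimal sketch of such a check, assuming each file holds one label per line. The helper names `read_labels` and `compare_label_files` are hypothetical and are not part of run_seq_labeling.py.

```python
from pathlib import Path


def read_labels(path):
    """Return the non-empty, stripped lines of a label file, in file order."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def compare_label_files(train_labels_path, test_labels_path):
    """Report labels found in only one file and whether the order matches.

    Assumption: one label per line in both files.
    """
    train = read_labels(train_labels_path)
    test = read_labels(test_labels_path)
    return {
        "only_in_train": sorted(set(train) - set(test)),
        "only_in_test": sorted(set(test) - set(train)),
        "same_order": train == test,
    }
```

For the folders in this post, the call would look like `compare_label_files("data/labels.txt", "data1/labels.txt")`. Two empty lists and `same_order` set to `True` would suggest the label files are not the cause.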
Method 1b)
Step 1) Train
Step 2) Evaluate
python run_seq_labeling.py --data_dir infer_data \
    --model_type layoutlm \
    --model_name_or_path output_method1 \
    --do_lower_case \
    --do_eval \
    --output_dir output_eval1 \
    --labels infer_data/labels.txt
Result: Assertion Error
Step 3) Infer (can't do this step due to the error in Step 2)
Method 2a:
Step 1) Train, with the testing data in data_dir REPLACED by the infer (custom) data.
python run_seq_labeling.py --data_dir data_method2 \
    --model_type layoutlm \
    --model_name_or_path model \
    --do_lower_case \
    --max_seq_length 512 \
    --do_train \
    --num_train_epochs 10.0 \
    --logging_steps 10 \
    --save_steps -1 \
    --output_dir output_method2 \
    --labels data_method2/labels.txt \
    --per_gpu_train_batch_size 16 \
    --per_gpu_eval_batch_size 16 \
    --fp16
Step 2) Run Inference/Prediction
python run_seq_labeling.py --do_predict \
    --data_dir infer_data \
    --model_type layoutlm \
    --model_name_or_path output_method2 \
    --do_lower_case \
    --output_dir pred_m2_no_eval \
    --labels data1/labels.txt \
    --fp16
Result: Assertion Error
Method 2b:
Step 1) Train. Same as Method 2a
Step 2) Run Evaluation
python run_seq_labeling.py --data_dir data_method2 \
    --model_type layoutlm \
    --model_name_or_path output_method2 \
    --do_lower_case \
    --do_eval \
    --output_dir output_eval2 \
    --labels infer_data/labels.txt
Result: Assertion Error
Step 3) Inference. Unable to proceed to this step.
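Because the error only appears with the custom data, a format check on that data might help narrow things down. The sketch below assumes the custom data follows the same FUNSD-style layout as the default data, which this post does not confirm: `<mode>.txt` holds `token<TAB>label` lines, `<mode>_box.txt` and `<mode>_image.txt` repeat the token in their first tab-separated column, and blank lines separate documents. The function `check_split` is a hypothetical helper, not part of the LayoutLM code.

```python
from pathlib import Path


def read_lines(path):
    """Read a text file and return its lines without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def check_split(data_dir, mode="test", labels_file="labels.txt"):
    """Return a list of human-readable problems found in one data split.

    Assumed layout (not confirmed by the post): <mode>.txt, <mode>_box.txt
    and <mode>_image.txt are line-aligned and tab-separated.
    """
    data_dir = Path(data_dir)
    paths = [data_dir / f"{mode}{suffix}.txt" for suffix in ("", "_box", "_image")]
    problems = [f"missing file: {p.name}" for p in paths if not p.is_file()]
    if problems:
        return problems

    words, boxes, images = (read_lines(p) for p in paths)
    if not (len(words) == len(boxes) == len(images)):
        problems.append(
            f"line counts differ: {len(words)}, {len(boxes)}, {len(images)}"
        )

    # Load the known labels if a label file is present; otherwise skip that check.
    labels_path = data_dir / labels_file
    known = None
    if labels_path.is_file():
        known = {l.strip() for l in read_lines(labels_path) if l.strip()}

    for n, (w, b, i) in enumerate(zip(words, boxes, images), start=1):
        if not w.strip():
            continue  # blank line: assumed to separate documents
        w_cols, b_cols, i_cols = w.split("\t"), b.split("\t"), i.split("\t")
        if len(w_cols) != 2:
            problems.append(f"line {n}: expected 'token<TAB>label' in {paths[0].name}")
            continue
        if w_cols[0] != b_cols[0] or w_cols[0] != i_cols[0]:
            problems.append(f"line {n}: token differs across files")
        if known is not None and w_cols[1].strip() not in known:
            problems.append(f"line {n}: label {w_cols[1].strip()!r} not in {labels_file}")
    return problems
```

Running `check_split("data1", mode="test")` and `check_split("infer_data", mode="test")` would list any misaligned lines or unknown labels. An empty list only means these particular checks passed, so the full traceback is still needed to see which assertion actually fails.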