Open Renalyn-pan opened 1 year ago
I used the image caption script and set the task to ocr.
I rewrote the script as follows:
export MASTER_PORT=1081
export CUDA_VISIBLE_DEVICES=0
export GPUS_PER_NODE=1
user_dir= ../../ofa_module
bpe_dir=../../utils/BERT_CN_dict
data= ../../dataset/test/data.mdb
path= ../../weights/caption_cn_large.pt
result_path= ../../result/ocr
selected_cols=1,4,2
split='test'
python -m torch.distributed.launch --nproc_per_node=${GPUS_PER_NODE} --master_port=${MASTER_PORT} ../../evaluate.py
${data}
--path=${path}
--user-dir=${user_dir}
--task=ocr
--batch-size=16
--log-format=simple
--log-interval=10
--seed=7
--gen-subset=${split}
--results-path=${result_path}
--beam=8
--max-len-b=128
--no-repeat-ngram-size=3
--num-workers=0
--model-overrides="{"data":"${data}","bpe_dir":"${bpe_dir}","eval_cider":False,"selected_cols":"${selected_cols}"}"
The error is `evaluate.py: error: the following arguments are required: data`. It cannot read the data of the BCTR dataset (lmdb). Could I refer to your script?
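One possible cause, judging only from the script as posted: in a POSIX shell, an assignment written with a space after `=` (as in `data= ../../dataset/test/data.mdb`) does not set the variable. The shell instead tries to run the right-hand side as a command, with the variable set to an empty string for that command only. `${data}` then expands to nothing, so `evaluate.py` receives no positional `data` argument, which would match the error above regardless of the data format. The sketch below illustrates the behavior; `count_args` is a hypothetical stand-in for `evaluate.py`, and `true` stands in for the path so the demo does not fail.

```shell
#!/usr/bin/env bash
unset data

# A space after '=' makes the shell run the right-hand side as a command.
# The variable is empty, and only visible to that one command.
data= true
echo "with space:    data='${data}'"

# count_args is a hypothetical stand-in for evaluate.py:
# it only reports how many arguments it received.
count_args() { echo "$#"; }

# The unquoted empty expansion disappears, so the positional argument is missing.
n_missing=$(count_args ${data} --task=ocr)
echo "arguments received with the broken assignment: ${n_missing}"

# No space after '=': an ordinary assignment that persists in the script.
data=../../dataset/test/data.tsv
echo "without space: data='${data}'"

n_present=$(count_args ${data} --task=ocr)
echo "arguments received with the fixed assignment:  ${n_present}"
```

The same symptom would appear if the lines after `../../evaluate.py` are not joined with trailing backslashes, since the shell would then run `evaluate.py` with no arguments and treat each following line as a separate command. Whether either applies here depends on how the script looks on disk, which the post does not show.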
You should change the data format to TSV. Refer to lmdbreader.py.
(Reply sent by email, quoting Renalyn's comment of Sat, Feb 18, 2023, 8:53 PM on OFA-Sys/OFA Issue #350, "About finetunig on OFA-OCR".)
What is the format? Something like `img_key \t imgbuf \t label_key \t label`? I see that the pretraining dataset MUGE uses base64. Should I use that encoding?
I tried the TSV format, but the error is still: `evaluate.py: error: the following arguments are required: data`
Hello! How can I convert my images to the TSV format? Thank you very much!
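A minimal sketch of one way to build such a file, assuming each row holds an id, a label, and the image bytes encoded as base64 on a single line. The column order, the `make_row` helper, and the file names are all hypothetical; the thread does not confirm the layout the OCR task expects, so check it against the dataset class and the `selected_cols` setting before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one TSV row "id<TAB>label<TAB>base64(image bytes)".
# The column order is an assumption, not the confirmed OFA-OCR layout.
make_row() {
  local id="$1" label="$2" image_path="$3"
  local b64
  # base64 may wrap its output; strip newlines so the row stays on one line.
  b64=$(base64 < "$image_path" | tr -d '\n')
  printf '%s\t%s\t%s\n' "$id" "$label" "$b64"
}

# Demo with a stand-in file; real use would loop over image files and labels.
tmp_dir=$(mktemp -d)
printf 'not-a-real-image' > "$tmp_dir/0001.jpg"
make_row 0001 "hello" "$tmp_dir/0001.jpg" > "$tmp_dir/test.tsv"

# Sanity check: every row should have exactly three tab-separated columns.
n_cols=$(awk -F'\t' '{ print NF }' "$tmp_dir/test.tsv")
echo "columns per row: ${n_cols}"
```

Labels that contain tabs or newlines would break the row structure, so they need to be cleaned or escaped before writing.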
Hi, thank you for releasing this great project! The performance of the demo is surprisingly good. I want to finetune on my own OCR datasets, but I didn't find any run_scripts for finetuning or inference like those for the other downstream tasks. Will you provide a tutorial on this in the future? I'm very eager for it. Thanks again!