ZeningLin / ViBERTgrid-PyTorch

An unofficial PyTorch implementation of "Lin et al. ViBERTgrid: A Jointly Trained Multi-Modal 2D Document Representation for Key Information Extraction from Documents. ICDAR, 2021"

Hi, could you share example configs for the FUNSD dataset? #11

Closed by llf10811020205 2 years ago

ZeningLin commented 2 years ago

I have sent it to your email privately; please check your inbox.

llf10811020205 commented 2 years ago

I have sent it to your email privately; please check your inbox.

Thanks!

r121196 commented 1 year ago

Can you please share the example config for the FUNSD dataset?

ZeningLin commented 1 year ago

comment: " FUNSD resnet-34-pretrained "
############################################

device: 'cuda'
syncBN: True

start_epoch: 0
end_epoch: 33
batch_size: 2

optimizer_cnn_hyp: # SGD -> CNN
 learning_rate: 0.008
 min_learning_rate: 0.00001
 warm_up_epoches: 1
 warm_up_init_lr: 0.00001
 momentum: 0.9
 weight_decay: 0.005
 min_weight_decay: 0.005

optimizer_bert_hyp: # AdamW -> BERT
 learning_rate: 0.000005
 min_learning_rate: 0.0000001
 warm_up_epoches: 1
 warm_up_init_lr: 0.0000001
 beta1: 0.9
 beta2: 0.999
 epsilon: 0.00000001
 weight_decay: 0.01
 min_weight_decay: 0.01

#############################################
num_hard_positive_main_1: 16
num_hard_negative_main_1: 16
num_hard_positive_main_2: 32
num_hard_negative_main_2: 32
loss_aux_sample_list:
 - 256
 - 512
 - 256
num_hard_positive_aux: 256
num_hard_negative_aux: 256
ohem_random: True
#############################################

##################################
classifier_mode: "simp"
eval_mode: "seqeval"
tag_mode: "B"
#################################

save_top: "./weights/"
save_log: "./log/"

amp: True
weights: ''
num_workers: 0
#########################################################
data_root: "/dir/to/FUNSD"
#########################################################

# FUNSD
# #######################################
num_classes: 4
########################################
image_mean:
 - 0.9480
 - 0.9480
 - 0.9480
image_std:
 - 0.1840
 - 0.1840
 - 0.1840

image_min_size:
 - 320
 - 416
 - 512
 - 608
 - 704

image_max_size: 800
test_image_min_size: 512

#########################################################
bert_version: "bert-base-uncased"  # FUNSD
backbone: "resnet_34_fpn_pretrained"
########################################################
grid_mode: "mean"
early_fusion_downsampling_ratio: 8
roi_shape: 7
p_fuse_downsampling_ratio: 4
roi_align_output_reshape: False
late_fusion_fuse_embedding_channel: 1024
layer_mode: "single"
add_pos_neg: True

###########################
loss_weights: 
# - 1
# - 1
# - 1.5
# - 1
# - 1.5

# - 0
# - 4.906
# - 5.372
# - 2.002
# - 5.373

loss_control_lambda: 1
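
To make the two optimizer blocks above easier to read, here is a minimal sketch of how `learning_rate`, `min_learning_rate`, `warm_up_epoches` and `warm_up_init_lr` could combine into a per-epoch learning rate. The linear warm-up followed by cosine decay is an assumption made for illustration, and `lr_at_epoch` is a hypothetical helper; the repository's actual scheduler may differ.

```python
import math


def lr_at_epoch(epoch, end_epoch, base_lr, min_lr, warmup_epochs, warmup_init_lr):
    """Hypothetical schedule: linear warm-up, then cosine decay to min_lr.

    This is an assumed interpretation of the config keys, not the
    repository's confirmed behaviour.
    """
    if epoch < warmup_epochs:
        # Ramp linearly from warmup_init_lr towards base_lr.
        frac = epoch / warmup_epochs
        return warmup_init_lr + frac * (base_lr - warmup_init_lr)
    # Cosine decay from base_lr (end of warm-up) to min_lr (end_epoch).
    progress = (epoch - warmup_epochs) / max(1, end_epoch - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))


# Values copied from the config above.
END_EPOCH = 33
CNN_HYP = dict(base_lr=0.008, min_lr=0.00001, warmup_epochs=1, warmup_init_lr=0.00001)
BERT_HYP = dict(base_lr=0.000005, min_lr=0.0000001, warmup_epochs=1, warmup_init_lr=0.0000001)

if __name__ == "__main__":
    for epoch in (0, 1, 17, 33):
        cnn_lr = lr_at_epoch(epoch, END_EPOCH, **CNN_HYP)
        bert_lr = lr_at_epoch(epoch, END_EPOCH, **BERT_HYP)
        print(f"epoch {epoch:2d}: cnn lr = {cnn_lr:.6g}, bert lr = {bert_lr:.6g}")
```

Under this assumed schedule, the CNN branch (SGD) peaks at a learning rate 1600 times larger than the BERT branch (AdamW), which is why the config keeps two separate optimizer sections.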

r121196 commented 1 year ago

Thank you! Do you have a model pretrained on the FUNSD dataset?

ZeningLin commented 1 year ago

I'm sorry, but I can't provide a pre-trained FUNSD model. The weights are stored on a server that I no longer have access to after changing positions. You can train the model yourself with the configuration above; it won't take long.