ibm-aur-nlp / PubLayNet


Ambiguity with label mapping #42

Open SAIVENKATARAJU opened 2 years ago

SAIVENKATARAJU commented 2 years ago

Hi, I am currently fine-tuning Layout Parser on my custom dataset, using PubLayNet/faster_rcnn_R_50_FPN_3x as my base model. According to this model, the output label set is {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}, but for my own PDFs I only want to use "Title", "Section", "Paragraph", "ListItem", "PageNumber", and "Table". My question is: what should the order of the label mapping be? Also, the pre-trained model already detects tables in my custom data pretty well, and I don't want to ruin that. Can you please suggest how I should proceed?

opyate commented 1 year ago

See this doc - it has instructions for custom label mappings.
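
For what it's worth, here is a minimal sketch of one way to derive such a mapping. It assumes the custom annotations are in COCO format and that the training pipeline remaps category ids to contiguous class indices in ascending id order, which is how Detectron2's COCO loader behaves as far as I know. The category ids below and the `build_label_map` helper are hypothetical, made up for illustration; only the six label names come from the question.

```python
# Hypothetical COCO-style "categories" list for the custom dataset.
# The ids (and therefore the order) are illustrative assumptions.
CUSTOM_CATEGORIES = [
    {"id": 1, "name": "Title"},
    {"id": 2, "name": "Section"},
    {"id": 3, "name": "Paragraph"},
    {"id": 4, "name": "ListItem"},
    {"id": 5, "name": "PageNumber"},
    {"id": 6, "name": "Table"},
]


def build_label_map(categories):
    """Map contiguous class indices (0..N-1) to names, ordered by category id."""
    ordered = sorted(categories, key=lambda c: c["id"])
    return {index: c["name"] for index, c in enumerate(ordered)}


label_map = build_label_map(CUSTOM_CATEGORIES)
print(label_map)
```

Under those assumptions, the order of the mapping is determined by the category ids in your own annotation file, not by the PubLayNet label set. The resulting dict could then be passed as `label_map` when loading the fine-tuned weights with `lp.Detectron2LayoutModel`, with `MODEL.ROI_HEADS.NUM_CLASSES` set to 6 in the training config. Note that with a different number of classes the pre-trained classification head most likely cannot be reused as-is, so it is worth re-checking table detection on held-out pages after fine-tuning.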