Layout-Parser / layout-parser

A Unified Toolkit for Deep Learning Based Document Image Analysis
https://layout-parser.github.io/
Apache License 2.0

Fine tuning on Custom Dataset while using the pre-trained weights (with different classes than the original model) #151

Open deshwalmahesh opened 1 year ago

deshwalmahesh commented 1 year ago

Before someone sends me to the model training repo, please let me just explain.

I want to fine-tune an existing model, say PubLayNet/faster_rcnn_R_50_FPN_3x, for my own task, BUT for a single class only, e.g. text detection, where the original label mapping is {0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"}.

Or maybe on HJDataset where classes are {1:"Page Frame", 2:"Row", 3:"Title Region", 4:"Text Region", 5:"Title", 6:"Subtitle", 7:"Other"}.

I found this Kaggle Notebook on fine-tuning with Detectron2, but the problem, as described above, is that I just want to train on 1 class.

What changes would I have to make? What would thing_classes look like? For example:

thing_classes = ["text"]  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1

thing_classes = ["None", "text"]  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2

thing_classes = ["text", "None", "None", "None", "None"]  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 5

thing_classes = ["None", "text", "None", "None", "None", "None"]  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 6

thing_classes = ["text", "None", "None", "None", "None", "None"]  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 6

Also, would it be any different if I use the Layout Parser model config for faster_rcnn_R_50_FPN_3x instead of the default one from Detectron2/configs/COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml?

Thanks in advance.

AmmarNassan commented 1 year ago

Use: thing_classes = ["None", "text"]  # cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2

For more details, see https://layout-parser.readthedocs.io/en/latest/example/deep_layout_parsing/index.html
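
For readers who want to try the suggestion above, here is a minimal sketch of how annotations could be reduced to a single class so that every category_id matches the position of its name in thing_classes. It assumes COCO-style annotations held in a plain dict; the helper name keep_single_class and the toy ids are hypothetical and come neither from the thread nor from Detectron2 or Layout Parser.

```python
from typing import Any, Dict, List


def keep_single_class(coco: Dict[str, Any], keep_name: str,
                      thing_classes: List[str]) -> Dict[str, Any]:
    """Return a copy of a COCO-style dict that keeps only `keep_name`.

    Each kept annotation gets a category_id equal to the position of
    `keep_name` in `thing_classes`, so ids and class names stay aligned.
    """
    new_id = thing_classes.index(keep_name)  # raises ValueError if absent
    old_ids = {c["id"] for c in coco["categories"] if c["name"] == keep_name}
    annotations = [
        {**ann, "category_id": new_id}
        for ann in coco["annotations"]
        if ann["category_id"] in old_ids
    ]
    categories = [{"id": i, "name": name} for i, name in enumerate(thing_classes)]
    return {**coco, "annotations": annotations, "categories": categories}


# Toy annotations; the ids and boxes are invented for illustration.
toy = {
    "images": [{"id": 1, "file_name": "page.png"}],
    "categories": [{"id": 1, "name": "text"}, {"id": 2, "name": "title"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [0, 0, 50, 20]},
        {"id": 11, "image_id": 1, "category_id": 2, "bbox": [0, 30, 50, 10]},
    ],
}
single = keep_single_class(toy, "text", ["None", "text"])
```

With thing_classes = ["None", "text"], the one remaining "text" box ends up with category_id 1, which is the index the suggested class list expects.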

Prakhar2295 commented 5 months ago

Hello Sir, I have been trying to train this layout parser using your Kaggle notebook, and I want to fine-tune it only for tables. As per your answer, I tried the format [None, Table], but it shows zero instances in Table and 26 in None. Also, if I train only the TableBank model, which is for tables only, do I still have to use this format? Can you also tell me where I can download the weights for the TableBank Faster R-CNN model, and is there any link you can provide so that I can dig deeper into this? Thank you very much in advance.
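
One possible cause of the "zero in Table, 26 in None" symptom, which the thread does not confirm, is that the annotations carry category_id 0 while "Table" sits at index 1 of the class list. A quick way to check is to count annotations per class index. The sketch below assumes COCO-style annotations in a plain dict; instances_per_class is a hypothetical helper and the toy data is invented to reproduce the reported counts.

```python
from collections import Counter
from typing import Any, Dict, List


def instances_per_class(coco: Dict[str, Any],
                        thing_classes: List[str]) -> Dict[str, int]:
    """Count annotations per class name, using category_id as the list index."""
    counts = Counter(ann["category_id"] for ann in coco["annotations"])
    return {name: counts.get(i, 0) for i, name in enumerate(thing_classes)}


# Toy data: 26 table boxes that were all saved with category_id 0.
toy = {"annotations": [{"id": i, "category_id": 0} for i in range(26)]}

print(instances_per_class(toy, ["None", "Table"]))
# prints {'None': 26, 'Table': 0}
```

If the counts look like this on the real annotation file, the category ids and the class list are out of step, and either the ids or the order of thing_classes would need to change.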

Prakhar2295 commented 5 months ago

[Screenshot (2093): image upload did not complete]

Prakhar2295 commented 5 months ago

[Screenshots (2094) to (2102): images not included]