herobd / FUDGE

Code for the ICDAR2021 paper "Visual FUDGE: Form Understanding via Dynamic Graph Editing"
GNU General Public License v3.0

Linking for distant words #4

Open harshsummit opened 1 year ago

harshsummit commented 1 year ago

Hey @herobd, I was trying to extract the relations for keywords using the pretrained weights and your code.

But for distant boxes it doesn't seem to identify the link. Is there any way to solve this?

herobd commented 1 year ago

It could either be an artifact of the pretraining data (not having long relationships) or the Swin model having windowed attention. Have you tried fine-tuning on your data?

harshsummit commented 1 year ago

I couldn't figure out what the structure of my dataset should look like. Can you please help me with that? @herobd

herobd commented 1 year ago

Sorry, ignore my prior response about the Swin model attention (I thought this was an issue on Dessurt). The graph should have links that reach that far across, but it's failing to merge the two parts of the key together (e.g. "9 Add lines..." and "9"). Fine-tuning is probably still a good thing to try.

You have a few choices with the data:

  1. Make it look like the NAF or FUNSD data and use one of those dataset loaders.
  2. Write your own child class of datasets/graph_pair.py. This mostly means writing the parseAnn function.
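
For option 1, here is a minimal sketch of converting custom annotations into FUNSD-style JSON. It relies on the publicly documented FUNSD layout (a top-level "form" list whose entries carry "id", "text", "box", "label", "words" and "linking"). The function name `to_funsd_style` and the input shape (a list of entity dicts plus index pairs) are hypothetical, and which of these fields the FUDGE loader actually reads has not been checked here.

```python
import json


def to_funsd_style(entities, links):
    """Build a FUNSD-style annotation dict.

    entities: list of dicts with 'text', 'box' (x0, y0, x1, y1) and 'label'
              (FUNSD uses 'question', 'answer', 'header', 'other').
    links:    list of (from_index, to_index) pairs, e.g. key -> value.
    Both argument shapes are assumptions made for this illustration.
    """
    form = []
    for idx, ent in enumerate(entities):
        box = list(ent["box"])
        form.append({
            "id": idx,
            "text": ent["text"],
            "box": box,
            "label": ent["label"],
            # Simplification: the whole entity is stored as one "word".
            # Real FUNSD files list every word with its own box.
            "words": [{"text": ent["text"], "box": list(box)}],
            "linking": [],
        })
    # FUNSD records each link on both of the entities it connects.
    for src, dst in links:
        form[src]["linking"].append([src, dst])
        form[dst]["linking"].append([src, dst])
    return {"form": form}


if __name__ == "__main__":
    example = to_funsd_style(
        [
            {"text": "9 Add lines...", "box": (10, 50, 200, 70), "label": "question"},
            {"text": "1,234", "box": (700, 50, 760, 70), "label": "answer"},
        ],
        [(0, 1)],
    )
    print(json.dumps(example, indent=2))
```

Writing one such JSON file per image, named to match the image, mirrors how FUNSD pairs its annotations and images.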

Do you have annotations for your data?

harshsummit commented 1 year ago

Yes, I have annotations for my dataset, but I'm unable to fine-tune the model on it.

I even tried to fine-tune it on the FUNSD dataset by extracting the FUNSD.zip linked in the README and placing it inside the "data" folder.

But it throws an error saying that the value of num_classes should be more than 0, and it is currently 0.

harshsummit commented 1 year ago

Does the dataset structure need to be modified before using it for training, or can we use FUNSD directly to train with train.py?

herobd commented 1 year ago

What config are you using? And what is the exact error (including the line number)?
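
While gathering that, one quick sanity check is to confirm that the extracted annotations are readable and contain entity labels. The sketch below assumes the standard FUNSD JSON layout (a top-level "form" list whose entries have a "label"). The helper name `count_funsd_labels` and the example path are hypothetical, and this does not reproduce how FUDGE itself computes num_classes.

```python
import json
from collections import Counter
from pathlib import Path


def count_funsd_labels(annotation_dir):
    """Count entity labels across all FUNSD-style JSON files in a directory."""
    counts = Counter()
    for path in sorted(Path(annotation_dir).glob("*.json")):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        for entity in data.get("form", []):
            counts[entity.get("label", "<missing>")] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical location: adjust to wherever FUNSD.zip was extracted.
    print(count_funsd_labels("data/FUNSD/training_data/annotations"))
```

An empty result would mean no annotation files were found at that path, which points to a path or extraction problem rather than a model problem.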