philschmid / document-ai-transformers


training/donut_sroie.ipynb fix #13

Open salim-hertelli opened 1 year ago

salim-hertelli commented 1 year ago

I found your code through your blog while searching about Donut, and I like the notebook you wrote; however, I have some fixes to suggest.

The run_prediction function does not work properly: it always uses the test_sample variable defined above instead of the sample argument it receives. I suggest the following changes:

    pixel_values = torch.tensor(test_sample["pixel_values"]).unsqueeze(0) 

to

    pixel_values = torch.tensor(sample["pixel_values"]).unsqueeze(0) 

and

    target = processor.token2json(test_sample["target_sequence"]) 

to

    target = processor.token2json(sample["target_sequence"]) 

The same issue occurs in the next code block. Change:

    prediction, target = run_prediction(test_sample)

to

    prediction, target = run_prediction(sample)
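
To make the fix concrete, here is a minimal sketch of what the corrected run_prediction could look like with both substitutions applied. The generation arguments are illustrative placeholders, and model / processor are assumed to be the ones loaded earlier in the notebook:

    import torch

    def run_prediction(sample, model=model, processor=processor):
        # build encoder/decoder inputs from the sample that was passed in,
        # not from the globally defined test_sample
        pixel_values = torch.tensor(sample["pixel_values"]).unsqueeze(0)
        decoder_input_ids = processor.tokenizer(
            "<s>",  # assumed decoder start prompt; use the one from training
            add_special_tokens=False,
            return_tensors="pt",
        ).input_ids

        # autoregressive generation; max_length etc. are placeholder values
        outputs = model.generate(
            pixel_values.to(model.device),
            decoder_input_ids=decoder_input_ids.to(model.device),
            max_length=512,
            early_stopping=True,
            pad_token_id=processor.tokenizer.pad_token_id,
            eos_token_id=processor.tokenizer.eos_token_id,
            use_cache=True,
            num_beams=1,
            return_dict_in_generate=True,
        )

        # decode the generated tokens and the ground truth into JSON dicts
        prediction = processor.token2json(processor.batch_decode(outputs.sequences)[0])
        target = processor.token2json(sample["target_sequence"])
        return prediction, target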

This also makes the reported result wrong. Because of the problem mentioned above, run_prediction always returns the same prediction and target (it ignores its sample argument), so the 75% accuracy you got is effectively the accuracy for a single sample: the test sample defined earlier.
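
With the fix in place, the evaluation needs to be re-run over the whole test split so the score aggregates more than one sample. A hedged sketch, assuming processed_dataset is the processed SROIE dataset from the notebook:

    # collect a prediction/target pair for every test sample
    predictions, targets = [], []
    for sample in processed_dataset["test"]:
        prediction, target = run_prediction(sample)  # now uses `sample`, not `test_sample`
        predictions.append(prediction)
        targets.append(target)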

salim-hertelli commented 1 year ago

The logic behind the accuracy calculation is also wrong. The model can predict more fields than are in the target (or fail to predict some label from the target), which leaves the values misaligned when they are compared positionally. Example: with target {l1: v1, l2: v2, l3: v3} and prediction {l1: v1', l3: v3'}, you eventually end up comparing values that belong to different labels (e.g. v2 against v3'), which are not necessarily comparable.
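
A hedged sketch of a field-level accuracy that avoids this problem by comparing values per key instead of per position (predictions and targets are the lists collected above):

    # count a field as correct only when the prediction contains the same key
    # with the same value; extra or missing predicted fields cannot shift the
    # comparison because values are looked up by key
    correct, total = 0, 0
    for prediction, target in zip(predictions, targets):
        for key, target_value in target.items():
            total += 1
            if prediction.get(key) == target_value:
                correct += 1
    accuracy = correct / total if total else 0.0
    print(f"Field-level accuracy: {accuracy:.2%}")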