Open k920049 opened 10 months ago
Right now this is just a copy of the original dataset.
But soon we will update the test and validation splits to version 1.1, which is the version used in the paper "Aligning benchmark datasets for table structure recognition".
In v1.1, the cropped table images have 2 pixels of padding around the table border. In the original dataset (v1.0), these images have ~30 pixels of padding.
The training data/split is the same for v1.0 and v1.1. In other words, the training data still comes with ~30 pixels of padding around the cropped tables.
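For anyone who wants v1.0-style crops to look like v1.1, here is a minimal sketch of the padding arithmetic. It assumes you already have the table's bounding box in pixel coordinates of the cropped image; the helper names `tighten_crop` and `shift_bbox` are made up for illustration and are not part of the dataset tooling, and this is not necessarily how the v1.1 images were actually produced.

```python
def tighten_crop(image_size, table_bbox, padding=2):
    """Return a (left, top, right, bottom) crop box that leaves `padding`
    pixels around `table_bbox`, clamped to the image bounds.

    image_size: (width, height) of the existing cropped image.
    table_bbox: (x0, y0, x1, y1) of the table inside that image (assumed known).
    """
    width, height = image_size
    x0, y0, x1, y1 = table_bbox
    left = max(0, int(x0) - padding)
    top = max(0, int(y0) - padding)
    right = min(width, int(x1) + padding)
    bottom = min(height, int(y1) + padding)
    return left, top, right, bottom


def shift_bbox(bbox, crop_box):
    """Translate an annotation box into the coordinate frame of the new crop."""
    left, top, _, _ = crop_box
    x0, y0, x1, y1 = bbox
    return (x0 - left, y0 - top, x1 - left, y1 - top)


# Hypothetical example: a 460x260 image whose table sits 30 px from each edge.
crop_box = tighten_crop((460, 260), (30, 30, 430, 230), padding=2)
new_table_bbox = shift_bbox((30, 30, 430, 230), crop_box)
```

The resulting `crop_box` can be passed to any image library's crop function, and every annotation box (rows, columns, cells) would need the same `shift_bbox` translation so it stays aligned with the tighter image.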
Hope that helps!
Cheers, Brandon
Hello,
It seems there is an alternative download page for PubTables-1M on Hugging Face. Did you apply the canonicalization and consistency adjustments described in the paper "Aligning benchmark datasets for table structure recognition"? Or is it just a copy of the original dataset?