microsoft / table-transformer

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS evaluation metric.
MIT License

Please give a detailed example of the GriTS evaluation metric #48

Closed franztao closed 2 years ago

franztao commented 2 years ago

The code that calculates the GriTS evaluation metric is embedded in the validation code. That is KISS, but it is not convenient for re-using the metric.

franztao commented 2 years ago

Something like this would help: https://github.com/zez188/table_recognizer_zez/blob/c84511ecdffc9d54d095f5d1e844f001e08d0aea/others_code/PubTabNet/src/demo.ipynb

bsmock commented 2 years ago

Thanks for the suggestion. We're excited to see interest in making the GriTS metrics easier to re-use. We are still actively developing this repository, and adding more documentation and support for using GriTS with other models is on our current roadmap. We plan to include a function to call GriTS with HTML, just like in the example you sent. It should be ready very soon.

Cheers, Brandon

bsmock commented 2 years ago

Hi @franztao,

We pushed an update today with a new function, grits_from_html(). We still need to do more testing to make sure it is bug-free, but it works on the case in the link you sent. You can use it as follows:

import grits
true_html = "..."
pred_html = "..."
metrics = grits.grits_from_html(true_html, pred_html)
print(metrics)

For the example you linked to, I get the following output:

{
    'grits_top': 1.0,
    'grits_precision_top': 1.0,
    'grits_recall_top': 1.0,
    'grits_top_upper_bound': 1.0,
    'grits_con': 0.9670250896057349,
    'grits_precision_con': 0.9670250896057349,
    'grits_recall_con': 0.9670250896057349,
    'grits_con_upper_bound': 0.9670250896057349
}

So basically GriTS_Top = 1.0 and GriTS_Con = 0.9670.
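
For reading the dictionary above: as far as I understand the GriTS definition, each headline score is the F-score (harmonic mean) of the matching precision and recall entries. Here is a minimal sketch under that assumption. `f_score` is a hypothetical helper written for illustration, not a function from grits.py.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the output above.
metrics = {
    "grits_precision_top": 1.0,
    "grits_recall_top": 1.0,
    "grits_precision_con": 0.9670250896057349,
    "grits_recall_con": 0.9670250896057349,
}

top = f_score(metrics["grits_precision_top"], metrics["grits_recall_top"])
con = f_score(metrics["grits_precision_con"], metrics["grits_recall_con"])
print(f"GriTS_Top = {top:.4f}, GriTS_Con = {con:.4f}")
```

When precision and recall are equal, as they are here, the F-score equals that shared value, which is consistent with the reported numbers.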

Hope this helps!

Best, Brandon

franztao commented 2 years ago

The example does not include the GriTS_Loc (location) metric? Could you give a detailed example, with pictures, describing how to use these metrics?

franztao commented 2 years ago

true_html=<html><body><table><thead><tr><eb></eb><td></td><td></td></tr></thead><tbody><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr></tbody></table></body></html> 
pred_html=<html><body><table><thead><tr><eb></eb><td></td><td></td></tr></thead><tbody><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr><tr><td></td><eb></eb><eb></eb></tr><tr><td></td><td></td><td></td></tr><tr><td></td><td></td><td></td></tr></tbody></table></body></html>

The output shows a bug, as below: image

franztao commented 2 years ago

@bsmock

bsmock commented 2 years ago

Hi @franztao,

Thanks for bringing this to our attention and giving us a chance to discuss this case with you.

Running true_cells = grits.html_to_cells(true_html) produces the following list of cells parsed from the HTML:

[
    {'row_nums': [0], 'column_nums': [0], 'is_column_header': True, 'cell_text': ''},
    {'row_nums': [0], 'column_nums': [1], 'is_column_header': True, 'cell_text': ''},
    {'row_nums': [1], 'column_nums': [0], 'is_column_header': False, 'cell_text': ''},
    {'row_nums': [2], 'column_nums': [0], 'is_column_header': False, 'cell_text': ''},
    {'row_nums': [2], 'column_nums': [1], 'is_column_header': False, 'cell_text': ''},
    {'row_nums': [2], 'column_nums': [2], 'is_column_header': False, 'cell_text': ''}, 
      ...
]

As you can see, the first three rows of the parsed table all have different numbers of columns (or, different numbers of columns that are occupied by a cell). I would say it's ambiguous how to interpret such incomplete HTML as a table. The metric is not designed to handle malformed HTML, so it fails.
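
To make the "different numbers of columns" observation concrete, here is a rough sketch of how one might check a parsed cell list before scoring it. It assumes only the cell format shown above (`row_nums`, `column_nums`). The two function names are hypothetical and are not part of grits.py.

```python
from collections import defaultdict


def occupied_columns_per_row(cells):
    """Map each row index to the set of column indices covered by some cell."""
    occupied = defaultdict(set)
    for cell in cells:
        for row in cell["row_nums"]:
            occupied[row].update(cell["column_nums"])
    return dict(occupied)


def is_complete_grid(cells):
    """True if every row 0..R-1 covers exactly the columns 0..C-1."""
    occupied = occupied_columns_per_row(cells)
    if not occupied:
        return False
    num_rows = max(occupied) + 1
    num_cols = max((max(cols) for cols in occupied.values() if cols), default=-1) + 1
    full_row = set(range(num_cols))
    return all(occupied.get(row, set()) == full_row for row in range(num_rows))


# The first six cells from the parsed list above.
cells = [
    {"row_nums": [0], "column_nums": [0], "is_column_header": True, "cell_text": ""},
    {"row_nums": [0], "column_nums": [1], "is_column_header": True, "cell_text": ""},
    {"row_nums": [1], "column_nums": [0], "is_column_header": False, "cell_text": ""},
    {"row_nums": [2], "column_nums": [0], "is_column_header": False, "cell_text": ""},
    {"row_nums": [2], "column_nums": [1], "is_column_header": False, "cell_text": ""},
    {"row_nums": [2], "column_nums": [2], "is_column_header": False, "cell_text": ""},
]

print(is_complete_grid(cells))  # False
```

For these six cells, rows 0, 1 and 2 cover two, one and three columns respectively. Note that this check only looks for unoccupied positions; it does not detect overlapping cells.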

For what the metric should do when encountering incomplete/ambiguous HTML, there are a few options we could consider:

  1. Return a value of 0 for the metric
  2. Raise an exception (refuse to return a value for the metric)
  3. Skip such a case (do nothing; neither return 0 nor raise an exception)
  4. In the case of missing cells, fill in some null value for the missing cells and do the best match possible while ignoring the missing cells
  5. In the case of missing cells, treat the missing cells as implicit empty cells and add them to the table

Do you have a desired behavior for the metric in cases like this?

HTML can be malformed in other ways. In general, I'm not sure it's obvious what the "right" behavior is. If we anticipate certain kinds of malformed HTML, like in your example with missing cells, we could give the user the option to choose how they want the metric to handle it (possibly choosing among the five options above). But this also has some drawbacks.
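
As a rough illustration of what a user-selectable policy could look like, the sketch below covers options 2, 3 and 5 from the list above. Everything in it is hypothetical: `fill_missing_cells`, `handle_incomplete` and the `on_incomplete` parameter do not exist in grits.py, and the choice to mark a filled cell as a header when its row already contains a header cell is an assumption.

```python
def fill_missing_cells(cells):
    """Option 5: add an implicit empty cell at every unoccupied grid position."""
    taken = {(r, c) for cell in cells
             for r in cell["row_nums"] for c in cell["column_nums"]}
    if not taken:
        return list(cells)
    num_rows = max(r for r, _ in taken) + 1
    num_cols = max(c for _, c in taken) + 1
    # Assumption: a filled cell is a header if its row already has a header cell.
    header_rows = {r for cell in cells if cell.get("is_column_header")
                   for r in cell["row_nums"]}
    filled = list(cells)
    for r in range(num_rows):
        for c in range(num_cols):
            if (r, c) not in taken:
                filled.append({"row_nums": [r], "column_nums": [c],
                               "is_column_header": r in header_rows,
                               "cell_text": ""})
    return filled


def handle_incomplete(cells, on_incomplete="raise"):
    """Return cells ready for scoring, or None if the sample should be skipped."""
    filled = fill_missing_cells(cells)
    if len(filled) == len(cells):
        return cells  # nothing was missing
    if on_incomplete == "raise":   # option 2
        raise ValueError("table has unoccupied grid positions")
    if on_incomplete == "skip":    # option 3
        return None
    if on_incomplete == "fill":    # option 5
        return filled
    raise ValueError(f"unknown policy: {on_incomplete!r}")


# The first six cells from the example discussed in this thread.
cells = [
    {"row_nums": [0], "column_nums": [0], "is_column_header": True, "cell_text": ""},
    {"row_nums": [0], "column_nums": [1], "is_column_header": True, "cell_text": ""},
    {"row_nums": [1], "column_nums": [0], "is_column_header": False, "cell_text": ""},
    {"row_nums": [2], "column_nums": [0], "is_column_header": False, "cell_text": ""},
    {"row_nums": [2], "column_nums": [1], "is_column_header": False, "cell_text": ""},
    {"row_nums": [2], "column_nums": [2], "is_column_header": False, "cell_text": ""},
]

print(len(handle_incomplete(cells, on_incomplete="fill")))  # 9
```

One drawback this makes visible: with "fill", a prediction that silently drops cells is scored as if it had predicted empty cells, so the chosen policy changes what the metric measures.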

Best, Brandon

franztao commented 2 years ago

Hello @bsmock, thanks for your reply. I personally mistook the <eb></eb> tag for a standard HTML tag. The new test sample below returns a result without any exception. <td>1</td> represents a cell with text content (I replaced all text content with the string '1'), and <td></td> represents a cell without any content.

<html><body><table><thead><tr><td></td><td>1</td><td>1</td></tr></thead><tbody><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr></tbody></table></body></html>

<html><body><table><thead><tr><td></td><td>1</td><td>1</td></tr></thead><tbody><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td></td><td></td></tr><tr><td>1</td><td>1</td><td>1</td></tr><tr><td>1</td><td>1</td><td>1</td></tr></tbody></table></body></html>

{'grits_top': 1.0, 'grits_precision_top': 1.0, 'grits_recall_top': 1.0, 'grits_top_upper_bound': 1.0, 'grits_con': 1.0, 'grits_precision_con': 1.0, 'grits_recall_con': 1.0, 'grits_con_upper_bound': 1.0}
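
The conversion from the earlier sample to this one can be sketched roughly as below. It rests on assumptions inferred by comparing the two samples: in the earlier HTML, `<td></td>` marks a cell that has content and `<eb></eb>` marks an empty cell, and cells contain no nested tags. `normalize_structure_html` is a hypothetical helper name.

```python
import re


def normalize_structure_html(html: str) -> str:
    """Rewrite structure-only table HTML into plain <td> cells."""
    # Step 1: every <td>...</td> cell stands for a cell with content -> '1'.
    html = re.sub(r"<td>[^<]*</td>", "<td>1</td>", html)
    # Step 2: the non-standard <eb></eb> token becomes a truly empty cell.
    return html.replace("<eb></eb>", "<td></td>")


row = "<tr><eb></eb><td></td><td></td></tr>"
print(normalize_structure_html(row))
# <tr><td></td><td>1</td><td>1</td></tr>
```

The order of the two steps matters: replacing `<eb></eb>` first would turn the empty cells into `<td></td>`, which step 1 would then fill with '1'.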

Here is a picture of the bad-case HTML as rendered on the website https://verytoolz.com/html-run.html: image