julian24bas closed this issue 1 month ago
I get the same output with the FinTabNet model, and sometimes an encoding error, as shown below. Did you find a solution, or is this just what the model gives and nothing can be done about it?
Traceback (most recent call last):
  File "demo.py", line 412, in <module>
    f.write(pred_html + '\n')
  File "C:\Users\amine\anaconda3\envs\myenv\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0301' in position 338: character maps to <undefined>
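For anyone hitting the same UnicodeEncodeError: the traceback shows the write going through cp1252, the default text encoding on many Windows setups, which has no mapping for the combining accent '\u0301'. A minimal sketch of a workaround, assuming the output file in demo.py is opened without an explicit encoding (not verified against the script), is to pass encoding="utf-8" to open(). The write_html helper and the sample string below are hypothetical and only illustrate the idea:

```python
import os
import tempfile


def write_html(path, pred_html):
    """Write predicted HTML as UTF-8, independent of the platform default."""
    # Passing encoding explicitly avoids falling back to cp1252 on Windows.
    with open(path, "w", encoding="utf-8") as f:
        f.write(pred_html + "\n")


# Hypothetical prediction containing the combining acute accent U+0301.
sample_html = "<td>Cre\u0301dit</td>"

out_path = os.path.join(tempfile.mkdtemp(), "pred.html")
write_html(out_path, sample_html)

# Read the file back with the same encoding to confirm the round trip.
with open(out_path, encoding="utf-8") as f:
    round_trip = f.read()
```

The file then has to be read as UTF-8 as well, otherwise the accented characters will show up garbled.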
If I remember correctly, changing the config argument on line 339 of the demo.py script to
./configs/textrecog/master/table_master_ResnetExtract_Ranger_0705_FinTabNet_cell150_batch4.py
helped get rid of the error. The result was still worse than with the PubTabNet model.
I also tested it and the result is not good. @namtuanly could you please let us know?
Hi all, the demo file for FinTabNet is different from the one for PubTabNet. I have committed the demo file for FinTabNet (demo_FinTabNet.py). Could you try it again?
Hi, thanks for your effort!
I just tried and got two small errors:
  File "/MTL-TabNet/table_recognition/demo/demo_FinTabNet.py", line 343, in <module>
    tablemaster_inference.print_num_params()
AttributeError: 'Structure_Recognition' object has no attribute 'print_num_params'
I just removed that line, as it seems to be for debugging.
The deal_bb() function was missing an argument. I figured this might fix it?
def deal_bb(result_token, search_token):
    """
    In our opinion, <b></b> always occurs in <thead></thead> text's context.
    This function will find out all tokens in <thead></thead> and insert <b></b> by manual.
    :param result_token:
    :return:
    """
    # find out <thead></thead> parts.
    thead_pattern = f'<{search_token}>(.*?)</{search_token}>'
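To illustrate what the search_token argument changes, here is a small sketch of how a pattern built this way matches a section of the predicted HTML. It is not the rest of deal_bb(), which is not shown in this thread; the find_sections helper, the sample string, and the use of re.findall with re.DOTALL are assumptions made for the example:

```python
import re


def find_sections(html, search_token):
    """Return the inner text of every <search_token>...</search_token> block."""
    # Same pattern shape as in the snippet above; the non-greedy (.*?)
    # stops at the first matching closing tag.
    pattern = f'<{search_token}>(.*?)</{search_token}>'
    # re.DOTALL (an assumption here) lets the match span line breaks.
    return re.findall(pattern, html, flags=re.DOTALL)


# Hypothetical predicted structure with a header and a body section.
sample = "<thead><tr><td>Year</td></tr></thead><tbody><tr><td>2020</td></tr></tbody>"

header_parts = find_sections(sample, "thead")
body_parts = find_sections(sample, "tbody")
```

With the tag name passed in as an argument, the same function can be pointed at a section other than thead without editing the pattern.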
The results for the sample image are better now:
Hello, @julian24bas.
Unfortunately, your image is not visible.
I've got the following image using FinTabNet/epoch_17.pth:
@namtuanly, thank you for your work! We need a vertical flip for the FinTabNet dataset, because the generated labels are vertically flipped.
I used your demo script with the pretrained PubTabNet model with no issues. However, using the FinTabNet model I get the following output:
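A rough sketch of what such a flip could look like on the label side, assuming the labels are cell bounding boxes in [x0, y0, x1, y1] form whose y axis runs in the opposite direction to the image's. Neither the box format nor where the flip belongs in the pipeline is confirmed in this thread, and flip_bbox_vertically is a hypothetical helper:

```python
def flip_bbox_vertically(bbox, image_height):
    """Mirror a [x0, y0, x1, y1] box around the horizontal midline of the image."""
    x0, y0, x1, y1 = bbox
    # The old bottom edge becomes the new top edge, so y0 and y1 swap roles
    # and the result still satisfies new_y0 <= new_y1.
    return [x0, image_height - y1, x1, image_height - y0]


# Hypothetical cell box on an image that is 200 pixels tall.
cell = [10, 20, 110, 50]
flipped = flip_bbox_vertically(cell, image_height=200)
print(flipped)  # [10, 150, 110, 180]
```

Applying the function twice gives back the original box, which is a quick way to sanity-check converted labels.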