Open isarker opened 1 year ago
Hi,
First, let me say that I understand the request for better documentation for a broader audience. This repository has been intended mostly for other ML researchers, so that they can reproduce our research. We rely on our research papers as the primary source of documentation and assume our users have read them.
The error you're seeing in main.py is because the train folder is empty. There need to be annotations in XML format in that folder.
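A quick way to confirm this before launching training is to check the folder directly. The following is a minimal sketch, not part of the repository: the helper name `find_annotation_problems` is hypothetical, and it only assumes what is stated above, namely that the train folder must contain annotation files in XML format. It does not validate the annotation schema that main.py expects.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_annotation_problems(train_dir):
    """Return a list of human-readable problems found in train_dir.

    Hypothetical helper: checks only that the folder exists, that it
    holds at least one .xml file, and that each .xml file is well-formed.
    """
    train_path = Path(train_dir)
    if not train_path.is_dir():
        return [f"{train_path} is not a directory"]

    # An empty folder (or one with no .xml files) is the case described above.
    xml_files = sorted(train_path.glob("*.xml"))
    if not xml_files:
        return [f"{train_path} contains no .xml annotation files"]

    problems = []
    for xml_file in xml_files:
        try:
            # Parsing catches truncated or otherwise malformed files.
            ET.parse(str(xml_file))
        except ET.ParseError as exc:
            problems.append(f"{xml_file.name} is not well-formed XML: {exc}")
    return problems
```

Calling `find_annotation_problems` on the train folder returns an empty list when every annotation file parses, and a list of messages otherwise. Substitute the actual path of your own train folder.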
The performance you're seeing at inference is due to two things.
Hope this is helpful.
Best, Brandon
Hello all:
I have been trying to use the Microsoft Table Transformer (as published on GitHub) to detect and extract tables and cells in TIFF files, along with the text inside the cells. For that, we used a machine with the following config:
Using Docker (with a Dockerfile, etc.), we created a container in which we set up the Table Transformer. When we run inference.py, cells, spanning cells, and tables are detected, BUT many of the detections are erroneous. Sharing an input file and the detected cells here.
We then looked into training the model on the types of tables that we expect to encounter. For that, we ran main.py with the following command.
We are encountering the following error.
We have run out of ideas for getting main.py to work. The following is the directory structure that we have:
If someone can guide us on how to either get inference.py to detect more accurately OR get main.py to train using the directory structure I have mentioned, we would be grateful.
Also, I would like to mention that if the documentation were simpler and richer, it would be helpful to a vastly larger number of people. (This is a suggestion.) The current documentation (for example, the steps to take when errors are encountered) is possibly inadequate for people like me (4 weeks ago, I was a ZERO at Python programming, AI, GitHub, etc. :) )