Open tomoohive opened 3 years ago
In WYGIWYS, the RNN is applied to the CNN output to capture longer spatial dependencies. You can apply the RNN to each column of the CNN output.
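For illustration, here is a minimal PyTorch sketch of running an RNN over each column of a CNN feature map. It is not the code used for the WYGIWYS experiments: the class name `ColumnRNNEncoder`, the choice of a bidirectional LSTM, and the `(batch, channels, height, width)` layout are all assumptions, and where it plugs into the tutorial depends on how that encoder is written.

```python
import torch
import torch.nn as nn


class ColumnRNNEncoder(nn.Module):
    """Hypothetical module: runs a bidirectional LSTM down each column of a CNN feature map."""

    def __init__(self, in_channels: int, hidden_size: int):
        super().__init__()
        # Assumption: a bidirectional LSTM; the thread only says "RNN".
        self.rnn = nn.LSTM(in_channels, hidden_size,
                           batch_first=True, bidirectional=True)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, channels, height, width), the CNN output
        b, c, h, w = features.shape
        # Treat every column as a sequence of `h` vectors of size `c`.
        cols = features.permute(0, 3, 2, 1).reshape(b * w, h, c)
        out, _ = self.rnn(cols)                    # (b * w, h, 2 * hidden)
        # Restore the spatial grid: (batch, height, width, 2 * hidden).
        out = out.reshape(b, w, h, -1).permute(0, 2, 1, 3)
        # Flatten the grid so an attention decoder can attend over h * w positions.
        return out.reshape(b, h * w, -1)


if __name__ == "__main__":
    encoder = ColumnRNNEncoder(in_channels=8, hidden_size=16)
    dummy = torch.randn(2, 8, 4, 5)                # stand-in for a CNN feature map
    print(encoder(dummy).shape)
```

In a typical attention-based captioning setup, a module like this would sit between the CNN encoder and the attention decoder, so the decoder attends over the RNN outputs instead of the raw CNN features.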
Not sure if I did this right, but I ran a small experiment with WYGIWYS using this implementation. I used the format_html function available in exploring_PubTabNet_dataset.ipynb to create a label file for a few hundred sample images and trained for 2 epochs. The results looked like <html> <th> UNK </th> UNK UNK </html>, where UNK is the out-of-vocabulary token. Maybe this happened because I used the vocab from the PubTabNet dataset, which is basically characters, while WYGIWYS expects word tokens in the vocab. If you create a vocab of words and use this function to generate label files with proper spacing for each image, maybe this can work.
@nishchay47b When I trained WYGIWYS, I used character-level tokenization, where each HTML tag is a single token.
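A rough sketch of that kind of tokenization, using only the Python standard library. The regex and the `tokenize` helper are illustrative assumptions, not the preprocessing actually used in the experiments: every complete HTML tag becomes one token and everything else is split into single characters.

```python
import re

# A complete tag such as <td> or </td> is one token; any other character is its own token.
TAG_OR_CHAR = re.compile(r"</?[a-zA-Z][^<>]*>|.", re.DOTALL)


def tokenize(html: str) -> list:
    """Split an HTML string into tag tokens and single-character tokens."""
    return TAG_OR_CHAR.findall(html)


if __name__ == "__main__":
    print(tokenize("<td>ab</td>"))  # ['<td>', 'a', 'b', '</td>']
```

Joining the tokens with spaces would give one line per image in a label file, which may avoid the UNK outputs if the vocab is built from the same tokens.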
I'd like to implement WYGIWYS before implementing EDD. I saw this issue comment (https://github.com/ibm-aur-nlp/PubTabNet/issues/6#issuecomment-630506737) and I'd like to know more details about the WYGIWYS model.
Where should I add the RNN in the original tutorial source? Also, what role does that RNN play?
My understanding is that the RNN predicts the structure of the table. Is this correct?