Closed by Akhilesh64 6 months ago
@NielsRogge I was able to run training for information extraction by following a training regime similar to Pix2Struct's. The only problem I am facing is that the GPU frequently runs out of memory (OOM). I was trying to take advantage of a multi-GPU setup, as the larger model doesn't fit on a single GPU. I was hoping you could prepare a training script that uses the Hugging Face Trainer instead of PyTorch Lightning, since the Trainer gives more flexibility and ease of use in such scenarios.
Hi,
I just uploaded a new notebook here: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/UDOP/Fine_tune_UDOP_on_a_custom_dataset_(JSON_extraction).ipynb. It works for me in Google Colab (I'm using a T4 GPU with the high-RAM setting).
Hey @NielsRogge
Just saw the tutorial for UDOP on the RVL-CDIP dataset. It looked great. I was wondering if you could prepare a similar tutorial on extracting information from documents in a structured format like JSON for the key fields, as you mentioned in the UDOP tutorial. It would be a huge help.
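For anyone reading along and wondering what "JSON for the key fields" could look like as a training target for a text-generating model: below is a minimal sketch, assuming a Donut-style flattening where each key becomes an opening and closing tag around its value. The helper name `json_to_sequence`, the `<s_...>` tag format, and the `<sep/>` separator are illustrative assumptions, not something taken from the UDOP tutorial or notebook.

```python
def json_to_sequence(obj):
    """Flatten a nested JSON-like object into one tagged string.

    Hypothetical helper for illustration; the tag format is an assumption.
    """
    if isinstance(obj, dict):
        # Wrap every value in a tag pair named after its key.
        return "".join(
            f"<s_{key}>{json_to_sequence(value)}</s_{key}>"
            for key, value in obj.items()
        )
    if isinstance(obj, list):
        # Join list items with a dedicated separator tag.
        return "<sep/>".join(json_to_sequence(item) for item in obj)
    # Leaves (strings, numbers) are emitted as plain text.
    return str(obj)


# Made-up key fields for a single document.
example = {"invoice_no": "A-17", "total": "42.00"}
target = json_to_sequence(example)
print(target)
```

The resulting string could then be tokenized as the label sequence for a document, and the model's generated text parsed back into JSON by reversing the tags. Whether the tags should also be added to the tokenizer as special tokens depends on the model and is left open here.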