DocQuery has difficulty pulling concepts from different parts of a document

This is an excellent example. The current models that DocQuery comes with (both LayoutLM and Donut) are designed specifically for questions with "short" answers, where the answer is assumed to be a single consecutive span of text. To be good at a task like this, which needs to combine pieces from different parts of a document, I think we'd need to train the models quite differently.

Could you share more details about the use case? I can either recommend some other models to look into (e.g. using NER to classify all of the animals mentioned in your documents) or we can keep this on the back burner as something to look at on the modeling side.
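
To make the NER suggestion a bit more concrete, here is a minimal sketch of the idea: tag every animal mention across all pages and collect them, instead of asking for one consecutive answer span. This is not DocQuery's API. The `ANIMALS` vocabulary, the `find_animal_mentions` helper, and the sample pages are all hypothetical, and a plain word lookup stands in for a trained NER model.

```python
import re
from collections import defaultdict

# Hypothetical vocabulary; a trained NER model would replace this lookup.
ANIMALS = {"cat", "dog", "horse", "owl"}


def find_animal_mentions(pages):
    """Map each known animal to the 1-based page numbers where it appears."""
    mentions = defaultdict(list)
    for page_number, text in enumerate(pages, start=1):
        for token in re.findall(r"[a-z]+", text.lower()):
            # Naive singularization so that "dogs" matches "dog".
            if token.endswith("s") and token[:-1] in ANIMALS:
                token = token[:-1]
            if token in ANIMALS and page_number not in mentions[token]:
                mentions[token].append(page_number)
    return dict(mentions)


# Made-up pages standing in for the text extracted from a document.
pages = [
    "The farm keeps two dogs and a horse.",
    "An owl nests in the barn, near the horse stalls.",
]
print(find_animal_mentions(pages))
```

Because every mention is tagged independently, results from different pages can be merged afterwards, which is the part a single-span question answering model can't do on its own.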

impira/docquery, issue #24