Closed lolipopshock closed 2 years ago
The current batching function is tested via:
import vila
import layoutparser as lp # For visualization
from vila.pdftools.pdf_extractor import PDFExtractor
from vila.predictors import HierarchicalPDFPredictor, LayoutIndicatorPDFPredictor
pdf_extractor = PDFExtractor("pdfplumber")
page_tokens, page_images = pdf_extractor.load_tokens_and_image("test.pdf")
vision_model = lp.EfficientDetLayoutModel("lp://PubLayNet")
pdf_predictor = LayoutIndicatorPDFPredictor.from_pretrained("allenai/ivila-block-layoutlm-finetuned-docbank")
for idx, page_token in enumerate(page_tokens):
    blocks = vision_model.detect(page_images[idx])
    page_token.annotate(blocks=blocks)
    pdf_data = page_token.to_pagedata().to_dict()
    predicted_tokens = pdf_predictor.predict(pdf_data, page_token.page_size)
    predicted_tokens2 = pdf_predictor.predict_page(pdf_data, page_token.page_size, 1)
    assert predicted_tokens == predicted_tokens2
This PR introduces a new function in the VILA predictors, predict_page, that allows setting the maximum batch size for running the model. This can be used to control memory usage when using the VILA models.
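For readers unfamiliar with the idea, here is a minimal, library-independent sketch of what capping the batch size generally looks like. It is not the VILA implementation: `chunked`, `predict_in_batches`, and `toy_model` are hypothetical names used only for illustration. The inputs are split into chunks of at most `batch_size` items, the model runs on one chunk at a time, and the outputs are concatenated in order, so the result should match a single unbatched call.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], batch_size: int) -> List[Sequence[T]]:
    """Split items into consecutive chunks of at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def predict_in_batches(
    run_model: Callable[[Sequence[T]], List[R]],
    inputs: Sequence[T],
    batch_size: int,
) -> List[R]:
    """Run run_model chunk by chunk and concatenate the outputs in order."""
    outputs: List[R] = []
    for batch in chunked(inputs, batch_size):
        # Only one chunk is passed to the model at a time, which bounds
        # the peak memory needed for a single forward pass.
        outputs.extend(run_model(batch))
    return outputs


def toy_model(batch: Sequence[str]) -> List[int]:
    """Hypothetical stand-in for a real model: one output per input."""
    return [len(text) for text in batch]


inputs = ["a", "bb", "ccc", "dddd", "eeeee"]
full = toy_model(inputs)
batched = predict_in_batches(toy_model, inputs, batch_size=2)

# Same check as in the PR's test: batched and unbatched results agree.
assert batched == full
```

A smaller batch size trades throughput for lower peak memory; a batch size of 1, as in the test above, is the most conservative setting.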