Closed estebance closed 1 year ago
Hi, I just want to clarify the purpose of this block:
    for row in pdf:
        if len(row['text']) < 30:
            continue
        filtered_pdf.append(row)
Why is the threshold 30 characters?
I'd like to contribute to the project, but first I need to understand a bit more about the implementation.
Hi, sorry for the late response. The 30-character threshold is there to ignore subheadings, image captions, and other tiny pieces of text that may not be relevant.
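For anyone who wants to experiment with the cutoff, here is a minimal sketch of the same filter with the threshold pulled out into a parameter. The names `MIN_TEXT_LENGTH` and `filter_short_rows` and the sample rows are hypothetical, for illustration only, and are not part of the project's code. It assumes each row is a dict with a `'text'` key, as in the snippet above.

```python
# Hypothetical constant; 30 is the value used in the original snippet.
MIN_TEXT_LENGTH = 30


def filter_short_rows(rows, min_length=MIN_TEXT_LENGTH):
    """Keep only rows whose 'text' has at least `min_length` characters.

    Shorter rows are assumed to be subheadings, captions, or other
    fragments that are unlikely to be relevant.
    """
    return [row for row in rows if len(row["text"]) >= min_length]


# Made-up sample rows, for illustration only.
pdf = [
    {"text": "Figure 1: Overview"},  # short caption, dropped
    {"text": "This paragraph is long enough to be kept by the filter."},
]
filtered_pdf = filter_short_rows(pdf)
```

With the threshold as a parameter, trying a different cutoff (for example `filter_short_rows(pdf, min_length=50)`) does not require editing the loop itself.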