#501 Added the PDFQueryParser feature, along with its component file and test file.

Rexon-Pambujya commented 1 week ago

Link to Issue: #501

PDFQueryParser.py

Using PDFQuery, extract text from PDF files

[x] pkgs\community\swarmauri_community\parsers\concrete\PDFQueryParser.py: Implements the PDFQueryParser class, which parses PDF documents and reads their text content.
[x] pkgs\community\tests\unit\parsers\PDFQueryParser_test.py: Unit tests for the PDFQueryParser class. They validate various input types and check that the extracted text is accurate.
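
For reviewers, here is a minimal sketch of the extraction approach described above. It is not the code from this PR. It assumes the `pdfquery` package's `PDFQuery(path)`, `load()`, and `pq(selector)` calls; the function name `extract_text_lines`, the `open_pdf` parameter, and the choice of selector are hypothetical and only there for illustration.

```python
from typing import Callable, List, Optional


def extract_text_lines(pdf_path: str, open_pdf: Optional[Callable] = None) -> List[str]:
    """Return the non-empty text fragments found in a PDF, in document order.

    `open_pdf` is a hypothetical injection point so the function can be
    exercised without a real PDF; by default it falls back to pdfquery.
    """
    if open_pdf is None:
        # Imported lazily so the module can be loaded without pdfquery installed.
        import pdfquery

        open_pdf = pdfquery.PDFQuery

    pdf = open_pdf(pdf_path)
    pdf.load()  # parse the PDF into pdfquery's element tree

    # Assumption: horizontal text lines and boxes carry the readable text.
    elements = pdf.pq("LTTextLineHorizontal, LTTextBoxHorizontal")

    # Each matched element exposes its text via `.text`; skip empty ones.
    return [el.text.strip() for el in elements if el.text and el.text.strip()]
```

A parser class could wrap this and turn each returned string into a document object. Keeping the loader injectable also lets the unit tests run against a stub instead of a fixture PDF.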

I kindly ask the maintainers to review my code and point out any mistakes. Thank you!

cobycloud commented 3 days ago

it does not have any dependencies

cobycloud commented 3 days ago

pdfquery appears to be more outdated than pdfminer. Additionally, pdfminer appears to have the larger following.

cobycloud commented 3 days ago

pdfquery appears to be more lightweight than pdfminer, in terms of file count and complexity of composition.