Open avalon1610 opened 1 year ago
Hey, thanks! Do you want to send a PR or should I do it?
Also: are you using this for anything? Do you think this needs a better API or an updated poppler, or is it good as is?
I'm sorry that my working environment is not convenient for submitting PR.
In fact, I unexpectedly ran into several problems when using this library on Windows to search for specific text in PDFs. I would like to share my experience in case it helps improve this project:
- [`static-poppler`] Building poppler on Windows is hard; it needs several other dependencies, so I fell back to a prebuilt poppler. It would be nicer to include all the dependencies in this project.
[env]
PKG_CONFIG_PATH = {value = "poppler/lib/pkgconfig", relative = true}
- Add `/std:c++17` to build.rs when building. (I'm using VS2022 17.4.3)
let callpoppler = build.flag("/std:c++17").cpp(true).file("src/callpoppler.cc");
- `pdftotext_print_with_layout`. Actually I modified several places: changed all heap allocations to local variables and added `page->decRefCnt()`
GooString inputPdf(filename);
std::unique_ptr<PDFDoc> doc = PDFDocFactory().createPDFDoc(inputPdf, {}, {});
if (!doc->isOk()) {
return CouldntReadPdf;
}
int lastPage = doc->getNumPages();
for (int pageNum = 1; pageNum <= lastPage; pageNum++) {
    newpage_f(stream, pageNum);
    TextOutputDev textOut(nullptr, true, 0.0, false, false, false);
    if (!textOut.isOk()) {
        return CouldntOutput;
    }
    textOut.setTextEOL(eolUnix);
    doc->displayPage(&textOut, pageNum, 72.0, 72.0, 0, true, false, false);
    // takeText() hands the page to the caller, so release it after dumping.
    TextPage *page = textOut.takeText();
    page->dump(stream, output_f, true, eolUnix, false);
    page->decRefCnt();
}
return NoError;
- add return code check in `pdftotext_layout` in lib.rs
```rust
let code = unsafe { pdftotext_print_with_layout( ... ) };
match code {
    ResultCode::NoError => Ok(vec),
    ResultCode::InternalError => Err(Error::InternalError),
    ResultCode::CouldntReadPdf => Err(Error::CouldntReadPdf),
    ResultCode::CouldntOutput => Err(Error::CouldntOutput),
}
```
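For reference, the return-code check can be sketched as a standalone helper. This is only a sketch under assumptions: the variant names come from the snippet in this thread, but the numeric values, the `#[repr(C)]` layout, the `Error` type, and the helper name `check` are hypothetical and may not match the crate.

```rust
/// Status codes returned by the C++ side.
/// Assumption: the numeric values and `#[repr(C)]` layout are illustrative only.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResultCode {
    NoError = 0,
    InternalError = 1,
    CouldntReadPdf = 2,
    CouldntOutput = 3,
}

/// Hypothetical Rust-side error type mirroring the failure codes.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    InternalError,
    CouldntReadPdf,
    CouldntOutput,
}

impl std::fmt::Display for Error {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        let msg = match self {
            Error::InternalError => "internal error",
            Error::CouldntReadPdf => "couldn't read PDF",
            Error::CouldntOutput => "couldn't produce output",
        };
        f.write_str(msg)
    }
}

impl std::error::Error for Error {}

/// Turn a status code plus the collected bytes into a `Result`.
/// `check` is a hypothetical helper name.
pub fn check(code: ResultCode, output: Vec<u8>) -> Result<Vec<u8>, Error> {
    match code {
        ResultCode::NoError => Ok(output),
        ResultCode::InternalError => Err(Error::InternalError),
        ResultCode::CouldntReadPdf => Err(Error::CouldntReadPdf),
        ResultCode::CouldntOutput => Err(Error::CouldntOutput),
    }
}
```

One design note: receiving a Rust enum directly from C is only sound if the C++ side never returns a value outside the declared variants. A more defensive variant would receive a plain integer and convert it explicitly.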
Oh.. now that's interesting! I've never thought this could possibly run on Windows, but it's nice to see that with some tweaks it does!
I think for convenience it should bundle not only poppler but all its dependencies; building on Windows should just work on either mingw or msvc. And maybe bundle poppler's official prebuilt libraries too, under a different feature flag.
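One detail that matters for supporting both toolchains: `/std:c++17` is MSVC syntax, while GCC/Clang-style drivers (including mingw) spell it `-std=c++17`. A minimal sketch of picking the flag in a build script follows. The helper names `cpp17_flag` and `cpp17_flag_from_env` are hypothetical; `CARGO_CFG_TARGET_ENV` is the variable Cargo sets for build scripts.

```rust
/// Pick the compiler flag that selects C++17 for a given `target_env` value.
/// MSVC uses `/std:c++17`; GCC/Clang-style drivers (including mingw) use `-std=c++17`.
/// `cpp17_flag` is a hypothetical helper name.
fn cpp17_flag(target_env: &str) -> &'static str {
    if target_env == "msvc" {
        "/std:c++17"
    } else {
        "-std=c++17"
    }
}

/// Read the target environment the way a build script would.
/// Cargo exposes it to build scripts as `CARGO_CFG_TARGET_ENV`
/// (for example "msvc" or "gnu"); outside a build script it may be unset,
/// in which case this falls back to the GCC/Clang spelling.
fn cpp17_flag_from_env() -> &'static str {
    let target_env = std::env::var("CARGO_CFG_TARGET_ENV").unwrap_or_default();
    cpp17_flag(&target_env)
}
```

In build.rs this could then replace the hard-coded literal, e.g. `build.flag(cpp17_flag_from_env())`.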
> I'm sorry that my working environment is not convenient for submitting PR.
Do you mean that your employer hasn't cleared your code for submitting it upstream? Or is it more that it's hard to disentangle those fixes from other, unrelated commits?
Unfortunately it's a bit hard for me to make this PR because I don't run Windows (also I don't use this anymore 😅). But Windows aside, that's quite a few bugfixes, thanks!
`takeText()` takes ownership of the page, so we need to call `page->decRefCnt()` after `page->dump()`.
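The same ownership rule can be modelled on the Rust side with a drop guard, so the release cannot be forgotten on an early return. The sketch below is a mock only: `MockPage`, `PageGuard`, `take_text`, and `dump_one_page` are hypothetical names, and the counter stands in for poppler's real reference count rather than calling into it.

```rust
use std::cell::Cell;

/// Stand-in for a manually reference-counted object such as poppler's `TextPage`.
/// Assumption: this is a mock; it does not call into poppler.
struct MockPage<'a> {
    refs: &'a Cell<u32>,
}

impl MockPage<'_> {
    /// Mock of `decRefCnt()`: give up one reference.
    fn dec_ref_cnt(&self) {
        self.refs.set(self.refs.get() - 1);
    }

    /// Mock of `dump()`: return some fixed text.
    fn dump(&self) -> &'static str {
        "page text"
    }
}

/// Guard that releases the reference when it goes out of scope.
struct PageGuard<'a>(MockPage<'a>);

impl Drop for PageGuard<'_> {
    fn drop(&mut self) {
        self.0.dec_ref_cnt();
    }
}

/// Mock of `takeText()`: the caller receives a page and owns one reference to it.
fn take_text(refs: &Cell<u32>) -> PageGuard<'_> {
    refs.set(refs.get() + 1);
    PageGuard(MockPage { refs })
}

/// Take the page, dump it, and let the guard release it afterwards.
fn dump_one_page(refs: &Cell<u32>) -> &'static str {
    let page = take_text(refs);
    page.0.dump()
    // `page` is dropped here, after `dump()`, which calls `dec_ref_cnt()`.
}
```

With this shape, the reference count returns to its starting value after every call, including calls that return early with an error.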