jrmuizel / pdf-extract

A Rust library for extracting content from PDFs

FR: Make the HTML output buffer string available #78

Closed annie444 closed 5 months ago

annie444 commented 6 months ago

I'm using this in a text editing app using Tauri, and I need access to the raw HTML string.

jrmuizel commented 5 months ago

Can you elaborate on what you're looking for?

annie444 commented 5 months ago

Closing this issue because the new method on the HTMLOutput struct accepts any implementation of the Write trait, which makes it easy to capture the output as a String if you want to.

I was confused by the argument being named file but that's neither here nor there. Just my own stupidity.

For anyone in the future who wants to convert a PDF to an HTML string to send to a front end, this issue should answer that question. Specifically, you can pass a Vec<u8> as the writer to collect the bytes, then call std::str::from_utf8() on the result of Vec::as_slice() to get a &str (or call String::from_utf8() on the Vec itself if you need an owned String).
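A minimal sketch of that pattern, using only the standard library. The `render_html` function is a hypothetical stand-in for whatever writes HTML into a `Write` implementation; with the real crate you would hand `&mut buf` to `HTMLOutput::new` instead, as described above. The function names and the sample markup are assumptions made for illustration.

```rust
use std::error::Error;
use std::io::{self, Write};

// Hypothetical stand-in for the library's HTML writer. Like the `new`
// method discussed in this thread, it accepts anything implementing `Write`.
fn render_html(out: &mut dyn Write) -> io::Result<()> {
    write!(out, "<p>{}</p>", "hello")
}

// Capture the written bytes in memory and convert them to a String.
fn html_to_string() -> Result<String, Box<dyn Error>> {
    // Vec<u8> implements `Write`, so it can replace a file handle.
    let mut buf: Vec<u8> = Vec::new();
    render_html(&mut buf)?;

    // Validate the bytes as UTF-8 and borrow them as a &str.
    let text: &str = std::str::from_utf8(buf.as_slice())?;

    // Copy into an owned String so it can outlive the buffer.
    // `String::from_utf8(buf)` would do the same without the copy.
    Ok(text.to_owned())
}
```

Calling `html_to_string()` returns the captured markup as a `String`, which can then be returned from a command handler or serialized for the front end. If the bytes are not valid UTF-8, the conversion returns an error rather than panicking.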

More info here on Stack Overflow.