jrmuizel / pdf-extract

A Rust library for extracting content from PDFs
364 stars, 73 forks

Text result split by spacing #84

Closed: frankvgompel closed this issue 3 months ago

frankvgompel commented 3 months ago

This is the same as #79, so I don't know why that was closed. I used the same PDF ("2005_BBC_strike.pdf") that was posted in issue #79. The extracted text comes out split by spacing:

tha t the B B C m is re pre s e nte d the gove rnm e nt's pos ition on the w a r a nd e nga ge d in s loppy re porting.

D e s pite c onta ining va lid c ritic is m of the B B C , the re port w a s vie w e d by othe r s e c tions of the m e dia a s a n

" e s ta blis hm e nt w hite w a s h" [4 ]

frankvgompel commented 3 months ago

Sorry, I didn't realize you hadn't pushed that version to crates.io yet. It works correctly when I add the repo as a git dependency.
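
For anyone who wants to confirm that an upgrade really fixed their output, here is a minimal sketch of a check for the symptom reported above. It is a rough heuristic and not part of pdf-extract: the helper names `single_letter_ratio` and `looks_fragmented` and the 0.25 threshold are assumptions made for illustration. It counts how many whitespace-separated tokens are a single letter, which is unusually high when words have been split by spacing.

```rust
/// Fraction of whitespace-separated tokens that consist of exactly one
/// alphabetic character. Hypothetical helper, not part of pdf-extract.
fn single_letter_ratio(text: &str) -> f64 {
    let mut total = 0usize;
    let mut single = 0usize;
    for token in text.split_whitespace() {
        total += 1;
        let mut chars = token.chars();
        // A token counts as "single" if it has one char and that char is a letter.
        if let (Some(c), None) = (chars.next(), chars.next()) {
            if c.is_alphabetic() {
                single += 1;
            }
        }
    }
    if total == 0 {
        0.0
    } else {
        single as f64 / total as f64
    }
}

/// Flags text whose single-letter token ratio exceeds an arbitrary threshold.
/// The 0.25 cutoff is an assumption; tune it for your own documents.
fn looks_fragmented(text: &str) -> bool {
    single_letter_ratio(text) > 0.25
}

fn main() {
    // Sample taken from the output quoted in this issue.
    let broken = "D e s pite c onta ining va lid c ritic is m of the B B C";
    // The same phrase as it should read after the fix.
    let fixed = "Despite containing valid criticism of the BBC";

    println!("broken ratio: {:.2}", single_letter_ratio(broken));
    println!("fixed ratio:  {:.2}", single_letter_ratio(fixed));

    assert!(looks_fragmented(broken));
    assert!(!looks_fragmented(fixed));
}
```

In practice the input string would be whatever text the library extracted from the PDF. Ordinary English contains few one-letter words, so a high ratio is a reasonable hint that the spacing problem is present.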

jrmuizel commented 3 months ago

I've now published v0.7.5 with these changes.
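
The two ways of picking up the fix mentioned in this thread can be sketched as a `Cargo.toml` fragment. The repository URL is an assumption inferred from the repository name at the top of the page, and only one of the two entries should be active at a time.

```toml
[dependencies]
# Option 1: the released crate from crates.io (v0.7.5 includes the changes).
pdf-extract = "0.7.5"

# Option 2 (use instead of option 1): track the git repository directly.
# The URL below is an assumption based on the repository name.
# pdf-extract = { git = "https://github.com/jrmuizel/pdf-extract" }
```

After changing the dependency, running `cargo update -p pdf-extract` should refresh the lockfile so the new version is actually used.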