jcushman / pdfquery

A fast and friendly PDF scraping library.

MIT License

772 stars 89 forks

issues

TypeError: 'PDFObjRef' object is not subscriptable

#92 sgpinkus opened 5 months ago
0
NEXTPY-569 -- Make pdfquery compatible with Python 3.9 and 3.11

#91 kdleijer opened 1 year ago
0
AttributeError: module 'pdfquery' has no attribute 'PDFQuery'

#90 Alias4D closed 1 year ago
0
Support for password protected pdf files

#89 nvkex opened 1 year ago
0
Removed python2 requirements

#88 panoshalios closed 1 year ago
0
Test compatibility with Python 3.5, 3.6, 3.7

#87 kdleijer closed 3 years ago
0
Python 2 dependency problem: pyquery

#86 cyranix opened 3 years ago
0
Improve performance on large pdfs

#85 OpenGLShaders opened 3 years ago
1
Remove reference to deprecated easy_install.

#84 jaraco closed 10 months ago
0
Is this project still alive?

#83 MartinThoma opened 4 years ago
3
Coordinates to locator

#82 prashantgpt91 opened 4 years ago
0
Not able to detect horizontal lines properly.

#81 cabudies opened 4 years ago
0
recommend you use pdfminer rather than pdfquery

#80 jstofel opened 4 years ago
1
can't concat str to bytes EASY FIX -- please update!

#79 jstofel opened 4 years ago
3
cache collision

#78 patxoca opened 4 years ago
1
Updated code formatting in readme.rst

#77 gcrowder closed 4 years ago
1
Extract all words with their coordinates.

#76 infoankit10 opened 5 years ago
0
windows only: pdfquery is locking the opened pdf-file

#75 iconberg opened 5 years ago
1
loading file with filecache AttributeError: 'NoneType' object has no attribute 'writestr'

#74 ta32 closed 5 years ago
1
PdfQuery | .extract problem

#73 rutgervanheijningen opened 5 years ago
0
Fix extract example in README

#72 jd closed 4 years ago
0
ValueError: All strings must be XML compatible: Unicode or ASCII, no NULL bytes or control characters

#71 vikotse opened 5 years ago
1
Allow to open password protected PDFs

#70 a-w closed 4 years ago
1
Fix range() page numbers for Python3 & prevent long cache file names

#69 chk1 opened 6 years ago
0
Can't concat str to bytes

#68 gtdrakeley opened 6 years ago
3
can load the pages I need

#67 Thug0416 opened 6 years ago
1
How does pdfquery determine the index?

#66 SalmonTT opened 6 years ago
0
Pseudo classes not working

#65 igorjacauna opened 6 years ago
0
Can't get coordinates.

#64 Aleksandern opened 6 years ago
0
Issue #53 fix

#63 jacksongs opened 6 years ago
0
Get pageid of a search object

#62 khoivan88 opened 6 years ago
0
error: invalid command 'bdist_wheel'

#61 raphinesse closed 6 years ago
1
Custom selectors don't support partial functions

#60 draperjames closed 6 years ago
1
pdf.pq( :inbbox) pulling duplicate values

#59 jganzy closed 6 years ago
1
'PDFObjRef' object does not support indexing

#58 travis-st opened 7 years ago
7
438 page PDF takes ~700 sec and ~4GB RAM to load

#57 dsvensson closed 7 years ago
1
Fix TypeError w/r to bytestring on Py3

#56 sergei-maertens closed 7 years ago
1
pdf query not catching some text in page

#55 mayur62662 closed 7 years ago
1
chardet 3.0 seems to have broken something

#54 ses4j closed 7 years ago
2
Error with annotations

#53 adamestein opened 7 years ago
3
PyQuery objects returned by items() have problems

#52 ezk84 closed 7 years ago
2
is there any user manual for this

#51 jinesh777 closed 7 years ago
1
Large Memory Usage

#50 jtsmith2 closed 7 years ago
1
ValueError: Invalid attribute name u'AAPL:AKExtras'

#49 speedplane opened 8 years ago
3
CJK languages supported?

#48 sunalbert opened 8 years ago
1
Installing pdfquery should install pdfminer.six library as a dependency

#47 marcelhekking closed 8 years ago
0
pdf.load() ValueError on pages with unicode

#46 rfire01 closed 8 years ago
0
Added laparams parameter to PDFQuery class

#45 ste-winkler closed 8 years ago
1
error with load() order

#44 pibol closed 8 years ago
2
use two or more consecutive 'in_bbox'

#43 charlessachet closed 8 years ago
1