Rewrite PSBaseParser and add an optimized in-memory version

The buffering in PSBaseParser (and associated code contortions like the mess that was PDFContentParser) was fragile but also unnecesary when doing file I/O - cPython's BufferedReader implementation is considerably faster, so we can just reimplement the "parser" (really a lexer) as a character-based state machine.

But this actually would lead to an overall slowdown, because in reality, most of the time we aren't parsing PDF data from a buffered file, but from a BytesIO wrapped around an in-memory buffer. In this case, the buffering is redundant but nonetheless faster in practice since it avoids the overhead of calling BytesIO.read repeatedly.

The obvious solution is to create a separate "parser" (really a lexer) using good old regular expressions the way our ancestors intended, and simply use this one when passed an in-memory buffer.

Also means that there is a bit less inheritance abuse in the code, as PSStackParser needs to delegate to the appropriate implementation.

Fixes: #885 and #1025

Also there were some details of the PDF parsing that were incorrect. Most notably, hex strings with odd length are supposed to be padded in big-endian fashion (i.e. <abcde> is equivalent to <abcde0>) but this was not the case in the existing code (which treated this as <abcd0e> instead).

Tested on the usual test suite with nox, profiled with cProfile and time.time.

Checklist

[x] I have read CONTRIBUTING.md.
[x] I have added a concise human-readable description of the change to CHANGELOG.md.
[x] I have tested that this fix is effective or that this feature works.
[x] I have added docstrings to newly created methods and classes.
[x] I have updated the README.md and the readthedocs documentation. Or verified that this is not necessary.

pdfminer / pdfminer.six

Rewrite PSBaseParser and add an optimized in-memory version #1041