#need to figure out a way to convert pdf to txt before grep

Try using pdfgrep. I was able to extract the text of the schema PDF in a fairly structured format.

From there you can use grep's context options "-A n" and "-B n" to include n lines after or before a pattern match, respectively.

Here are my results from a simple pdfgrep command:

pdfgrep " " schema_alphabetic.pdf | uniq | more
State of California
Civil Service Pay Scale - Alpha by Class Title
  Schem Class
          Code   Full Class Title
                           Compensation              SISA Footnotes         AR Crit  MCR Prob. Mo. WWG NT   CBID
  CU70     1733  ACCOUNT CLERK II
                      $2,471.00 - $3,097.00           SISA                             1        6   2       R 04
  ME10     4915  ACCOUNT MANAGER, CALIFORNIA EXPOSITION AND STATE FAIR
                      $5,553.00 - $6,901.00                01 43                       1       12   E       S 01
  JL32     4177  ACCOUNTANT I (SPECIALIST)
                 A    $3,000.00 - $3,757.00                                285         1        6   2       R 01
                 L    $3,000.00 - $3,757.00                                285         1        6   2       R 01
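
To illustrate the context options, here is a minimal sketch that runs grep over a few rows copied from the output above, so it works without the PDF. The variable names (sample, clerk, manager) are made up for this example. Against the real file you would presumably pipe pdfgrep into grep instead, e.g. pdfgrep " " schema_alphabetic.pdf | grep -A 1 "ACCOUNT CLERK II", though I have only tried the idea on the sample rows.

```shell
#!/bin/sh
# Stand-in for the output of: pdfgrep " " schema_alphabetic.pdf | uniq
# (rows copied from the results above so the sketch runs without the PDF).
sample=$(cat <<'EOF'
  CU70     1733  ACCOUNT CLERK II
                      $2,471.00 - $3,097.00           SISA                             1        6   2       R 04
  ME10     4915  ACCOUNT MANAGER, CALIFORNIA EXPOSITION AND STATE FAIR
                      $5,553.00 - $6,901.00                01 43                       1       12   E       S 01
  JL32     4177  ACCOUNTANT I (SPECIALIST)
                 A    $3,000.00 - $3,757.00                                285         1        6   2       R 01
                 L    $3,000.00 - $3,757.00                                285         1        6   2       R 01
EOF
)

# -A 1 prints each matching line plus the 1 line after it (here, the pay range).
clerk=$(printf '%s\n' "$sample" | grep -A 1 'ACCOUNT CLERK II')
printf '%s\n' "$clerk"

# -B 1 prints each matching line plus the 1 line before it (here, the class title).
manager=$(printf '%s\n' "$sample" | grep -B 1 '5,553.00')
printf '%s\n' "$manager"
```

Note that some classes, like ACCOUNTANT I (SPECIALIST), have more than one pay range line, so a larger value such as -A 2 may be needed to capture all of them.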
