mitre / inspec_tools

A command-line and ruby API of utilities, converters and tools for creating, converting and processing security baseline formats, results and data
https://inspec-tools.mitre.org/
Other

pdf2inspec fails to parse pdf from CIS #189

Closed: skewled closed this issue 2 years ago

skewled commented 4 years ago

I'm using the latest version of inspec_tools, and pdf2inspec fails to parse CIS_Amazon_Web_Services_Three-tier_Web_Architecture_Benchmark_v1.0.0.pdf with the error below.

Attachments: debug_text.txt, pdf_text.txt

I couldn't upload the files with their binary-format extension, so the .txt extension was added intentionally.

inspec_tools pdf2inspec -p CIS_Amazon_Web_Services_Three-tier_Web_Architecture_Benchmark_v1.0.0.pdf -o . -f ruby -s true -d
Expected at least 1 of CONTROL at line 1 char 1.
`- Failed to match sequence (HEADER APPLICABILITY DESCRIPTION? RATIONALE? AUDIT? REMEDIATION? IMPACT? DEFAULT_VALUE? REFERENCES? CIS_CONTROLS?) at line 3 char 1.
   `- Failed to match sequence ('Profile Applicability:' NEWLINE? applicability:(line:((ATTRIBUTE_ABSENT .){1, }){0, })) at line 3 char 1.
      `- Expected "Profile Applicability:", but got "......................" at line 3 char 1.
Traceback (most recent call last):
    10: from /usr/local/bin/inspec_tools:23:in `<main>'
     9: from /usr/local/bin/inspec_tools:23:in `load'
     8: from /Library/Ruby/Gems/2.6.0/gems/inspec_tools-2.0.3/exe/inspec_tools:14:in `<top (required)>'
     7: from /Library/Ruby/Gems/2.6.0/gems/inspec-core-4.19.0/lib/inspec/base_cli.rb:35:in `start'
     6: from /Library/Ruby/Gems/2.6.0/gems/thor-1.0.1/lib/thor/base.rb:485:in `start'
     5: from /Library/Ruby/Gems/2.6.0/gems/thor-1.0.1/lib/thor.rb:392:in `dispatch'
     4: from /Library/Ruby/Gems/2.6.0/gems/thor-1.0.1/lib/thor/invocation.rb:127:in `invoke_command'
     3: from /Library/Ruby/Gems/2.6.0/gems/thor-1.0.1/lib/thor/command.rb:27:in `run'
     2: from /Library/Ruby/Gems/2.6.0/gems/inspec_tools-2.0.3/lib/inspec_tools/plugin_cli.rb:133:in `pdf2inspec'
     1: from /Library/Ruby/Gems/2.6.0/gems/inspec_tools-2.0.3/lib/inspec_tools/pdf.rb:38:in `to_inspec'
/Library/Ruby/Gems/2.6.0/gems/inspec_tools-2.0.3/lib/inspec_tools/pdf.rb:60:in `parse_controls': undefined method `each' for nil:NilClass (NoMethodError)
aaronlippold commented 4 years ago

We will take a look at this. As a workaround, you can use xlsx2inspec and/or csv2inspec: if you can get the Excel spreadsheet version of the benchmark, you should be able to move forward.

Bialogs commented 4 years ago

@skewled Are you using the docker image?

Atharex commented 2 years ago

Are there any updates on this? I tried both my own installation of the 3.1.0 gem and the Docker image mitre/inspec_tools:latest, and I get the same error with both:

>>> inspec_tools pdf2inspec -p /opt/CIS_CentOS_Linux_7_Benchmark_v3.1.2.pdf -o test

Expected at least 1 of CONTROL at line 1 char 1.
`- Failed to match sequence (HEADER APPLICABILITY DESCRIPTION? RATIONALE? AUDIT? REMEDIATION? IMPACT? DEFAULT_VALUE? REFERENCES? CIS_CONTROLS?) at line 1 char 1.
   `- Failed to match sequence (NEWLINE? SPACES? header:(section_num:SECTION_NUM title:TITLE score:SCORE) NEWLINE) at line 1 char 1.
      `- Failed to match sequence (section_num:SECTION_NUM title:TITLE score:SCORE) at line 12598 char 1.
         `- Failed to match sequence (LPARN SCORED RPARN) at line 12598 char 1.
            `- Premature end of input at line 12598 char 1.
Traceback (most recent call last):
        10: from /usr/local/bin/inspec_tools:23:in `<main>'
         9: from /usr/local/bin/inspec_tools:23:in `load'
         8: from /usr/local/lib/ruby/gems/2.7.0/gems/inspec_tools-3.1.0/exe/inspec_tools:14:in `<top (required)>'
         7: from /usr/local/lib/ruby/gems/2.7.0/gems/inspec-core-4.46.13/lib/inspec/base_cli.rb:35:in `start'
         6: from /usr/local/lib/ruby/gems/2.7.0/gems/thor-1.1.0/lib/thor/base.rb:485:in `start'
         5: from /usr/local/lib/ruby/gems/2.7.0/gems/thor-1.1.0/lib/thor.rb:392:in `dispatch'
         4: from /usr/local/lib/ruby/gems/2.7.0/gems/thor-1.1.0/lib/thor/invocation.rb:127:in `invoke_command'
         3: from /usr/local/lib/ruby/gems/2.7.0/gems/thor-1.1.0/lib/thor/command.rb:27:in `run'
         2: from /usr/local/lib/ruby/gems/2.7.0/gems/inspec_tools-3.1.0/lib/inspec_tools/plugin_cli.rb:140:in `pdf2inspec'
         1: from /usr/local/lib/ruby/gems/2.7.0/gems/inspec_tools-3.1.0/lib/inspec_tools/pdf.rb:33:in `to_inspec'
/usr/local/lib/ruby/gems/2.7.0/gems/inspec_tools-3.1.0/lib/inspec_tools/pdf.rb:55:in `parse_controls': undefined method `each' for nil:NilClass (NoMethodError)
docker run -it --rm -v$(pwd):/share mitre/inspec_tools pdf2inspec -p /opt/CIS_CentOS_Linux_7_Benchmark_v3.1.2.pdf -o test
Expected at least 1 of CONTROL at line 1 char 1.
`- Failed to match sequence (HEADER APPLICABILITY DESCRIPTION? RATIONALE? AUDIT? REMEDIATION? IMPACT? DEFAULT_VALUE? REFERENCES? CIS_CONTROLS?) at line 1 char 1.
   `- Failed to match sequence (NEWLINE? SPACES? header:(section_num:SECTION_NUM title:TITLE score:SCORE) NEWLINE) at line 2 char 1.
      `- Failed to match sequence (section_num:SECTION_NUM title:TITLE score:SCORE) at line 8727 char 1.
         `- Failed to match sequence (LPARN SCORED RPARN) at line 8727 char 1.
            `- Premature end of input at line 8727 char 1.
Traceback (most recent call last):
        10: from /usr/local/bundle/bin/inspec_tools:23:in `<main>'
         9: from /usr/local/bundle/bin/inspec_tools:23:in `load'
         8: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/exe/inspec_tools:14:in `<top (required)>'
         7: from /usr/local/bundle/gems/inspec-core-4.38.9/lib/inspec/base_cli.rb:35:in `start'
         6: from /usr/local/bundle/gems/thor-1.1.0/lib/thor/base.rb:485:in `start'
         5: from /usr/local/bundle/gems/thor-1.1.0/lib/thor.rb:392:in `dispatch'
         4: from /usr/local/bundle/gems/thor-1.1.0/lib/thor/invocation.rb:127:in `invoke_command'
         3: from /usr/local/bundle/gems/thor-1.1.0/lib/thor/command.rb:27:in `run'
         2: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/lib/inspec_tools/plugin_cli.rb:140:in `pdf2inspec'
         1: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/lib/inspec_tools/pdf.rb:33:in `to_inspec'
/usr/local/bundle/gems/inspec_tools-3.1.0.pre1/lib/inspec_tools/pdf.rb:55:in `parse_controls': undefined method `each' for nil:NilClass (NoMethodError)
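
Note on the tracebacks above: in each case the grammar error is followed by a NoMethodError because `each` is called on nil inside `parse_controls`, which suggests the parse result is nil when no CONTROL matches. The following is only a minimal sketch of guarding against that case; `NoControlsParsedError` and `control_titles` are hypothetical names for illustration and are not part of inspec_tools, and the shape of the parse result (an array of hashes with a `:title` key) is an assumption.

```ruby
# Hypothetical error class, used here only for illustration.
class NoControlsParsedError < StandardError; end

# Hypothetical helper: takes the result of a parse step, which is assumed
# to be an array of control hashes, or nil when the grammar matched nothing.
def control_titles(parse_result)
  if parse_result.nil?
    # Fail with a readable message instead of "undefined method `each' for nil".
    raise NoControlsParsedError,
          'No controls could be parsed from the extracted PDF text'
  end

  parse_result.map { |control| control.fetch(:title) }
end
```

With a guard like this, a benchmark whose layout does not match the grammar would stop with a single explicit message rather than a secondary NoMethodError after the grammar report.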
robocop-bob commented 2 years ago

Similar issue with CIS_Apple_macOS_12.0_Monterey_Benchmark_v1.0.0.pdf (copied from https://workbench.cisecurity.org/files/3644/download/4588)

The test example from examples/CIS_Ubuntu_Linux_16.04_LTS_Benchmark_v1.0.0.pdf is not working either. The file is probably corrupted; I can't open it on a Mac.

docker run -it -v$(pwd):/share mitre/inspec_tools pdf2inspec -p pdf2inspec -p examples/CIS_Ubuntu_Linux_16.04_LTS_Benchmark_v1.0.0.pdf -o test

Traceback (most recent call last):
    21: from /usr/local/bundle/bin/inspec_tools:23:in `<main>'
    20: from /usr/local/bundle/bin/inspec_tools:23:in `load'
    19: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/exe/inspec_tools:14:in `<top (required)>'
    18: from /usr/local/bundle/gems/inspec-core-4.38.9/lib/inspec/base_cli.rb:35:in `start'
    17: from /usr/local/bundle/gems/thor-1.1.0/lib/thor/base.rb:485:in `start'
    16: from /usr/local/bundle/gems/thor-1.1.0/lib/thor.rb:392:in `dispatch'
    15: from /usr/local/bundle/gems/thor-1.1.0/lib/thor/invocation.rb:127:in `invoke_command'
    14: from /usr/local/bundle/gems/thor-1.1.0/lib/thor/command.rb:27:in `run'
    13: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/lib/inspec_tools/plugin_cli.rb:140:in `pdf2inspec'
    12: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/lib/inspec_tools/pdf.rb:28:in `to_inspec'
    11: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/lib/inspec_tools/pdf.rb:100:in `read_pdf'
    10: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/lib/inspec_tools/pdf.rb:100:in `new'
     9: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/lib/utilities/extract_pdf_text.rb:8:in `initialize'
     8: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/lib/utilities/extract_pdf_text.rb:14:in `read_text'
     7: from /usr/local/bundle/gems/inspec_tools-3.1.0.pre1/lib/utilities/extract_pdf_text.rb:14:in `new'
     6: from /usr/local/bundle/gems/pdf-reader-2.5.0/lib/pdf/reader.rb:117:in `initialize'
     5: from /usr/local/bundle/gems/pdf-reader-2.5.0/lib/pdf/reader.rb:117:in `new'
     4: from /usr/local/bundle/gems/pdf-reader-2.5.0/lib/pdf/reader/object_hash.rb:45:in `initialize'
     3: from /usr/local/bundle/gems/pdf-reader-2.5.0/lib/pdf/reader/object_hash.rb:45:in `new'
     2: from /usr/local/bundle/gems/pdf-reader-2.5.0/lib/pdf/reader/xref.rb:61:in `initialize'
     1: from /usr/local/bundle/gems/pdf-reader-2.5.0/lib/pdf/reader/xref.rb:100:in `load_offsets'
/usr/local/bundle/gems/pdf-reader-2.5.0/lib/pdf/reader/buffer.rb:137:in `find_first_xref_offset': PDF does not contain EOF marker (PDF::Reader::MalformedPDFError)
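
Note on the traceback above: the MalformedPDFError is raised by pdf-reader before any inspec_tools parsing happens, so it points at the file itself rather than at the grammar. A PDF normally ends with a `%%EOF` marker, and one quick way to check a file before running pdf2inspec is to look for that marker near the end. This is a rough sketch, not how pdf-reader does it internally; `pdf_eof_marker?` is a hypothetical helper and the 1024-byte window is an assumption.

```ruby
# Hypothetical helper: returns true if '%%EOF' appears in the last
# `tail_bytes` bytes of the file (the window size is an assumption).
def pdf_eof_marker?(path, tail_bytes = 1024)
  File.open(path, 'rb') do |file|
    # Seek to the start of the tail window, or to 0 for small files.
    file.seek([file.size - tail_bytes, 0].max)
    file.read.include?('%%EOF')
  end
end
```

For example, `pdf_eof_marker?('examples/CIS_Ubuntu_Linux_16.04_LTS_Benchmark_v1.0.0.pdf')` returning false would be consistent with the file being truncated or not a real PDF, for instance if a download was incomplete.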