Crash in convert_to_sage_obj when processing many non-trivial repositories

dgrove-oss commented 11 months ago

I've been trying to use ansible-content-parser to construct fine-tuning data for a number of collections and roles in Galaxy. It works correctly for the example suggested in the README (git@github.com:ansible/workshop-examples.git), but for every Galaxy repo I've tried it on, the program has crashed with a traceback from Sage scan.

355, in convert_to_sage_obj
    raise ValueError(f"{type(ari_obj)} is not a supported type for Sage objects")
ValueError: <class 'ansible_risk_insight.models.File'> is not a supported type for Sage objects

The run below is typical; it comes from running the tool against geerlingguy/ansible-role-docker.git.

Is there something I'm missing in how to invoke the tool?

(wisdom-parser) dgrove@Dave's IBM Mac ft-testing % ansible-content-parser git@github.com:geerlingguy/ansible-role-docker.git gg-docker
WARNING  Listing 57 violation(s) that are fatal
jinja[spacing]: Jinja2 spacing could be improved: {{ docker_repo_url }}/{{ (ansible_distribution == 'Fedora') | ternary('fedora','centos') }}/docker-{{ docker_edition }}.repo -> {{ docker_repo_url }}/{{ (ansible_distribution == 'Fedora') | ternary('fedora', 'centos') }}/docker-{{ docker_edition }}.repo (warning)
defaults/main.yml:48 Jinja2 template rewrite recommendation: `{{ docker_repo_url }}/{{ (ansible_distribution == 'Fedora') | ternary('fedora', 'centos') }}/docker-{{ docker_edition }}.repo`.

.... details of the 57 warnings elided...

Read documentation for instructions on how to ignore specific rule violations.
WARNING  You specified '--fix', but no files can be modified because 'yaml' is in 'skip_list'.

                         Rule Violation Summary                          
 count tag                       profile    rule associated tags         
     1 command-instead-of-module basic      command-shell, idiom         
     1 key-order[task]           basic      formatting                   
     2 literal-compare           basic      idiom                        
     3 jinja[spacing]            basic      formatting (warning)         
     2 no-free-form              basic      syntax, risk                 
     1 schema[meta]              basic      core                         
     5 name[missing]             basic      idiom                        
     1 name[casing]              moderate   idiom                        
     1 no-changed-when           shared     command-shell, idempotency   
    37 fqcn[action-core]         production formatting                   
     2 fqcn[action]              production formatting                   
     1 warning[outdated-tag]                core, experimental (warning) 

Failed: 53 failure(s), 4 warning(s) on 22 files. Last profile that met the validation criteria was 'min'.
WARNING  Following files are excluded from training set generation due to ansible-lint rule violations: handlers/main.yml,meta/main.yml,molecule/default/converge.yml,tasks/docker-compose.yml,tasks/docker-users.yml,tasks/main.yml,tasks/setup-Debian.yml,tasks/setup-RedHat.yml
Writing summary of sarif.json to /Users/dgrove/code/wisdom/ft-testing/gg-docker/metadata/sarif_summary.txt
INFO     Running data pipeline
INFO     Start scanning for 1 projects (total 9 files)
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniconda/base/envs/wisdom-parser/bin/ansible-content-parser", line 8, in <module>
    sys.exit(main())
  File "/Users/dgrove/code/wisdom/ansible-content-parser/src/ansible_content_parser/__main__.py", line 305, in main
    return_code = run_pipeline(args, repository_path)
  File "/Users/dgrove/code/wisdom/ansible-content-parser/src/ansible_content_parser/pipeline.py", line 40, in run_pipeline
    dp.run(
  File "/opt/homebrew/Caskroom/miniconda/base/envs/wisdom-parser/lib/python3.10/site-packages/sage_scan/pipeline.py", line 415, in run
    self._multi_stage_scan(input_list)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/wisdom-parser/lib/python3.10/site-packages/sage_scan/pipeline.py", line 473, in _multi_stage_scan
    self.scan(start, input_data)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/wisdom-parser/lib/python3.10/site-packages/sage_scan/pipeline.py", line 712, in scan
    sage_obj = convert_to_sage_obj(ari_obj, source)
  File "/opt/homebrew/Caskroom/miniconda/base/envs/wisdom-parser/lib/python3.10/site-packages/sage_scan/models.py", line 355, in convert_to_sage_obj
    raise ValueError(f"{type(ari_obj)} is not a supported type for Sage objects")
ValueError: <class 'ansible_risk_insight.models.File'> is not a supported type for Sage objects
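
To illustrate the failure mode in the traceback above, here is a minimal sketch, assuming a converter that dispatches on the input type and raises for anything it has no mapping for. The classes `Task`, `File`, and `SageTask` and the functions `convert` and `convert_all` are hypothetical stand-ins written for this example; they are not the sage-scan or ansible-risk-insight code. The thread does not say how sage-scan 0.0.2 resolved the problem, so the `skip_unsupported` option shows only one common way to make such a pipeline tolerant.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for scanner output types; these are NOT the real
# ansible_risk_insight.models classes.
@dataclass
class Task:
    name: str


@dataclass
class File:
    path: str


# Hypothetical converted form of a Task.
@dataclass
class SageTask:
    name: str
    source: str


def convert(obj, source):
    """Strict converter: raises on any type it has no mapping for."""
    if isinstance(obj, Task):
        return SageTask(name=obj.name, source=source)
    raise ValueError(f"{type(obj)} is not a supported type for Sage objects")


def convert_all(objs, source, skip_unsupported=False):
    """Convert a batch; optionally skip unsupported objects instead of aborting."""
    converted, skipped = [], []
    for obj in objs:
        try:
            converted.append(convert(obj, source))
        except ValueError:
            if not skip_unsupported:
                raise  # one unsupported object aborts the whole scan
            skipped.append(obj)
    return converted, skipped
```

With the strict default, a single `File` object in the batch aborts the whole run, which matches the shape of the crash reported here. With `skip_unsupported=True`, the same batch yields the converted tasks plus a list of skipped objects.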

dgrove-oss commented 11 months ago

Note that neither --skip-ansible-lint nor --fix==none avoids the crash. In both cases the run still fails with the same error: File is not a supported type for Sage objects.

TamiTakamiya commented 10 months ago

@dgrove-oss Sorry for not replying in a timely manner. Yes, this is an issue. I have opened a PR (#26) to address this issue.

TamiTakamiya commented 10 months ago

@dgrove-oss This issue was fixed in a dependency (sage-scan). Could you try running the tool again after updating sage-scan to version 0.0.2? Thank you!

dgrove-oss commented 10 months ago

I confirmed that after updating sage-scan and ansible-risk-insight to 0.0.2 and 0.2.4 respectively, the problem is fixed and ft.json is generated as expected. Thanks @TamiTakamiya !
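
For anyone hitting the same crash, the following is a small sketch for checking whether the installed packages are at least the versions reported as working above. It assumes the distribution names are `sage-scan` and `ansible-risk-insight`, as written in this thread, and that versions are plain dotted numbers. The helper names `parse_version`, `meets_minimum`, and `check_dependencies` are made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions reported as working in this thread (treated here as minimums,
# which is an assumption).
MINIMUMS = {"sage-scan": "0.0.2", "ansible-risk-insight": "0.2.4"}


def parse_version(text):
    """Parse a plain dotted numeric version such as '0.2.4' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Compare two plain dotted numeric versions component by component."""
    return parse_version(installed) >= parse_version(minimum)


def check_dependencies(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, ok)} for each package.

    ok is False when the package is missing or too old, and None when the
    installed version is not a plain dotted number (e.g. a pre-release).
    """
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, False)
            continue
        try:
            ok = meets_minimum(installed, minimum)
        except ValueError:
            ok = None  # version string this simple parser cannot handle
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for package, (installed, ok) in check_dependencies().items():
        print(f"{package}: installed={installed} ok={ok}")
```

The parser is deliberately simple and does not handle pre-release or local version suffixes; a full comparison would need a dedicated version-parsing library.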

ansible / ansible-content-parser

Crash in convert_to_sage_obj when processing many non-trivial repositories #24