microsoft / sarif-tools

A set of Python command line tools for working with SARIF files produced by code analysis tools
MIT License
91 stars 21 forks source link

Fix issue grouping bug that gave different results based on ordering #70

Closed debonte closed 1 month ago

debonte commented 2 months ago

Fix https://github.com/microsoft/sarif-tools/issues/68

We weren't updating key_and_count["common_desc"] when we determined that the common description substring for a particular code needed to be reduced. key_and_count["key"] was updated with the new substring, but key_and_count["common_desc"] was not. This meant that the last time we ran through this logic would determine what the key was, and therefore it was order-dependent.

Additionally, when checking if key_and_count["common_desc"] matches the current record, we should be checking if record["Description"] starts with the common_desc, not if they are equal.

I considered putting the unit test for this into the existing test_diff_op.py, but with all the SARIF text in there, it was going to get difficult to read. I think it'll be easier to understand these tests if we have one test per file. Happy to discuss. Also wondering if anyone has other ideas on naming these files or the test methods? For example, should regression test names include the corresponding GitHub issue number for reference?