Refactoring the acquisition of the test number from the parsers.

giper45 commented 1 year ago

The following PR perform a refactoring about the benchmark test acquisition from the parsers. The last code version had a testNumber() helper method that receives a string and returns the test number. Anyway, not all the parsers used this helper method. In this refactoring, I have extracted all the logics that find the test number from the strings, added new test cases to the helper test of the testNumber() method and use the method in each test case.

In this way, a bug is easily fixable, and it is possible to implement alternative approaches to define a test id number without altering the code of the parsers.

Unfortunately, the test code does not cover all the parsers, and some class was not testable. I left some comment in the code when I was not completely sure about the correcteness of the testNumber output.

Anyway, the function covers a lot of cases, as it is possible to view in the ReaderTest function. As suggestion for new parsers, adds always the parsing logic in the helper function.

Another possible refactoring for special cases would be to override the method in children classes. But all the parsers should use this method, otherwise the parsing method is not testable and it is not possible to generalize the test id generation.

giper45 commented 1 year ago

I have found some bugs when running the createScorecard.sh script that are not covered by Unit test. I proceed to add these test cases and fix the testNumber method.

giper45 commented 1 year ago

Ok, now I have performed another check with the createScorecard.sh script.
As you can see in the following image (to the left the main utils branch, to the right my PR):

the tests give the same results. This confirm that (for the proposed result file) this PR does not break anything. Anyway, more tests should be performed after adding other result file.

davewichers commented 1 year ago

@darkspirit510 -- Do you have time to review/comment on this PR?

darkspirit510 commented 1 year ago

@davewichers if this can wait until sunday, I can review it.

darkspirit510 commented 1 year ago

I like this refactoring (actually one of the things on the list I want to fix in the project)!

Unfortunately, the test code does not cover all the parsers, and some class was not testable. That's a big issue, I know. When I started to contribute to this project in 2020, there wasn't a single test 🙈 some Parsers seem to be discontinued, for commercial products it's hard to get result files. But at least for the existing tests, this looks good.

Two general things:

Some readers use a try/catch-block around the parseInt call. Now this catch block is unreachable, because IF there has been any exception, it's caught by testNumber and -1 is returned. So if testNumber is used, the try/catch block can be removed completly.
Personal opinion (and I'd like to hear @davewichers comment about that): If there is a value written to a variable, just to be passed to another method in the next line and never used again, this feels like waste to me. So instead of

int testno = testNumber(testcase);
tcr.setNumber(testno);

I prefer

tcr.setNumber(testNumber(testcase));

At least for those short cases.

giper45 commented 1 year ago

Hi @darkspirit510 , thank for your comment!. If @davewichers agree with your comments in next days I will proceed to make these little changes.

davewichers commented 1 year ago

@giper45 - Yes - I agree. Please make the suggested changes and ask @darkspirit510 to review them when done. Thanks for contributing!

giper45 commented 1 year ago

Hi @davewichers , you are welcome! I have performed the required refactoring. Please @darkspirit510 , check if everything is ok.

darkspirit510 commented 1 year ago

Thank you for applying my suggestions. Let's just discuss your NOTE. After this, I don't see a reason why this shouldn't be merged.

Edit: I did not know that I have to "release" my comments 🙈

davewichers commented 1 year ago

I provided a few individual comments. We also need to make sure we have good regression testing of ALL parsers that have been changed in this pull request. And we should NOT be changing the logic of when a new test case result it returned or not. We should only be changing how the parsing of the test case number out of the finding is done.

giper45 commented 1 year ago

Hi guys, I will try to handle your comments in the WE.

giper45 commented 1 year ago

Thank you for all your comments, I have tried to change the code in order to align with your considerations. Anyway, as your recommendations are focused on covering all the test cases, a personal suggestion would be to remove all the obsolete / useless parsers and focuses on updated parsers. In this way, it would be possible to improve the test coverage and mantain the codebase better.

sebsnyk commented 1 year ago

@davewichers @giper45 I think this is a great PR, and we could use it for future work.

@giper45 do you need further decisions to continue? There's also a commented code block that could be removed?

giper45 commented 1 year ago

Hi @sebsnyk, thank you for the suggestion about the code block, I have removed it. And thank you for your comment, I just want to add two things:

About potential regression bugs: it is true that several sections were moved into the testNumber() method, but now it is much simpler to fix potential bugs: just add a new test case to the ReaderTest class and run the mvn test command.
If you integrate the PR, I have implemented another interesting feature that I think it could be of interest. In particular, I am using the OwaspBenchmarkUtils code to test against other benchmarks. If all the parsers use testNumber(), it is possible to add an autogenerated id that can be used with other benchmarks that have not the same logic of the OwaspBenchmark (i.e., the presence of the BenchmarkTest string in the test case). I will try to update the PR to the current branch and to provide you another PR if this will be accepted.

Regards

sebsnyk commented 1 year ago

Hi @giper45 thank you for the details.

In addition to your central ID handling, I am currently looking at generic relative paths instead of a Benchmark name as a potential refactor. Not all projects use a naming scheme like this Benchmark, but instead may have vulnerabilities spread across different files and file formats. :) I see this PR as a great foundation.

davewichers commented 1 year ago

@darkspirit510 - Can you review all this and let me know what you think?

@giper45 - I just merged a small contribution to the HCLAppScanSourceReader. Can you resolve that merge conflict?

giper45 commented 1 year ago

Hi @davewichers , I have merged. Please decide if it is possible to merge this feature, otherwise conflicts will often occur. Two other notes:

When you will add new parsers, or changes to the parsers, use always the testNumber() method
Always add test strings in the ReaderTest class, especially if some issue will occur. In this way we can increase the reliability of the testNumber parsing.

giper45 commented 1 year ago

Hi @sebsnyk take a look at this (possible further PR after this): https://github.com/giper45/BenchmarkUtils/tree/feature-autogenerate-testid

It should cover the cases in which there is another relative test case name.

I still have to merge in the new development of this PR.

darkspirit510 commented 1 year ago

@davewichers I already reviewed this a while ago and think it can be merged. All the commits in the meantime were fixing merge conflicts.

OWASP-Benchmark / BenchmarkUtils

Refactoring the acquisition of the test number from the parsers. #35