fkie-cad / cwe_checker

cwe_checker finds vulnerable patterns in binary executables
https://docs.cwe-checker.io
GNU Lesser General Public License v3.0
1.1k stars 115 forks source link

Check for CWE 252 #451

Closed vobst closed 5 months ago

vobst commented 6 months ago

This PR tracks the progress on the addition of a check for CWE252. It is based on the Taint Analysis abstractions introduced by PR #450. While in draft state, it will be rebased and force-pushed without notice!

Use the new Taint Analysis abstraction to implement a check for CWE252. The check performs a limited interprocedural analysis.

TODO

Enkelmann commented 6 months ago

In the short term, the huge list of symbol names in lkm_config.json is probably OK. In the long term, we still need to find a better solution, especially since the list may go out of sync with newer Linux kernel versions over time. Maybe provide or document a way how to extract the function list from Linux kernel sources? On the other hand, that would be more inconvenient for users, because they would need to do it before running the cwe_checker...

vobst commented 6 months ago

I am also not super happy with the huge list in the repo, maybe we can start with a trimmed down list, e.g., only API functions that are defined by the core kernel (~4k).

Maybe provide or document a way how to extract the function list from Linux kernel sources? On the other hand, that would be more inconvenient for users, because they would need to do it before running the cwe_checker...

Currently the process I use to generate this list is not something I'd consider ready for publication. Maybe one can do something more elegant but that would be a bigger project.

In the short term, the huge list of symbol names in lkm_config.json is probably OK. In the long term, we still need to find a better solution, especially since the list may go out of sync with newer Linux kernel versions over time.

The list getting out-of-sync isn't that much of a problem though - in the end we want to be able to analyze modules that were compiled for all sorts of kernel versions. Thus, I guess in the long term we should periodically regenerate the list, but take the union with the previous one.

Enkelmann commented 6 months ago

I am also not super happy with the huge list in the repo, maybe we can start with a trimmed down list, e.g., only API functions that are defined by the core kernel (~4k).

I think that would be a good compromise. 4k is at least a lot less than 15k.

Currently the process I use to generate this list is not something I'd consider ready for publication. Maybe one can do something more elegant but that would be a bigger project.

Then let us postpone it for the time being and stick to using the resulting list in this PR. If it remains a bigger project, we could also think about moving it into a repository separate from the core cwe_checker code.

The list getting out-of-sync isn't that much of a problem though - in the end we want to be able to analyze modules that were compiled for all sorts of kernel versions. Thus, I guess in the long term we should periodically regenerate the list, but take the union with the previous one.

Good point!

vobst commented 6 months ago

I am also not super happy with the huge list in the repo, maybe we can start with a trimmed down list, e.g., only API functions that are defined by the core kernel (~4k).

I think that would be a good compromise. 4k is at least a lot less than 15k.

Turns out there is warn_unused_result, changed the config to include only the ~600 functions that have it.

vobst commented 5 months ago

Force push only affects the commit that rebases with master ... nothing relevant to the review.

lib/abstract_domain: add merge_with method to AbstractDomain has a commit message with further reasoning on the change. Please check if it makes sense or if I missed something.

vobst commented 5 months ago

Except for the MemoryTaintMergeStrategy everything looks fine.

This one should be fixed now.

At least I think so, because I had to change my review process due to the force-push.

Yea, sorry, didn't expect it would throw GH off that much.

There is still the question on whether we should use -O0 or -O2 for the acceptance tests. In the end I will follow your decision here if you want to keep -O2.

I've gone with O0 now since it means that more tests work (in the sense of reporting 9/9). I developed the tests with 02 in mind so I cannot tell if they still work as expected, only that the expected nr of warnings comes out in the end. I guess we will find out once we have more fine grained filtering that can also verify addresses and reasons.

I also modified one of the interprocedural test cases to cover the case where an object from the caller has to be merged with an offset. Seems to work as expected.

There also might be some old review comments that I have not yet marked as resolved. If some of them are resolved and I just missed it in the review, please tell me!

I think all open conversions should be resolved now, but double check for yourself.

Have a nice weekend!

vobst commented 5 months ago

Nothing left to nitpick for me, PR can be merged. Thanks for the great PR!

Thanks for taking the time to review this PR and all of the explanations along the way!

I'd now rearrange the commits a bit to make them more suitable for the permanent commit log, i.e., write proper commit messages and combine/reorder things that belong together. This will not change the content of any commit. Then I'd also add one commit that updates the changelog for the current release in the end. Sounds OK?

vobst commented 5 months ago

I have cleaned up the history and will merge it once the CI passes.

Is there an unwritten rule that the CHANGES.md must be onliners? I'd be adding the first multiline entry here :)

Enkelmann commented 5 months ago

Is there an unwritten rule that the CHANGES.md must be onliners? I'd be adding the first multiline entry here :)

No, that was just a habit. If you want to preserve the style, keep entries short (maybe 2 lines max?) and use the description texts of the referenced PRs for more detailed descriptions. But the style itself is only personal preference, you can change it up if you prefer a different style for the CHANGES.md.

vobst commented 5 months ago

Is there an unwritten rule that the CHANGES.md must be onliners? I'd be adding the first multiline entry here :)

No, that was just a habit. If you want to preserve the style, keep entries short (maybe 2 lines max?) and use the description texts of the referenced PRs for more detailed descriptions. But the style itself is only personal preference, you can change it up if you prefer a different style for the CHANGES.md.

Okay, just wanted to be sure. It still renders as one line in the UI, looks like the limit there is about 120 chars and I tend to wrap at 80.