mitre / heimdall_tools

DEPRECATED: A set of utilities for converting and working with compliance data for viewing in the heimdall applications
https://heimdall-tools.mitre.org
Other
35 stars 20 forks source link

Sarif to HDF converter #93

Closed shaopeng-gh closed 3 years ago

aaronlippold commented 3 years ago

Very cool. I'll assign folks to review

eddynaka commented 3 years ago

@shaopeng-gh , please, attach in the comment thread the original flawfinder, the sarif and the hdf file.

shaopeng-gh commented 3 years ago

these are the files I used, from FlawFinder CSV convert to Sarif, then convert the result Sarif to HDF format, then then tried the HDF file loading in Heimdall website. FF to Sarif then to HDF.zip ---Old FF to Sarif then to HDF v2.zip --- Old FF to Sarif then to HDF v3.zip --- Old FF to Sarif then to HDF v4.zip

https://heimdall-lite.cms.gov/

ejaronne commented 3 years ago

Which HDF output should we review that is produced by this mapper?:

The HDF.json shows results like this: URL : test/test-patched.c

Whereas sarif_output.json shows results as: URL : test/test-patched.c LINE : 64 COLUMN : 3 (The sarif_output.json version seems to make heimdall-lite freeze-up for some reason...)

shaopeng-gh commented 3 years ago

@ejaronne good catch, the sarif_output.json in source code is latest, I added the LINE and COLUMN as well. Not very familiar with HDF format, let me know if you have any suggestions.

sarif_output.json version seems to make heimdall-lite freeze-up ------- I just tried again using the one in source code, load successfully in one second for me, are you using this link or there is another tool https://heimdall-lite.cms.gov/

Bialogs commented 3 years ago

I noticed that the sarif file you sent over does not have any "level": "none" examples. Can you explain what type of finding by the analyzer would produce a none so we can validate that it is the correct impact level of 0?

eddynaka commented 3 years ago

I noticed that the sarif file you sent over does not have any "level": "none" examples. Can you explain what type of finding by the analyzer would produce a none so we can validate that it is the correct impact level of 0?

Hi @Bialogs , FailureLevel = None means we don't have a failure level. Something unset.

Do we have something similar to HDF?

Bialogs commented 3 years ago

I noticed that the sarif file you sent over does not have any "level": "none" examples. Can you explain what type of finding by the analyzer would produce a none so we can validate that it is the correct impact level of 0?

Hi @Bialogs , FailureLevel = None means we don't have a failure level. Something unset.

Do we have something similar to HDF?

Sounds like Not Reviewed to me but @ejaronne should probably weigh in.

shaopeng-gh commented 3 years ago

For example, FF1023 will be created several times within the generated HDF JSON. What I think would be preferable is a single FF1023 with all the different instances of FF1023 violation being included under results.

thanks, will work on this one.

The actual code that failed would be included under HDF's "code" object. Right now you cannot view the offending line.

HDF's "code" object I can't find a sample in https://github.com/mitre/heimdall_tools/tree/master/sample_jsons, this field is to store the actual code content right? I see some samples put the url in "code_desc" so I put he url, row, column in "code_desc".

rbclark commented 3 years ago

The actual code that failed would be included under HDF's "code" object. Right now you cannot view the offending line.

HDF's "code" object I can't find a sample in https://github.com/mitre/heimdall_tools/tree/master/sample_jsons, this field is to store the actual code content right? I see some samples put the url in "code_desc" so I put he url, row, column in "code_desc".

Looking at the SARIF format, it seems it includes a text snippet which shows the line of code in question. That snippet seems like it would be very useful as part of the code_desc. With the previous change Kyle requested it doesn't make much sense to put it under the other "Code" location.

shaopeng-gh commented 3 years ago

For example, FF1023 will be created several times within the generated HDF JSON. What I think would be preferable is a single FF1023 with all the different instances of FF1023 violation being included under results.

thanks, will work on this one.

I have fixed this one and send a new commit. the json size is greatly reduced.

FF to Sarif then to HDF v2.zip

Bialogs commented 3 years ago

v2 is looking really good. Now I have one additional question and some things I need to investigate on our end.

Question: "contextHash/v1" in SARIF is unique per FF#? Can we fit that into our HDF concept of "sha256"

Comment: CWE tags are not showing up in the description when viewed with Heimdall-lite viewer. Edit: This is an issue with the viewer, not the way you've represented the data so I will make an issue with Heimdall.

Comment: I also need to look into the Not Applicable case even though it has failing tests within the result. Edit2: I will suggest a change for this so we change the tests within a Not Applicable to skipped. Let me know if this makes sense

shaopeng-gh commented 3 years ago

Question: "contextHash/v1" in SARIF is unique per FF#? Can we fit that into our HDF concept of "sha256"

I just checked HDF concept of "sha256" ,it is at document level and one per document, I think it is always auto generated, not user assigned, by this common line for all converters: profile_block['sha256'] = OpenSSL::Digest::SHA256.digest(profile_block.to_s).unpack1('H*')

the contextHash/v1 you see in this Sarif is from "Fingerprint" column of FlawFinder result CSV, and it is more than 1 per document .

shaopeng-gh commented 3 years ago

I have made the "skipped" change.

FF to Sarif then to HDF v3.zip

aaronlippold commented 3 years ago

When all the sub-tests are skipped the overall status in Heimdall should be Skipped not Not Applicable.

Not Applicable is a specific context of control state which is an agreed upon condition as to when the tests do not apply to the system - not just that no tests were executed.

For example, on a system that does not have a GUI installed, a login banner for the GUI would be Not Applicable - when the system doesn't use pkcs11 for authentication, the tests to check for the packages for the PKCS#11 sub-systems would be Not Applicable. etc.

In your case, just because you didn't execute tests, the best you can say is Skipped not that you have a validated case that the tests are out of context.

Please update it so a set of sub-test results are all skipped - the control status is set to skipped.

Aaron Lippold

@.***

260-255-4779

twitter/aim/yahoo,etc. 'aaronlippold'

On Tue, May 4, 2021 at 5:18 PM Shaopeng @.***> wrote:

I have made the "skipped" change.

FF to Sarif then to HDF v3.zip https://github.com/mitre/heimdall_tools/files/6423995/FF.to.Sarif.then.to.HDF.v3.zip

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/mitre/heimdall_tools/pull/93#issuecomment-832255233, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42DKL7QB2KN3ITOJJVLTMBQAXANCNFSM43X7XXRQ .

shaopeng-gh commented 3 years ago

Thanks for the comments!

When all the sub-tests are skipped the overall status in Heimdall should be Skipped not Not Applicable. ... Please update it so a set of sub-test results are all skipped - the control status is set to skipped.

Question,

  1. do you mean for any control, if all results of that control have status = skipped like below, the control object itself should have a status as skipped: { "id": "config-rule-trm0sx", "title": "060708420889 - vpc-sg-open-only-to-authorized-ports", ... "results": [ { ... "status": "skipped" }, { ... "status": "skipped" } ] }

  2. but I don't see any status tag at the control level, in the schema https://github.com/mitre/inspecjs/blob/master/schemas/exec-json.json

  3. I see there is another status tag at the profile level but it is hard coded as status: 'loaded' in hdf.rb, don't think it is related

  4. I tried to search the string "Not Applicable" in source code, did not find any result

rbclark commented 3 years ago

Please update it so a set of sub-test results are all skipped - the control status is set to skipped.

There is no concept of a control status in HDF, it is calculated by Heimdall automatically based on the values of the sub-tests. The logic in place is as follows:

if all subtests are skipped:
   if impact is 0:
       control is Not Applicable
   else
       control is Not Reviewed
aaronlippold commented 3 years ago

Yes so I guess what I’m saying is don’t set the impact to zero

On Wed, May 5, 2021 at 8:55 AM Robert Clark @.***> wrote:

Please update it so a set of sub-test results are all skipped - the control status is set to skipped.

There is no concept of a control status in HDF, it is calculated by Heimdall automatically based on the values of the sub-tests. The logic in place is as follows:

if all subtests are skipped: if impact is 0: control is Not Applicable else control is Not Reviewed

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/mitre/heimdall_tools/pull/93#issuecomment-832664003, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42HBEIRS2DTCBA3KQ7LTME54ZANCNFSM43X7XXRQ .

--

Aaron Lippold

@.***

260-255-4779

twitter/aim/yahoo,etc. 'aaronlippold'

aaronlippold commented 3 years ago

Given that a set of all skipped tests don’t provide you data one way or the other, in other words you didn’t run a process so you don’t know if you’re good or you’re bad the best you can say is that you didn’t test.

Not applicable is the condition where you have a known condition Which you can validate which changes the scope of the requirement. As in my other example, I can test that the GUI is not installed on the server therefore this requirement doesn’t apply.

On Wed, May 5, 2021 at 9:33 AM Aaron Lippold @.***> wrote:

Yes so I guess what I’m saying is don’t set the impact to zero

On Wed, May 5, 2021 at 8:55 AM Robert Clark @.***> wrote:

Please update it so a set of sub-test results are all skipped - the control status is set to skipped.

There is no concept of a control status in HDF, it is calculated by Heimdall automatically based on the values of the sub-tests. The logic in place is as follows:

if all subtests are skipped: if impact is 0: control is Not Applicable else control is Not Reviewed

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub <https://github.com/mitre/heimdall_tools/pull/93#issuecomment-832664003 , or unsubscribe < https://github.com/notifications/unsubscribe-auth/AALK42HBEIRS2DTCBA3KQ7LTME54ZANCNFSM43X7XXRQ

.

--

Aaron Lippold

@.***

260-255-4779

twitter/aim/yahoo,etc. 'aaronlippold'

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/mitre/heimdall_tools/pull/93#issuecomment-832690779, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42GDA5EQ3DT3FCU2PIDTMFCK7ANCNFSM43X7XXRQ .

--

Aaron Lippold

@.***

260-255-4779

twitter/aim/yahoo,etc. 'aaronlippold'

shaopeng-gh commented 3 years ago

thanks for the explanation, so my understanding is, with insight of the logic how Heimdall show "Not Reviewed" vs "Not Applicable", the the goal is: when a control all results are skipped we want Heimdall tool to show as "Not Reviewed".

Is this the change I should make: for any control, if all results are skipped --if current impact is 0, hard code assign a value e.g. 0.1 --if current impact is not 0, leave it as it is.

this code I will add after all Sarif results are processed since we will not know if a control all results are skipped, until every result is loop trough and proceeded. it will be like a post processing.

let me know what the hard code value should be. the current impact in HDF file comes from Sarif "level" tag, in the Sarif the level is optional.

aaronlippold commented 3 years ago

Where are you getting the initial value of impact?

On Wed, May 5, 2021 at 1:30 PM Shaopeng @.***> wrote:

thanks for the explanation, so my understanding is, with insight of the logic how Heimdall show "Not Reviewed" vs "Not Applicable", the the goal is: when a control all results are skipped we want Heimdall tool to show as "Not Reviewed".

Is this the change I should make: for any control, if all results are skipped --if current impact is 0, hard code assign a value e.g. 0.1 --if current impact is not 0, leave it as it is.

this code I will add after all Sarif results are processed since we will not know if a control all results are skipped, until every result is loop trough and proceeded. it will be like a post processing.

let me know what the hard code value should be. the current impact in HDF file comes from Sarif "level" tag, in the Sarif the level is optional.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/mitre/heimdall_tools/pull/93#issuecomment-832875498, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42CPLIK7FT6W7V47ECLTMF6C5ANCNFSM43X7XXRQ .

--

Aaron Lippold

@.***

260-255-4779

twitter/aim/yahoo,etc. 'aaronlippold'

shaopeng-gh commented 3 years ago

Where are you getting the initial value of impact? On Wed, May 5, 2021 at 1:30 PM Shaopeng @.> wrote: thanks for the explanation, so my understanding is, with insight of the logic how Heimdall show "Not Reviewed" vs "Not Applicable", the the goal is: when a control all results are skipped we want Heimdall tool to show as "Not Reviewed". Is this the change I should make: for any control, if all results are skipped --if current impact is 0, hard code assign a value e.g. 0.1 --if current impact is not 0, leave it as it is. this code I will add after all Sarif results are processed since we will not know if a control all results are skipped, until every result is loop trough and proceeded. it will be like a post processing. let me know what the hard code value should be. the current impact in HDF file comes from Sarif "level" tag, in the Sarif the level is optional. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#93 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42CPLIK7FT6W7V47ECLTMF6C5ANCNFSM43X7XXRQ . -- -------- Aaron Lippold @. 260-255-4779 twitter/aim/yahoo,etc. 'aaronlippold'

the code we are working on in this repo converts from a Sarif format file to HDF format. the Sarif format can be produced from different tools. the current "impact" in HDF file comes from Sarif "level" tag, in the Sarif the level is optional. And Author of the Sarif file can choose to not provide it. What "impact" should we map it to? currently it is set to 0. I see this impact tag in HDF definition is from zero to one and is required.

aaronlippold commented 3 years ago

Hi, I understand. Usually what we do is we decide if an impact is not provided do we defaulted to either low or moderate. Given that in our case zero has a very specific use case behind it. I would guess you are use case is much like the other use cases we’ve had where moderate is good enough. In other words defaulted to 0.5, and if the community decides that they think one or more of the data points should be a 0.3 or a 0.7 they can open an issue

On Wed, May 5, 2021 at 3:34 PM Shaopeng @.***> wrote:

Where are you getting the initial value of impact? On Wed, May 5, 2021 at 1:30 PM Shaopeng @. > wrote: thanks for the explanation, so my understanding is, with insight of the logic how Heimdall show "Not Reviewed" vs "Not Applicable", the the goal is: when a control all results are skipped we want Heimdall tool to show as "Not Reviewed". Is this the change I should make: for any control, if all results are skipped --if current impact is 0, hard code assign a value e.g. 0.1 --if current impact is not 0, leave it as it is. this code I will add after all Sarif results are processed since we will not know if a control all results are skipped, until every result is loop trough and proceeded. it will be like a post processing. let me know what the hard code value should be. the current impact in HDF file comes from Sarif "level" tag, in the Sarif the level is optional. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#93 (comment) https://github.com/mitre/heimdall_tools/pull/93#issuecomment-832875498>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42CPLIK7FT6W7V47ECLTMF6C5ANCNFSM43X7XXRQ https://github.com/notifications/unsubscribe-auth/AALK42CPLIK7FT6W7V47ECLTMF6C5ANCNFSM43X7XXRQ . -- -------- Aaron Lippold @. 260-255-4779 twitter/aim/yahoo,etc. 'aaronlippold'

the code we are working on in this repo converts from a Sarif format file to HDF format. the Sarif format can be produced from different tools. the current "impact" in HDF file comes from Sarif "level" tag, in the Sarif the level is optional. And Author of the Sarif file can choose to not provide it. What "impact" should we map it to? currently it is set to 0. I see this impact tag in HDF definition is from zero to one and is required.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/mitre/heimdall_tools/pull/93#issuecomment-832954353, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42AKJPQ5GPD2NEBJCGDTMGMVTANCNFSM43X7XXRQ .

--

Aaron Lippold

@.***

260-255-4779

twitter/aim/yahoo,etc. 'aaronlippold'

shaopeng-gh commented 3 years ago

Hi, I understand. Usually what we do is we decide if an impact is not provided do we defaulted to either low or moderate. Given that in our case zero has a very specific use case behind it. I would guess you are use case is much like the other use cases we’ve had where moderate is good enough. In other words defaulted to 0.5, and if the community decides that they think one or more of the data points should be a 0.3 or a 0.7 they can open an issue On Wed, May 5, 2021 at 3:34 PM Shaopeng @.> wrote: Where are you getting the initial value of impact? On Wed, May 5, 2021 at 1:30 PM Shaopeng @. > wrote: thanks for the explanation, so my understanding is, with insight of the logic how Heimdall show "Not Reviewed" vs "Not Applicable", the the goal is: when a control all results are skipped we want Heimdall tool to show as "Not Reviewed". Is this the change I should make: for any control, if all results are skipped --if current impact is 0, hard code assign a value e.g. 0.1 --if current impact is not 0, leave it as it is. this code I will add after all Sarif results are processed since we will not know if a control all results are skipped, until every result is loop trough and proceeded. it will be like a post processing. let me know what the hard code value should be. the current impact in HDF file comes from Sarif "level" tag, in the Sarif the level is optional. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#93 (comment) <#93 (comment)>>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42CPLIK7FT6W7V47ECLTMF6C5ANCNFSM43X7XXRQ https://github.com/notifications/unsubscribe-auth/AALK42CPLIK7FT6W7V47ECLTMF6C5ANCNFSM43X7XXRQ . -- -------- Aaron Lippold @. 260-255-4779 twitter/aim/yahoo,etc. 'aaronlippold' the code we are working on in this repo converts from a Sarif format file to HDF format. the Sarif format can be produced from different tools. the current "impact" in HDF file comes from Sarif "level" tag, in the Sarif the level is optional. And Author of the Sarif file can choose to not provide it. What "impact" should we map it to? currently it is set to 0. I see this impact tag in HDF definition is from zero to one and is required. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#93 (comment)>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42AKJPQ5GPD2NEBJCGDTMGMVTANCNFSM43X7XXRQ . -- -------- Aaron Lippold @. 260-255-4779 twitter/aim/yahoo,etc. 'aaronlippold'

Sounds perfect to me! I will make this change now, thanks.

aaronlippold commented 3 years ago

Great, and then usually in the read me for that particular converter we just let them know the rules of engagement that we put in place. In your case it would be if the data provider set a level we converted level X to low and level wide to moderate if there was no value provided we defaulted to moderate. Something along those lines

On Wed, May 5, 2021 at 3:40 PM Shaopeng @.***> wrote:

Hi, I understand. Usually what we do is we decide if an impact is not provided do we defaulted to either low or moderate. Given that in our case zero has a very specific use case behind it. I would guess you are use case is much like the other use cases we’ve had where moderate is good enough. In other words defaulted to 0.5, and if the community decides that they think one or more of the data points should be a 0.3 or a 0.7 they can open an issue On Wed, May 5, 2021 at 3:34 PM Shaopeng @. > wrote: Where are you getting the initial value of impact? On Wed, May 5, 2021 at 1:30 PM Shaopeng @. > wrote: thanks for the explanation, so my understanding is, with insight of the logic how Heimdall show "Not Reviewed" vs "Not Applicable", the the goal is: when a control all results are skipped we want Heimdall tool to show as "Not Reviewed". Is this the change I should make: for any control, if all results are skipped --if current impact is 0, hard code assign a value e.g. 0.1 --if current impact is not 0, leave it as it is. this code I will add after all Sarif results are processed since we will not know if a control all results are skipped, until every result is loop trough and proceeded. it will be like a post processing. let me know what the hard code value should be. the current impact in HDF file comes from Sarif "level" tag, in the Sarif the level is optional. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#93 https://github.com/mitre/heimdall_tools/pull/93 (comment) <#93 (comment) https://github.com/mitre/heimdall_tools/pull/93#issuecomment-832875498>>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42CPLIK7FT6W7V47ECLTMF6C5ANCNFSM43X7XXRQ https://github.com/notifications/unsubscribe-auth/AALK42CPLIK7FT6W7V47ECLTMF6C5ANCNFSM43X7XXRQ https://github.com/notifications/unsubscribe-auth/AALK42CPLIK7FT6W7V47ECLTMF6C5ANCNFSM43X7XXRQ https://github.com/notifications/unsubscribe-auth/AALK42CPLIK7FT6W7V47ECLTMF6C5ANCNFSM43X7XXRQ . -- -------- Aaron Lippold @. 260-255-4779 twitter/aim/yahoo,etc. 'aaronlippold' the code we are working on in this repo converts from a Sarif format file to HDF format. the Sarif format can be produced from different tools. the current "impact" in HDF file comes from Sarif "level" tag, in the Sarif the level is optional. And Author of the Sarif file can choose to not provide it. What "impact" should we map it to? currently it is set to 0. I see this impact tag in HDF definition is from zero to one and is required. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#93 (comment) https://github.com/mitre/heimdall_tools/pull/93#issuecomment-832954353>, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42AKJPQ5GPD2NEBJCGDTMGMVTANCNFSM43X7XXRQ https://github.com/notifications/unsubscribe-auth/AALK42AKJPQ5GPD2NEBJCGDTMGMVTANCNFSM43X7XXRQ . -- -------- Aaron Lippold @. 260-255-4779 twitter/aim/yahoo,etc. 'aaronlippold'

Sounds perfect to me! I will make this change now, thanks.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/mitre/heimdall_tools/pull/93#issuecomment-832957442, or unsubscribe https://github.com/notifications/unsubscribe-auth/AALK42F45RWBXWP6M7BW6G3TMGNKXANCNFSM43X7XXRQ .

--

Aaron Lippold

@.***

260-255-4779

twitter/aim/yahoo,etc. 'aaronlippold'

aaronlippold commented 3 years ago

Is the level of none here really more equivalent to 0.1 rather than 0.0. 0.0 in HDF means not applicable

shaopeng-gh commented 3 years ago

publish a new version. FF to Sarif then to HDF v4.zip

shaopeng-gh commented 3 years ago

Is the level of none here really more equivalent to 0.1 rather than 0.0. 0.0 in HDF means not applicable

in Sarif:

· "warning": The rule specified by ruleId was evaluated and a problem was found. · "error": The rule specified by ruleId was evaluated and a serious problem was found. · "note": The rule specified by ruleId was evaluated and a minor problem or an opportunity to improve the code was found. · "none": The concept of “severity” does not apply to this result

let me know if you think 0.1 is better, I can make the change.

Bialogs commented 3 years ago

After discussion internally and reading your comments again I just made the change to default to 0.1 and revert my earlier suggestion to bucket as Skipped. That was confusion on my part.