Closed (tiffb closed 3 years ago)
Duplicate mappings can be detected in the parser by using the joined_id as the unique identifier of a mapping. Ideally, a warning should be printed whenever a duplicate is detected.
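A minimal sketch of that detection, not the project's actual parser code. It assumes each parsed mapping is a dict with a joined_id key; the function name find_duplicate_mappings, the sample rows, and the joined_id format shown are all hypothetical.

```python
import warnings
from collections import Counter


def find_duplicate_mappings(mappings):
    """Return the joined_ids that occur more than once, warning for each."""
    counts = Counter(m["joined_id"] for m in mappings)
    duplicates = [jid for jid, n in counts.items() if n > 1]
    for jid in duplicates:
        warnings.warn(f"duplicate mapping {jid} appears {counts[jid]} times")
    return duplicates


# Hypothetical parsed rows; the joined_id format is an assumption.
rows = [
    {"joined_id": "CM-2---T1204", "description": "Baseline Configuration"},
    {"joined_id": "CM-6---T1204", "description": "Configuration Settings"},
    {"joined_id": "CM-2---T1204", "description": "Least Functionality"},
]
duplicates = find_duplicate_mappings(rows)
```

The sketch only reports duplicates and leaves the rows untouched, so no data is dropped at this stage.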
Culling duplicates in the parser would lose data, since the input spreadsheet may include different descriptions for parallel mappings.
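To illustrate the data-loss concern, here is a hedged sketch that compares a first-wins cull with a merge that keeps every distinct description per joined_id. The row layout, the joined_id format, and the helper names cull_first_wins and merge_descriptions are assumptions made for illustration only.

```python
from collections import defaultdict


def cull_first_wins(mappings):
    """Keep only the first description seen for each joined_id."""
    kept = {}
    for m in mappings:
        kept.setdefault(m["joined_id"], m["description"])
    return kept


def merge_descriptions(mappings):
    """Collect every distinct description for each joined_id, in input order."""
    merged = defaultdict(list)
    for m in mappings:
        if m["description"] not in merged[m["joined_id"]]:
            merged[m["joined_id"]].append(m["description"])
    return dict(merged)


# Hypothetical parallel mappings that carry different descriptions.
rows = [
    {"joined_id": "CM-2---T1204",
     "description": "Baseline Configuration, Configuration Settings"},
    {"joined_id": "CM-2---T1204",
     "description": "Baseline Configuration, Configuration Settings, Least Functionality"},
]
culled = cull_first_wins(rows)
merged = merge_descriptions(rows)
```

With the cull, the second description (the one that mentions Least Functionality) is silently discarded; the merge keeps both.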
The issue stems from the original input spreadsheets, which considered mappings at the mitigation level for mapping context. Capturing everything therefore required taking unions for the output mappings, and this duplication is an unfortunate result. The input rows are typically not exact duplicates. An example:
    M1017  T1204(.001|.002)?  CM-(2|6)  Baseline Configuration, Configuration Settings
    M1031  T1204(.001)?  CM-(2|6|7)  Baseline Configuration, Configuration Settings, Least Functionality
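The overlap between the M1017 and M1031 rows can be sketched as follows. This assumes the technique and control columns are read as patterns in which the parenthesized group lists alternatives and a trailing ? makes the suffix optional; the helpers expand and expand_row are hypothetical and are not taken from the parser.

```python
import re
from itertools import product


def expand(pattern):
    """Expand 'PREFIX(a|b)' or 'PREFIX(a|b)?' into a list of concrete IDs."""
    match = re.fullmatch(r"([^()]+)\(([^()]*)\)(\?)?", pattern)
    if match is None:
        return [pattern]  # no group: the pattern is already a single ID
    prefix, alternatives, optional = match.groups()
    ids = [prefix] if optional else []  # '?' also allows the bare prefix
    ids += [prefix + alt for alt in alternatives.split("|")]
    return ids


def expand_row(techniques, controls):
    """Return every (technique, control) pair implied by one input row."""
    return set(product(expand(techniques), expand(controls)))


m1017 = expand_row("T1204(.001|.002)?", "CM-(2|6)")
m1031 = expand_row("T1204(.001)?", "CM-(2|6|7)")
overlap = sorted(m1017 & m1031)
# overlap holds the pairs produced by both rows:
# T1204 and T1204.001, each against CM-2 and CM-6
```

Under these assumptions each row expands to six pairs, and four of them appear in both rows, which is where the duplicate output mappings come from.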
The description itself is not terribly important. In fact, that field was optional when completing the mappings. Its language may also refer to multiple mappings at once. For example, consider this input:
    T1204(.001)?  CM-(2|6)  Baseline Configuration, Configuration Settings
Technically, CM-2 is Baseline Configuration and CM-6 is Configuration Settings. Yet in the output spreadsheet, the mappings for CM-6 include Baseline Configuration (and all of the other descriptors from other grouped CM mappings).
Merged pull request #59 from center-for-threat-informed-defense/bugs/#58-duplicate-mappings
Output spreadsheet contains duplicate technique-to-control mappings.