I'm not able to run your program directly as it references additional things, but I think I see what's wrong from reading reporting.gen_stig.
You're calling that function with:
$ reporting.gen_stig 0 1721456521 1721456534 src/stigs/Solaris/11/V0216246.sh Elem:1,Item:2,Key:3,Value:4 Elem:one,Item:two,Key:three,Value:four Elem:1,Item:2,Key:3,Value:4
So args[4] is Elem:1,Item:2,Key:3,Value:4
$ args=(0 1721456521 1721456534 src/stigs/Solaris/11/V0216246.sh Elem:1,Item:2,Key:3,Value:4 Elem:one,Item:two,Key:three,Value:four Elem:1,Item:2,Key:3,Value:4)
# The function does this
$ errors=( ${args[4]//,/ } )
$ declare -p errors
declare -a errors=([0]="Elem:1" [1]="Item:2" [2]="Key:3" [3]="Value:4")
$ # Then uses errors like this
$ json @errors:raw[]??
{"errors":[Elem:1,Item:2,Key:3,Value:4]}
So it's passing values that aren't JSON to a :raw value. :raw assumes the input is already valid JSON, so it just emits it as-is. If you do the same with :json you'll get an error:
$ json @errors:json[]??
json.encode_json(): not all inputs are valid JSON: 'Elem:1' 'Item:2' 'Key:3' 'Value:4'
json(): Could not encode the value of argument '@errors:json[]??' as an array with 'json' values. Read from array-variable $errors.
␘
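For comparison, if the array entries are already valid JSON, both types should agree (a quick sketch based on the behaviour described above, not a verbatim run):

$ errors=('{"msg":"bad passwd"}' '{"msg":"bad shadow"}')
$ json @errors:raw[]??
{"errors":[{"msg":"bad passwd"},{"msg":"bad shadow"}]}
$ json @errors:json[]??
{"errors":[{"msg":"bad passwd"},{"msg":"bad shadow"}]}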
I think it would be a good idea to use :json when developing, and only switch to :raw at runtime once you know it's working in principle. (I guess it would make sense for json.bash to support an environment variable option, something like JSON_BASH_DANGEROUSLY_DISABLE_VALIDATION, to make :json act like :raw.)
How to correct this problem depends on what you expect errors to be. I guess they should be objects, so you'd need something like the for loop you have in reporting._gen_stig_objects to encode each entry as an object before outputting the errors array.
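For example, something like this (a quick sketch; the _encode_errors name is made up, and the key:value,key:value input shape is taken from your gen_stig call above):

_encode_errors()
{
  local entry attrs
  local -a encoded
  for entry in "$@"; do
    attrs=${entry//:/=}             # Elem:1,Item:2 -> Elem=1,Item=2
    out=encoded json ...:{}@attrs   # encode one entry as an object
  done
  json errors:json[]@encoded
}

$ _encode_errors Elem:1,Item:2,Key:3,Value:4 Elem:one,Item:two,Key:three,Value:four
{"errors":[{"Elem":"1","Item":"2","Key":"3","Value":"4"},{"Elem":"one","Item":"two","Key":"three","Value":"four"}]}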
As an aside, I noticed you have json @${type}:raw[] in reporting._gen_stig_objects. I'd recommend against including a variable in the argument in this way, as it lets the $type var do unexpected things, like reference the absolute path of a file, or an unexpected environment variable. It's OK in principle if you take care to sanitise the value, but best to avoid it if you can. There's some advice on this in the README section on security which might be helpful.
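If you do need a dynamic key, one option is to read the key from a variable reference rather than interpolating it into the argument string (a sketch; the emit_group name is made up, and the nameref is just one way to reach the array whose name is in $type):

emit_group()
{
  local type=${1:?}
  local -n _values=${type:?}   # nameref to the array named by $type
  json @type:json[]@_values    # key is read from $type, values from the array
}

$ errors=('{"a":1}')
$ emit_group errors
{"errors":[{"a":1}]}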
I notice you've got a few places where variables aren't quoted, which may lead to unexpected splitting too. Personally I've adopted the practice of using ${var:?} syntax by default for variables in bash — this fails when a var is unset or empty. This way you have to consider whether you've verified something is set, and explicitly allow it to be empty, e.g. with ${var:-default}.
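For example (a trivial sketch; the names are invented):

process_report()
{
  local file=${1:?usage: process_report <file>}   # fail fast if unset/empty
  local dest=${2:-/tmp/report.json}               # explicitly allowed to default
  printf 'encoding %s -> %s\n' "${file:?}" "${dest:?}"
}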
BTW, in #15 I've added a feature to use ggrep by default and let you choose a grep command. I don't know if you're able to ensure the package providing GNU grep is available on the systems you're running your script on, but if you are then this could let you use :json rather than :raw.
Both comments will be useful. I think the problem I am seeing comes from using multiple functions and the out=errors or other global variables.
func_one()
{
  local obj errors
  local -a objs
  objs=( ${@} )
  for obj in ${objs[@]}; do
    obj="${obj//:/=}"
    out=errors json ...:string{}@obj
  done
  json @errors:json[]
}
func_two()
{
  local -a args
  local errors
  args=( ${@} )
  out=errors func_one "${args[@]}"
  json @errors:json[]??
}
Looks like a local scope issue, because if I change errors to anything else in func_two it works.
Are there reserved names? I find it odd simply because I am using the local keyword in each function to change the scope. Do you have a better way to pass the JSON objects from function to function?
Thanks for this example! Just to make sure I'm seeing the same behaviour as you, this is what I'm seeing:
$ func_one foo:a,bar:b
{"errors":[{"foo":"a","bar":"b"}]}
$ func_two foo:a,bar:b
json(): Could not process argument '@errors:json[]??'. Its value references unbound variable $errors. (Use the '~' flag after the :type to treat a missing value as empty.)
␘
The problem here is that when func_two calls func_one, the errors var in func_one is (correctly) masking the errors var in the outer func_two scope. The final json call in func_one sees out=errors and appends its output to the errors array var in func_one scope, but the errors var in func_two scope is unchanged, because it's masked.
The out=var return style relies on the named var not being accessible at the point of the call. To avoid this clashing, you need to make sure the var you use with out= is something that won't be used in the inner function call.
If you change func_two like this, it works:
func_two()
{
  local -a args
  local _f2_errors
  args=( ${@} )
  out=_f2_errors func_one "${args[@]}"
  json errors:json[]??@_f2_errors
}
$ func_one foo:a,bar:b
{"errors":[{"foo":"a","bar":"b"}]}
$ func_two foo:a,bar:b
{"errors":[{"errors":[{"foo":"a","bar":"b"}]}]}
I guess you're not intending to have two layers of "errors" properties though. If you could share an example of the JSON data layout you're aiming for (and the source data), I could suggest how to adjust the funcs to generate it.
I don't think this out= pattern is a normal bash idiom. It's something I came up with when I was optimising the performance to avoid using a subshell to capture stdout. It's a bit odd, but once you realise it's just using bash's slightly odd dynamic scoping rules that allow functions to modify outer scopes, I think it starts to make more sense.
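For illustration, here's a minimal standalone sketch of the same pattern (the names are invented for this example, and it's not how json.bash is implemented internally):

# The callee appends into whatever array the caller names via $out.
emit()
{
  local -n _out=${out:?}       # nameref resolves in the caller's scope
  _out+=("value from emit")
}

caller()
{
  local -a results=()
  out=results emit             # no subshell, no stdout capture
  declare -p results           # declare -a results=([0]="value from emit")
}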
I really appreciate your responsiveness. The JSON object I am trying to recreate with your project can be found in the previous issue.
The biggest complexity is that the stigs array can have one or more of the errors, inspected & warnings arrays of objects. This is where I am running into issues generating the JSON objects from one function and passing them to the next. Ultimately the report would be called from one function.
I am really pretty close. Just having some problems with the issues you mentioned about having two layers of "errors" and other potential objects per STIG module.
The end report ultimately has the following parts:
OK, yes. The approach to take seems to depend on whether the JSON data is being generated as part of the original scanning logic, or as a second pass to represent the output of the scanning process as JSON. I imagine this depends on the architecture you already have in place.
My intuition is that it would be more work to convert the existing output to JSON, as you need to also parse the output before representing it as JSON, whereas if you generate JSON at the source, you'll always have the source data to directly encode. The disadvantage of directly encoding JSON would be that you'd have a fixed output representation. (Although you could use a tool like jq to do a general transformation afterwards; e.g. if you needed XML, you could use an XSLT 3 processor, which can consume JSON and generate XML.)
(I say this because it seems like you're more on the side of parsing the output at the moment, e.g. passing data like Elem:1,Item:2,Key:3,Value:4 to the report function.)
Although bash data structures are quite limited, it should be possible to separate concerns of scanning/reporting and encoding as JSON by keeping the report generated by the scanner in memory. If the scanning/reporting phase can encode its results as a combination of bash arrays and associative arrays, a second report-encoding pass could walk the data tree generated by the scanning phase and encode it as JSON.
I've not had a chance to make a practical example, but I'll play with this a bit and get back to you. Unless I'm completely off target!
The current implementation is a script that can and does call multiple scripts. Each (including the main script) generates a part of the report.
The called script(s) can have multiple arrays and associative arrays for errors, etc.
It seems to be a scoping issue.
I have the functions written to generate the report for each called script, but when multiple scripts are called only one prints because the objects seem to be globally scoped.
Pretty sure anyway; I've been working on it but need to do some more debugging to confirm.
Sorry, been having a busy week! I thought I'd do an example of generating the whole depth of one of your objects from #11. I've used a careful approach to naming variables with prefixes to avoid naming conflicts at lower levels, e.g. several of the functions use an "entries" var, but named with a different prefix to avoid clashing. It's not so pretty, but it works reliably; this is the pattern the json.bash code itself uses.
It doesn't generate every single property from your example, but it includes an example of each different type of nested object, I think. I hope it's somewhat similar in structure to how you're generating report sections, and hopefully it should serve as an example of how to get around a clash, etc. It also shows a few ways of representing data before emitting, e.g. report_system using an implicit associative array to hold data, and merging multiple objects into one in report_stig.V0216321.
source json.bash

function report_stig.V0216321.validated_files() {
  local -a _rvf_entries
  out=_rvf_entries json Name=/etc/default/passwd Option=MAXDAYS Expected=56 Current=Missing
  json Files:json[]@_rvf_entries
}

function report_stig.V0216321.validated_users() {
  local -a _rvu_entries
  out=_rvu_entries json Username=root Expected=56 Current=Missing
  out=_rvu_entries json Username=user1 Expected=56 Current=Missing
  out=_rvu_entries json Username=user2 Expected=56 Current=Missing
  json Users:json[]@_rvu_entries
}

function report_stig.V0216321.validated() {
  local -a _rv_entries
  out=_rv_entries report_stig.V0216321.validated_files
  out=_rv_entries report_stig.V0216321.validated_users
  json validated:json[]@_rv_entries
}

function report_stig.V0216321() {
  local -a _r_entries
  out=_r_entries json id=V0216321 \
    title="User passwords must be changed at least every 60 days." \
    meta:{}="date=26-Jul-2023,rule_id=SV-216321r646926_rule,version=SOL-11.1-040010"
  out=_r_entries report_stig.V0216321.validated
  out=_r_entries json summary:{}="inspected=4,errors=1,warnings=0,failed_rate=25.00%" \
    metrics:{}="start=1721456331,end=1721456344,time=13 Sec."
  # This is merging several individual JSON objects in the _r_entries array into
  # a single object. Can be useful if you need to build pieces separately.
  json ...:json{:json}@_r_entries
}

function report_stigs() {
  # do this twice to simulate having multiple
  report_stig.V0216321
  report_stig.V0216321
}

function report_system() {
  # could generate these dynamically
  system[hostname]=solaris
  system[kernel]=11.4.42.111.0
  system[OS]=Solaris
  system[version]=11
  system[architecture]=i386
}

function report() {
  local date=20240720-061533 # placeholder
  local -a _r_stigs
  local -A system
  report_system
  out=_r_stigs report_stigs
  json @date @system:{} stigs:json[]@_r_stigs
}
$ . example.sh
$ report | jq
{
  "date": "20240720-061533",
  "system": {
    "OS": "Solaris",
    "version": "11",
    "hostname": "solaris",
    "architecture": "i386",
    "kernel": "11.4.42.111.0"
  },
  "stigs": [
    {
      "id": "V0216321",
      "title": "User passwords must be changed at least every 60 days.",
      "meta": {
        "date": "26-Jul-2023",
        "rule_id": "SV-216321r646926_rule",
        "version": "SOL-11.1-040010"
      },
      "validated": [
        {
          "Files": [
            {
              "Name": "/etc/default/passwd",
              "Option": "MAXDAYS",
              "Expected": "56",
              "Current": "Missing"
            }
          ]
        },
        {
          "Users": [
            {
              "Username": "root",
              "Expected": "56",
              "Current": "Missing"
            },
            {
              "Username": "user1",
              "Expected": "56",
              "Current": "Missing"
            },
            {
              "Username": "user2",
              "Expected": "56",
              "Current": "Missing"
            }
          ]
        }
      ],
      "summary": {
        "inspected": "4",
        "errors": "1",
        "warnings": "0",
        "failed_rate": "25.00%"
      },
      "metrics": {
        "start": "1721456331",
        "end": "1721456344",
        "time": "13 Sec."
      }
    },
    {
      "id": "V0216321",
      "title": "User passwords must be changed at least every 60 days.",
      "meta": {
        "date": "26-Jul-2023",
        "rule_id": "SV-216321r646926_rule",
        "version": "SOL-11.1-040010"
      },
      "validated": [
        {
          "Files": [
            {
              "Name": "/etc/default/passwd",
              "Option": "MAXDAYS",
              "Expected": "56",
              "Current": "Missing"
            }
          ]
        },
        {
          "Users": [
            {
              "Username": "root",
              "Expected": "56",
              "Current": "Missing"
            },
            {
              "Username": "user1",
              "Expected": "56",
              "Current": "Missing"
            },
            {
              "Username": "user2",
              "Expected": "56",
              "Current": "Missing"
            }
          ]
        }
      ],
      "summary": {
        "inspected": "4",
        "errors": "1",
        "warnings": "0",
        "failed_rate": "25.00%"
      },
      "metrics": {
        "start": "1721456331",
        "end": "1721456344",
        "time": "13 Sec."
      }
    }
  ]
}
Thanks for this great example. I keep going back to the documentation and have gotten to a point where I am a bit lost as to what is happening.
I supply the following to function y(), which conditionally passes $1 to function z() in order to handle an associative array data type. The arg string looks like key@Elem:1,Item:2,Key:3,Value:4+key@Elem:one,Item:two,Key:three,Value:four
The function currently looks like the following:
################################################
# @description Generates object of inspected data; errors, warnings etc.
#
# @args $@ Array Data used to create an array of objects
#
# @example
# $ reporting._gen_stig_objects_associative key@Elem:1,Item:2,Key:3,Value:4+key@Elem:one,Item:two,Key:three,Value:four
# {"key":[{"Elem":"1","Item":"2","Key":"3","Value":"4"},{"Elem":"one","Item":"two","Key":"three","Value":"four"}]}
#
# @stdout string
################################################
reporting._gen_stig_objects_associative()
{
  local key obj
  local -a keys objs tmp_objs results
  objs=( ${@//+/ } )
  keys=( $(echo "${objs[@]}" |
    tr ' ' '\n' | cut -d"@" -f1 |
    sort | uniq) )
  for key in ${keys[@]}; do
    tmp_objs=( $(echo "${objs[@]//=/^}" |
      tr ' ' '\n' | grep "^${key}@" |
      cut -d"@" -f2) )
    #>&2 echo "1: ${key} -> ${tmp_objs[@]}"
    obj="${tmp_objs[@]//:/=}"
    #>&2 echo "2: ${key} -> ${obj}"
    obj="${obj// /,}"
    #>&2 echo "3: ${key} -> ${obj}"
    [ "${obj}" != "" ] &&
      results+=( $(json "${key}":{}="${obj}") )
  done
  >&2 echo ${results[@]}
  (
    [ $(env._get_os_name) = "Solaris" ] &&
      json @results:raw[] ||
      json @results:json[]
  ) 2>/dev/null |
    sed "s|{\"results\":||g" |
    sed "s|]}$|]|g"
}
The output from stderr looks like this, which is what I want to see returned, and is correct:
reporting._gen_stig_objects_associative Kernel@Option:/dev/mapper/rhel-swap+Kernel@Option:rd.lvm.lv=rhel/root+Kernel@Option:rd.lvm.lv=rhel/swap+Kernel@Option:rhgb+Kernel@Option:quiet+Kernel@Option:+Kernel@Option:fips=1,State:0+Mode@FIPS:completed.
{"Kernel":{"Option":"/dev/mapper/rhel-swap","Option":"rd.lvm.lv^rhel/root","Option":"rd.lvm.lv^rhel/swap","Option":"rhgb","Option":"quiet","Option":"","Option":"fips^1","State":"0"}}
The problem is that once it calls json @results:json[] the results are getting stripped and look like this:
...
"errors": [
  {
    "Kernel": {
      "Option": "fips 1",
      "State": "missing"
    }
  },
  {
    "Packages": {
      "Name": "grub2",
      "State": "missing"
    }
  }
],
"inspected": [
  {
    "Kernel": {
      "Option": "fips 1",
      "State": "0"
    }
  },
  {
    "Mode": {
      "FIPS": "disabled."
    }
  }
],
...
Not sure what is going on... any debugging tips?
> Not sure what is going on... any debugging tips?
I find a good approach is to use print statement debugging with liberal use of declare -p var1 var2 .... Start at the beginning of a misbehaving function, insert some declare -p to check the initial state, and add an intentional error to stop execution (e.g. return 99).
Then run the function interactively in a shell with different inputs to understand what it's doing. Then iteratively advance the error point and declare -p until you find a point that's not behaving as you expect.
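On a toy function, that looks something like this (the names here are invented for the demo):

demo_split()
{
  local -a objs
  objs=( ${@//+/ } )   # split "a+b" style input on '+'
  declare -p objs      # checkpoint: show exactly what was parsed
  return 99            # intentional stop; move this down as each step checks out
}

$ demo_split key@Elem:1+key@Elem:one; echo "status=$?"
declare -a objs=([0]="key@Elem:1" [1]="key@Elem:one")
status=99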
Another approach is to use bats to write tests for functions, and iterate between the implementation and the tests. This is how I developed json.bash itself (see json.bats — no way I could have built it without writing tests as I wrote each fn).
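For example, a bats test for the func_one from earlier in this thread could assert the exact output we saw (a sketch; the reporting.sh path is hypothetical):

# test/reporting.bats
setup() {
  source ./reporting.sh   # hypothetical file defining func_one
}

@test "func_one encodes attr strings as an errors array" {
  run func_one foo:a,bar:b
  [ "$status" -eq 0 ]
  [ "$output" = '{"errors":[{"foo":"a","bar":"b"}]}' ]
}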
On your problem here — I can show you how to parse these + / @ delimited strings, but I feel like you'd be able to make your life easier if you could encode as JSON at the source. Presumably you have some code that's already got the split-up data and joins it with + / @ to make these strings. At that point, couldn't you encode it as JSON instead?
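For instance, where the scanner currently builds a string like Packages@Name:grub2,State:missing, it could append ready-made JSON at that point instead (a sketch using the same key:{}= syntax as the report example above; scan_packages is a made-up name):

scan_packages()
{
  local -a errors
  # instead of errors+=("Packages@Name:grub2,State:missing"):
  out=errors json Packages:{}="Name=grub2,State=missing"
  out=errors json Kernel:{}="Option=fips,State=missing"
  # emitting the array later needs no string parsing at all
  json @errors:json[]
}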
Anyway, here's what I came up with to parse these delimited strings and re-encode as JSON. It's rather gnarly; I can't say I'd recommend it if you can avoid parsing these +@ strings. It's going to break if you have @ / + in your values by mistake.
$ group_named_objects key@Elem:1,Item:2,Key:3,Value:4+key@Elem:one,Item:two,Key:three,Value:four Files@Name:/etc/default/passwd,Option:MAXDAYS,Expected:56,Current:Missing Users@Username:root,Expected:56,Current:Missing+Files@Name:/etc/foo/bar,Option:MAXDAYS,Expected:12,Current:Missing+Users@Username:user1,Expected:56,Current:Missing Users@Username:user2,Expected:56,Current:Missing | jq
{
  "key": [
    {
      "Elem": "1",
      "Item": "2",
      "Key": "3",
      "Value": "4"
    },
    {
      "Elem": "one",
      "Item": "two",
      "Key": "three",
      "Value": "four"
    }
  ],
  "Files": [
    {
      "Name": "/etc/default/passwd",
      "Option": "MAXDAYS",
      "Expected": "56",
      "Current": "Missing"
    },
    {
      "Name": "/etc/foo/bar",
      "Option": "MAXDAYS",
      "Expected": "12",
      "Current": "Missing"
    }
  ],
  "Users": [
    {
      "Username": "root",
      "Expected": "56",
      "Current": "Missing"
    },
    {
      "Username": "user1",
      "Expected": "56",
      "Current": "Missing"
    },
    {
      "Username": "user2",
      "Expected": "56",
      "Current": "Missing"
    }
  ]
}
The group_named_objects function:
function group_named_objects() {
  local IFS
  IFS=+; local groups=($@)                      # [a@b+c@d e@f+g@h] -> [a@b c@d e@f g@h]
  local group_names=("${groups[@]/@*/}")        # [a@b c@d e@f g@h] -> [a c e g]
  IFS=@; local group_attr_pairs=(${groups[@]})  # [a@b c@d e@f g@h] -> [a b c d ...]

  # Map group names (a, c, e, g) to local array var names, like group_array_1, group_array_2 ...
  local -A group_arrays=()
  let i=0
  for group in "${group_names[@]}"; do
    # Skip if we've already defined a var for a group name
    if [[ "${group_arrays[${group?}]:-}" ]]; then continue; fi
    # Declare a local array variable for each unique group name
    group_arrays["${group?}"]="group_array_$((i++))"  # generate the name
    local -a "${group_arrays["${group?}"]?}"          # declare the array using the name
  done

  # Encode each object as JSON and add it to the array of its group name
  for ((i=0; i < ${#group_attr_pairs[@]}; i+=2)); do
    local group_name=${group_attr_pairs[((i))]} \
      group_attrs=${group_attr_pairs[((i+1))]}
    # nameref group_array to point to the group's array var
    local -n group_array="${group_arrays["${group_name?}"]}"
    # Transform the foo:1,bar:2 syntax into foo=1,bar=2 attributes that
    # json.bash can parse.
    local object_attrs=${group_attrs//:/=}  # replace : with =
    out=group_array json ...:{}@object_attrs
  done

  # Collect groups and encoded JSON arrays
  local group_entries=()
  for group_name in "${group_names[@]}"; do
    # nameref group_array to point to the group's array var
    local -n group_array="${group_arrays["${group_name?}"]}"
    out=group_entries json @group_name:json[]@group_array
  done
  json ...:json{:json}@group_entries
}
BTW, sorry I didn't build on your example for this; it was easier for me to use bash-level arrays/operations rather than command substitution ($())!
Well, after trying a ton of different methods, your example is what I am using. I really appreciate your help.
I'm glad you got it working! Feel free to get in touch if you have another problem.
I think I may have found a bug. If you look at the errors, inspected & warnings elements, you will see a combination of a string and the expected JSON object in the arrays. See the results here. This happens with both the 4.4.20 & 5.1.12 versions of bash, on both RHEL and Solaris.
The calling code example (uses the below functions):
The code looks like the following: