Closed gvp-devops closed 1 year ago
Presuming you have multiple objects per file. Otherwise you could just smash everything together:
jq -n 'reduce inputs as $i ({}; . += $i)' File[123]
To cross-reference multiple objects across all your files, you might use an index object where the index is "renamed" (copied and deleted) during the second migration. This disadvantage here is memory footprint.
jq --null-input '
reduce inputs as $i
( {}
; if $i.ID then # form 1
.[$i.ID] = $i
elif $i.id and $i.project_id then # form 2
.[$i.project_id] = .[$i.id] + ($i | {project_id})
| del(.[$i.id])
elif $i.tenant then # form 3
.[$i.project_id] += ($i | {tenant})
else . end
)
| .[] # break down index object into stream of desired objects
' File[123]
@vintnes Thank you for this and so much appreciated. I think I can see where you're going with these. In testing the first to just merge it all together I receive the following error (which has been common through out all jq
runs). One thing I did not think to ask is the format of each JSON file. Each one is formatted as an array with brackets at the beginning and end of each file.
$ jq -n 'reduce inputs as $i ({}; . += $i)' ctc_osp_vmmaster_lc.json (file 1) ctc_osp_projectIDs.json (file 2) ctc_osp_nameid.json (file 3)
jq: error (at ctc_osp_vmmaster_lc.json:7202): object ({}) and array ([{"id":"9a1...) cannot be added
Test run after second migration
$ jq --null-input 'reduce inputs as $i ({}; if $i.id then .[$i.id] = $i elif $i.id and $i.project_id then .[$i.project_id] = .[$i.id] + ($i | {project_id}) | del(.[$i.id]) elif $i.tenant then .[$i.project_id]
+= ($i | {tenant}) else . end) | .[]' master1.json (temp file) ctc_osp_nameid.json (file 3)
jq: error (at master1.json:886): Cannot index array with string "id"
Result of cross-referencing using second jq
code sample almost produced result. Rather it's merging two files together but not merging the tenant key with the actual identity record, as $ jq -s 'flatten | group_by(.project_id) | map(reduce .[] as $x ({}; . * $x))' File_temp File3
has done.
I believe what really need to know are the map options to not remove duplicate key values in order to maintain the inventory integrity.
(...)
{
"id": "ffb801a4-93d9-4df7-82e6-5341f995e113",
"name": "echoserver-workflow-test-b-4",
"status": "active",
"task state": null,
"power state": "running",
"networks": "green-dynamic=2001:558:1047:9d8:f816:3eff:fe0f:515b, xx.xxx.xxx.xxx",
"image name": "centos-7-mvp_echoserver_v1.0.7_2022-04-21 03:08:35",
"image id": "bb27e36a-3461-450f-af83-2bbe4531930d",
"flavor name": "m1.small",
"flavor id": "f874ca8b-c4fa-4dde-9fb6-3b22b3a2b0b2",
"availability zone": "nova",
"host": "ospcomputem-ctc-a311d",
"properties": "group='echoservice-workflow-test'",
"project_id": "21d74337771a43e2b0ac5a91db828898"
}
{
"tenant": "wsv-gcs-qa"
}
(...)
inputs[]
@vintnes Spent some time to work on this and unable for jq to merge three JSON file arrays without having to use unique feature and remove all duplicate project_id
records. As a result removes several inventory records. Each iteration of the suggested commands is peculiar when working with them as they produce two different results.
$ jq -n 'reduce inputs[] as $i ({}; . += $i)' ctc_osp_vmmaster_lc.json (file 1) ctc_osp_projectIDs.json (file 2) ctc_osp_nameid.json (file 3)
{
"id": "74a7cb34-1510-4cb6-aaa5-3e97ef13d89e",
"name": "bgf01_vnf_stack-0",
"status": "active",
"task state": null,
"power state": "running",
"networks": "bgf01_vnf_stack_internal=xxx.xxx.0.10,
"image name": "vbgf01_28",
"image id": "54c9fc86-3331-4e45-972b-67fe3e0fba3c",
"flavor name": "d1.vbgf01_flavor",
"flavor id": "1aaed288-7867-4880-b4b2-7c0f8038ad86",
"availability zone": "nova",
"host": "ospcomputed2-ctc-a610d",
"properties": "ha-policy='ha-offline'",
"project_id": "116408287003490c87817885841753c3",
"tenant": "ims-vbgf-dev"
}
jq
run. Although it did "merge" the latter two files it did not merge as a unique record index.
$ jq --null-input 'reduce inputs[] as $i( {}; if $i.id then .[$i.id] = $i elif $i.id and $i.project_id then .[$i.project_id] = .[$i.id] + ($i | {project_
id}) elif $i.tenant then .[$i.project_id] += ($i | {tenant}) else . end) | .[]' ctc_osp_vmmaster_lc.json ctc_osp_projectIDs.json ctc_osp_nameid.json
(...)
{
"id": "9a120e32-b001-432a-a1ae-59a297969126",
"project_id": "ed1a52f2e5664c4f90806214bcb98f58"
}
{
"id": "1eebdef1-8fd8-424b-a15f-f98499a27fff",
"project_id": "923d8bca286d470592d467ca8a130fba"
}
{
"id": "9a9ea330-4af9-4d38-8c5e-7c505957b962",
"project_id": "923d8bca286d470592d467ca8a130fba"
{
"tenant": "wsv-gcs-qa"
}
{
"tenant": "SDWAN-DSE-VIVA-CTC-C1"
}
{
"tenant": "SDWAN-DSE-VIVA-A1"
(...)
Wow.
Your initial description of your source data was inadequate. The first solution (the simple reduce) was, as I explicitly stated, only for a situation with one object per file.
Your demo of the second solution (the conditional index object) is not the code I proposed. It looks like you tried to rewrite my solution by hand, but you lost some case sensitivity (.ID
vs .id
) and the del()
statement.
Wow.
Your initial description of your source data was inadequate. The first solution (the simple reduce) was, as I explicitly stated, only for a situation with one object per file.
Your demo of the second solution (the conditional index object) is not the code I proposed. It looks like you tried to rewrite my solution by hand, but you lost some case sensitivity (
.ID
vs.id
) and thedel()
statement.
You're correct and apologize for the rewrite. I did use the code as originally provided however, there was one file I used where the keys were all lowercase thinking that may have been an issue based on one "comment" I was provided elsewhere.
Have been perplexed to getting my head around merging three JSON files by two different keys in order to provide one complete JSON file without removing duplicate keys. My thought process is this needs to be done by two different
jq
commands.However when I run the same command but change
id
toproject_id
, that almost works but what happens is when there are multiple tenants owners that own multiple ID instances, the final result remove allID
andproject_id
key duplicates and only provides one result per tenant owner. I know it has to do with themap(reduce...
filter. The end result is not to remove any duplicate values since a single tenant can own multiple instance IDs.Have scrubbed multiple JQ docs and StackOverflow but no reference to merge three JSON files, and what options are preferable to use
map
filter.NOTE: In reference to JSON format, this is the JSON output format when using Openstack CLI
Instance File 1
Project ID File 2
Tenant Owner File 3
Desired final merged JSON