Closed gleesonpj closed 4 weeks ago
I just installed the latest fluent-bit; issue remains.
fluent-bit inputs read the provided material line by line.
When the file GitHub-CMS-in-network-ffs-sample.json that you linked is used as an
input, each line is read as it is. Since the file contains "formatted"
(pretty-printed) JSON, fluent-bit does not treat an individual line as JSON content. For example, the line
"reporting_entity_name": "medicare",
is not a valid JSON structure on its own.
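To illustrate the point outside of fluent-bit, here is a minimal Python sketch that tries to parse each line of a pretty-printed document on its own, the way a line-oriented reader would. The two-key sample and the helper name `parse_per_line` are made up for the illustration; this is not how fluent-bit is implemented, only a model of the line-by-line behaviour described above.

```python
import json

# Hypothetical, shortened stand-in for the pretty-printed input file.
pretty = '''{
  "reporting_entity_name": "medicare",
  "version": "1.0.0"
}'''


def parse_per_line(text):
    """Return, for every line, whether that line alone is valid JSON."""
    results = []
    for line in text.splitlines():
        try:
            json.loads(line)
            results.append(True)
        except json.JSONDecodeError:
            results.append(False)
    return results


# Every line of the pretty-printed form fails when parsed in isolation.
pretty_results = parse_per_line(pretty)

# The same document serialized on a single line parses fine.
flat = json.dumps(json.loads(pretty))
flat_results = parse_per_line(flat)

print(pretty_results)
print(flat_results)
```

None of the four pretty-printed lines parses by itself, while the single flattened line does.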
You have two choices to deal with that kind of content.
The first one is to flatten your content so that there is one structure per line. For
instance, the structure in the file GitHub-CMS-in-network-ffs-sample.json,
once flattened, looks like this:
input-file:
{ "reporting_entity_name": "medicare", "reporting_entity_type": "medicare", "reporting_plans": [{ "plan_name": "medicaid", "plan_id_type": "hios", "plan_id": "11111111111", "plan_market_type": "individual" },{ "plan_name": "medicare", "plan_id_type": "hios", "plan_id": "0000000000", "plan_market_type": "individual" }], "last_updated_on": "2020-08-27", "version": "1.0.0", "in_network": [{ "negotiation_arrangement": "ffs", "name": "Knee Replacement", "billing_code_type": "CPT", "billing_code_type_version": "2020", "billing_code": "27447", "description": "Arthroplasty, knee condyle and plateau, medial and lateral compartments", "negotiated_rates": [{ "provider_groups": [{ "npi":[1111111111, 2222222222, 3333333333, 4444444444, 5555555555], "tin":{ "type": "ein", "value": "11-1111111" } },{ "npi": [1111111111, 2222222222, 3333333333, 4444444444, 5555555555], "tin":{ "type": "ein", "value": "22-2222222" } }], "negotiated_prices": [{ "negotiated_type": "negotiated", "negotiated_rate": 123.45, "expiration_date": "2022-01-01", "service_code": ["18", "19", "11"], "billing_class": "professional" },{ "negotiated_type": "negotiated", "negotiated_rate": 1230.45, "expiration_date": "2022-01-01", "billing_class": "institutional" }] },{ "provider_groups": [{ "npi": [6666666666, 7777777777, 8888888888, 9999999999], "tin":{ "type": "ein", "value": "22-2222222" } }], "negotiated_prices": [{ "negotiated_type": "negotiated", "negotiated_rate": 120.45, "expiration_date": "2022-01-01", "service_code": ["05", "06", "07"], "billing_class": "professional" }] }] },{ "negotiation_arrangement": "ffs", "name": "Femur and Knee Joint Repair", "billing_code_type": "CPT", "billing_code_type_version": "2020", "billing_code": "27448", "description": "Under Repair, Revision, and/or Reconstruction Procedures on the Femur (Thigh Region) and Knee Joint", "negotiated_rates": [{ "provider_groups": [{ "npi": [1111111111, 2222222222, 3333333333, 4444444444, 5555555555], "tin":{ "type": "ein", "value": "11-1111111" } },{ "npi": [1111111111, 2222222222, 3333333333, 4444444444, 5555555555], "tin":{ "type": "ein", "value": "22-2222222" } }], "negotiated_prices": [{ "negotiated_type": "negotiated", "negotiated_rate": 12003.45, "expiration_date": "2022-01-01", "service_code": ["18", "19", "11"], "billing_class": "professional" }] },{ "provider_groups": [{ "npi": [6666666666], "tin":{ "type": "npi", "value": "6666666666" } }], "negotiated_prices": [{ "negotiated_type": "negotiated", "negotiated_rate": 12.45, "expiration_date": "2022-01-01", "service_code": ["18", "19", "11"], "billing_class": "institutional" }] }] }] }
fluent-bit.conf:
[SERVICE]
    flush 1
    log_level info
    parsers_file ./parsers.conf
[INPUT]
    name tail
    tag tail
    read_from_head true
    refresh_interval 1
    skip_long_lines off
    path GitHub-CMS-in-network-ffs-sample.FLATTENED.json
    parser json
[OUTPUT]
    name stdout
    match *
    format json_lines
    json_date_key @timestamp
    json_date_format iso8601
Result:
Fluent Bit v3.1.3
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
______ _ _ ______ _ _ _____ __
| ___| | | | | ___ (_) | |____ |/ |
| |_ | |_ _ ___ _ __ | |_ | |_/ /_| |_ __ __ / /`| |
| _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / \ \ | |
| | | | |_| | __/ | | | |_ | |_/ / | |_ \ V /.___/ /_| |_
\_| |_|\__,_|\___|_| |_|\__| \____/|_|\__| \_/ \____(_)___/
[2024/07/18 22:47:13] [ info] [fluent bit] version=3.1.3, commit=, pid=208899
[2024/07/18 22:47:13] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/07/18 22:47:13] [ info] [cmetrics] version=0.9.1
[2024/07/18 22:47:13] [ info] [ctraces ] version=0.5.2
[2024/07/18 22:47:13] [ info] [input:tail:tail.0] initializing
[2024/07/18 22:47:13] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2024/07/18 22:47:13] [ info] [sp] stream processor started
[2024/07/18 22:47:13] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13265153 watch_fd=1 name=GitHub-CMS-in-network-ffs-sample.json
[2024/07/18 22:47:13] [ info] [output:stdout:stdout.0] worker #0 started
[2024/07/18 22:47:30] [ info] [input:tail:tail.0] inode=13265153 handle rotation(): GitHub-CMS-in-network-ffs-sample.json => /home/pmalamy/tmp/fluent/GitHub-CMS-in-network-ffs-sample.json~
[2024/07/18 22:47:30] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=13265153 watch_fd=1
[2024/07/18 22:47:30] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13265153 watch_fd=2 name=/home/pmalamy/tmp/fluent/GitHub-CMS-in-network-ffs-sample.json~
[2024/07/18 22:47:30] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13265148 watch_fd=3 name=GitHub-CMS-in-network-ffs-sample.json
[2024/07/18 22:47:30] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=13265153 watch_fd=2
{"@timestamp":"2024-07-18T20:47:30.723476Z","reporting_entity_name":"medicare","reporting_entity_type":"medicare","reporting_plans":[{"plan_name":"medicaid","plan_id_type":"hios","plan_id":"11111111111","plan_market_type":"individual"},{"plan_name":"medicare","plan_id_type":"hios","plan_id":"0000000000","plan_market_type":"individual"}],"last_updated_on":"2020-08-27","version":"1.0.0","in_network":[{"negotiation_arrangement":"ffs","name":"Knee Replacement","billing_code_type":"CPT","billing_code_type_version":"2020","billing_code":"27447","description":"Arthroplasty, knee condyle and plateau, medial and lateral compartments","negotiated_rates":[{"provider_groups":[{"npi":[1111111111,2222222222,3333333333,4444444444,5555555555],"tin":{"type":"ein","value":"11-1111111"}},{"npi":[1111111111,2222222222,3333333333,4444444444,5555555555],"tin":{"type":"ein","value":"22-2222222"}}],"negotiated_prices":[{"negotiated_type":"negotiated","negotiated_rate":123.45,"expiration_date":"2022-01-01","service_code":["18","19","11"],"billing_class":"professional"},{"negotiated_type":"negotiated","negotiated_rate":1230.45,"expiration_date":"2022-01-01","billing_class":"institutional"}]},{"provider_groups":[{"npi":[6666666666,7777777777,8888888888,9999999999],"tin":{"type":"ein","value":"22-2222222"}}],"negotiated_prices":[{"negotiated_type":"negotiated","negotiated_rate":120.45,"expiration_date":"2022-01-01","service_code":["05","06","07"],"billing_class":"professional"}]}]},{"negotiation_arrangement":"ffs","name":"Femur and Knee Joint Repair","billing_code_type":"CPT","billing_code_type_version":"2020","billing_code":"27448","description":"Under Repair, Revision, and/or Reconstruction Procedures on the Femur (Thigh Region) and Knee Joint","negotiated_rates":[{"provider_groups":[{"npi":[1111111111,2222222222,3333333333,4444444444,5555555555],"tin":{"type":"ein","value":"11-1111111"}},{"npi":[1111111111,2222222222,3333333333,4444444444,5555555555],"tin":{"type":"ein","value":"22-2222222"}}],"negotiated_prices":[{"negotiated_type":"negotiated","negotiated_rate":12003.45,"expiration_date":"2022-01-01","service_code":["18","19","11"],"billing_class":"professional"}]},{"provider_groups":[{"npi":[6666666666],"tin":{"type":"npi","value":"6666666666"}}],"negotiated_prices":[{"negotiated_type":"negotiated","negotiated_rate":12.45,"expiration_date":"2022-01-01","service_code":["18","19","11"],"billing_class":"institutional"}]}]}]}
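A flattened input file like the one used in this first solution can be produced in several ways. The following is a minimal Python sketch, assuming the source file holds exactly one JSON document and fits in memory. The function name `flatten_json_file` and the file names in the demo are hypothetical and only serve the example.

```python
import json
import os
import tempfile


def flatten_json_file(src_path, dst_path):
    """Rewrite a pretty-printed JSON document as a single line."""
    with open(src_path, encoding="utf-8") as src:
        document = json.load(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        # Compact separators keep the whole structure on one line.
        dst.write(json.dumps(document, separators=(",", ":")))
        dst.write("\n")


# Small demo on a throwaway file (hypothetical content and names).
with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "sample.json")
    dst = os.path.join(workdir, "sample.FLATTENED.json")
    with open(src, "w", encoding="utf-8") as handle:
        handle.write('{\n  "reporting_entity_name": "medicare",\n  "version": "1.0.0"\n}\n')
    flatten_json_file(src, dst)
    with open(dst, encoding="utf-8") as handle:
        flattened_lines = handle.read().splitlines()

print(flattened_lines)
```

Since the whole document is loaded before being written back, very large files may need a streaming approach instead.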
The second solution is to use a multiline parser that can parse a pretty-printed JSON file.
parsers.conf:
[MULTILINE_PARSER]
    name formatted-json
    type regex
    rule "start_state" "/^{/" "message1"
    rule "message1,message2" "/^\s+/" "message2"
    rule "message2" "/^}/" "message1"
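To make the intent of these three rules easier to follow, here is a rough Python model of the state machine they describe: a line starting with `{` opens a record, indented lines continue it, and a line starting with `}` closes it. This is a simplified sketch and not fluent-bit's implementation; the function names, the flush-and-retry behaviour on a non-matching line, and the sample input are assumptions made for the illustration.

```python
import json
import re

# (states the rule applies to, pattern, next state), mirroring the rules above.
# The braces are escaped here; the patterns are otherwise the same.
RULES = [
    ({"start_state"}, re.compile(r"^\{"), "message1"),
    ({"message1", "message2"}, re.compile(r"^\s+"), "message2"),
    ({"message2"}, re.compile(r"^\}"), "message1"),
]


def next_state(state, line):
    """Return the state reached from `state` on `line`, or None if no rule matches."""
    for states, pattern, target in RULES:
        if state in states and pattern.search(line):
            return target
    return None


def group_lines(lines):
    """Group physical lines into multiline records using RULES."""
    records, buffer, state = [], [], "start_state"
    for line in lines:
        target = next_state(state, line)
        if target is None and state != "start_state":
            # No continuation rule matched: close the current record
            # and try the line again as the start of a new one.
            records.append("\n".join(buffer))
            buffer, state = [], "start_state"
            target = next_state(state, line)
        if target is None:
            # The line belongs to no multiline record; keep it on its own.
            records.append(line)
            continue
        buffer.append(line)
        state = target
    if buffer:
        records.append("\n".join(buffer))
    return records


sample = """{
  "a": 1,
  "b": {
    "c": 2
  }
}
{
  "d": 3
}"""

records = group_lines(sample.splitlines())
parsed = [json.loads(record) for record in records]
print(parsed)
```

In this model the nine physical lines are grouped into two records, and each record is a complete JSON document that a JSON parser can then handle.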
fluent-bit.conf:
[SERVICE]
    flush 1
    log_level info
    parsers_file ./parsers.conf
[INPUT]
    name tail
    tag tail
    read_from_head true
    refresh_interval 1
    skip_long_lines off
    path GitHub-CMS-in-network-ffs-sample.json
    key log
[FILTER]
    name multiline
    match *
    buffer on
    multiline.key_content log
    multiline.parser formatted-json
[FILTER]
    name parser
    match *
    parser json
    key_name log
[OUTPUT]
    name stdout
    match *
    format json_lines
    json_date_key @timestamp
    json_date_format iso8601
Result:
Fluent Bit v3.1.3
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
______ _ _ ______ _ _ _____ __
| ___| | | | | ___ (_) | |____ |/ |
| |_ | |_ _ ___ _ __ | |_ | |_/ /_| |_ __ __ / /`| |
| _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / \ \ | |
| | | | |_| | __/ | | | |_ | |_/ / | |_ \ V /.___/ /_| |_
\_| |_|\__,_|\___|_| |_|\__| \____/|_|\__| \_/ \____(_)___/
[2024/07/18 22:52:36] [ info] [fluent bit] version=3.1.3, commit=, pid=209278
[2024/07/18 22:52:36] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/07/18 22:52:36] [ info] [cmetrics] version=0.9.1
[2024/07/18 22:52:36] [ info] [ctraces ] version=0.5.2
[2024/07/18 22:52:36] [ info] [input:tail:tail.0] initializing
[2024/07/18 22:52:36] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2024/07/18 22:52:36] [ info] [filter:multiline:multiline.0] created emitter: emitter_for_multiline.0
[2024/07/18 22:52:36] [ info] [input:emitter:emitter_for_multiline.0] initializing
[2024/07/18 22:52:36] [ info] [input:emitter:emitter_for_multiline.0] storage_strategy='memory' (memory only)
[2024/07/18 22:52:36] [ info] [sp] stream processor started
[2024/07/18 22:52:36] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13265153 watch_fd=1 name=GitHub-CMS-in-network-ffs-sample.json
[2024/07/18 22:52:36] [ info] [output:stdout:stdout.0] worker #0 started
[2024/07/18 22:52:38] [ info] [input:tail:tail.0] inode=13265153 handle rotation(): GitHub-CMS-in-network-ffs-sample.json => /home/pmalamy/tmp/fluent/GitHub-CMS-in-network-ffs-sample.json~
[2024/07/18 22:52:38] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=13265153 watch_fd=1
[2024/07/18 22:52:38] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13265153 watch_fd=2 name=/home/pmalamy/tmp/fluent/GitHub-CMS-in-network-ffs-sample.json~
[2024/07/18 22:52:38] [ info] [input:tail:tail.0] inotify_fs_add(): inode=13258289 watch_fd=3 name=GitHub-CMS-in-network-ffs-sample.json
[2024/07/18 22:52:38] [ info] [filter:multiline:multiline.0] created new multiline stream for tail.0_tail
[2024/07/18 22:52:38] [ info] [input:tail:tail.0] inotify_fs_remove(): inode=13265153 watch_fd=2
{"@timestamp":"2024-07-18T20:52:38.677660Z","reporting_entity_name":"medicare","reporting_entity_type":"medicare","reporting_plans":[{"plan_name":"medicaid","plan_id_type":"hios","plan_id":"11111111111","plan_market_type":"individual"},{"plan_name":"medicare","plan_id_type":"hios","plan_id":"0000000000","plan_market_type":"individual"}],"last_updated_on":"2020-08-27","version":"1.0.0","in_network":[{"negotiation_arrangement":"ffs","name":"Knee Replacement","billing_code_type":"CPT","billing_code_type_version":"2020","billing_code":"27447","description":"Arthroplasty, knee condyle and plateau, medial and lateral compartments","negotiated_rates":[{"provider_groups":[{"npi":[1111111111,2222222222,3333333333,4444444444,5555555555],"tin":{"type":"ein","value":"11-1111111"}},{"npi":[1111111111,2222222222,3333333333,4444444444,5555555555],"tin":{"type":"ein","value":"22-2222222"}}],"negotiated_prices":[{"negotiated_type":"negotiated","negotiated_rate":123.45,"expiration_date":"2022-01-01","service_code":["18","19","11"],"billing_class":"professional"},{"negotiated_type":"negotiated","negotiated_rate":1230.45,"expiration_date":"2022-01-01","billing_class":"institutional"}]},{"provider_groups":[{"npi":[6666666666,7777777777,8888888888,9999999999],"tin":{"type":"ein","value":"22-2222222"}}],"negotiated_prices":[{"negotiated_type":"negotiated","negotiated_rate":120.45,"expiration_date":"2022-01-01","service_code":["05","06","07"],"billing_class":"professional"}]}]},{"negotiation_arrangement":"ffs","name":"Femur and Knee Joint Repair","billing_code_type":"CPT","billing_code_type_version":"2020","billing_code":"27448","description":"Under Repair, Revision, and/or Reconstruction Procedures on the Femur (Thigh Region) and Knee Joint","negotiated_rates":[{"provider_groups":[{"npi":[1111111111,2222222222,3333333333,4444444444,5555555555],"tin":{"type":"ein","value":"11-1111111"}},{"npi":[1111111111,2222222222,3333333333,4444444444,5555555555],"tin":{"type":"ein","value":"22-2222222"}}],"negotiated_prices":[{"negotiated_type":"negotiated","negotiated_rate":12003.45,"expiration_date":"2022-01-01","service_code":["18","19","11"],"billing_class":"professional"}]},{"provider_groups":[{"npi":[6666666666],"tin":{"type":"npi","value":"6666666666"}}],"negotiated_prices":[{"negotiated_type":"negotiated","negotiated_rate":12.45,"expiration_date":"2022-01-01","service_code":["18","19","11"],"billing_class":"institutional"}]}]}]}
In my opinion, flattening the JSON structures is the way to go. Multiline parsing could hurt performance, and the regular expression rules can be tedious to build if they have to cover all the cases.
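As an example of why flattening before ingestion tends to be more robust than line-based regex rules, here is a minimal Python sketch that splits a text containing several concatenated JSON documents with a real JSON decoder and writes each as one line. It relies only on the standard library's `json.JSONDecoder.raw_decode`; the helper names and the sample text (which includes an unindented nested value and a top-level array, two layouts that indentation-based rules would not handle) are assumptions for the illustration.

```python
import json


def iter_json_documents(text):
    """Yield each JSON document found back to back in `text`."""
    decoder = json.JSONDecoder()
    pos, end = 0, len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        document, pos = decoder.raw_decode(text, pos)
        yield document


def to_json_lines(text):
    """Serialize every document in `text` as one compact line."""
    return [
        json.dumps(document, separators=(",", ":"))
        for document in iter_json_documents(text)
    ]


# Hypothetical input: nested value without indentation, then a top-level array.
sample = """{
"a": {
"b": 1
}
}
[
  1,
  2
]
"""

json_lines = to_json_lines(sample)
print(json_lines)
```

Because the decoder follows the JSON grammar rather than the layout, the result does not depend on how the source was indented.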
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.
This issue was closed because it has been stalled for 5 days with no activity.
Bug Report
Describe the bug
JSON input via Tail appears to be processed as unstructured text instead of as JSON keys and values. The output begins with "Log" and contains each JSON line as an unstructured value. Using the Expect filter confirms that Fluent Bit does not see the JSON keys in the stream.
To Reproduce
Conf file:
Parsers.conf extract
Input Extract:
GitHub-CMS-in-network-ffs-sample.json
Example log message:
Extract of Output
CPT1.txt
Expected behavior
We expect Fluent Bit to filter out all keys before "in_network" and display only the values and nested values of that key. If an Expect filter is used after record_modify, it should not detect "reporting_entity_name" or any other key before "in_network", but it should detect "in_network".
Instead:
- All keys/values are listed, starting with "reporting_entity_name"
- Output is in Tail's "Log" format for unstructured messages
- Adding Expect to the conf does not detect any keys, including "in_network"
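For illustration only, the difference between the expected and the observed behaviour can be modelled in a few lines of Python. This is not Fluent Bit code: `keep_only` is a hypothetical helper standing in for a key allowlist, and both sample records are shortened, made-up versions of the data.

```python
def keep_only(record, allowed_keys):
    """Return a copy of `record` containing only the allowed top-level keys."""
    return {key: value for key, value in record.items() if key in allowed_keys}


# What we expected: the document arrives as one structured record.
structured = {
    "reporting_entity_name": "medicare",
    "version": "1.0.0",
    "in_network": [{"billing_code": "27447"}],
}
expected = keep_only(structured, {"in_network"})

# What we observe: each physical line arrives as text under a single key,
# so there is no "in_network" key for any filter to find.
unstructured = {"log": '  "in_network": ['}
observed = keep_only(unstructured, {"in_network"})

print(expected)
print(observed)
```

In the structured case only "in_network" survives; in the unstructured case nothing matches, which is consistent with Expect not detecting any key.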
Rough Expected Output (Not fully formatted, but a lot shorter)
Your Environment
Additional context
Note: We tested the install using the "mem.local" example from the manual, and it produced properly nested winstat keys: "mem.local: [1718506661.988717100, {"CPUstats":{"user":2968750,"idle":116250000,"kernel":1093750,"utilization":3.376623392105103}}]" This suggests an issue with how Tail is parsing the input in our setup.