kellyjonbrazil / jc

CLI tool and python library that converts the output of popular command-line tools, file-types, and common strings to JSON, YAML, or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts.
MIT License

Multiline Input #506

Closed muescha closed 7 months ago

muescha commented 8 months ago

For some use cases, I see that only one line is accepted as input, for example with --url.

It would be nice if it were possible to add an option, for example --multiline-input, to parse the input line by line and join the output into one JSON document.

example - we have this data:

echo "/abc/def:/efg/ghi" | tr ":" "\n"
/abc/def
/efg/ghi

It would be nice to be able to use this as input, for example for --url.

current behaviour:

echo "/abc/def:/efg/ghi" | tr ":" "\n" | jc --url | jq '{path,path_list}'
{
  "path": "/abc/def/efg/ghi",
  "path_list": [
    "abc",
    "def",
    "efg",
    "ghi"
  ]
}

LC_ALL=C echo "Thu Dec 21 00:19:35 CET 2023\nThu Dec 21 00:20:35 CET 2023" \
| jc --date -p
jc:  Error - date parser could not parse the input data.
             If this is the correct parser, try setting the locale to C (LC_ALL=C).
             For details use the -d or -dd option. Use "jc -h --date" for help.

expected behaviour:

echo "/abc/def:/efg/ghi" | tr ":" "\n" | jc --url --multiline | jq '[.[] | {path, path_list}]'
[
    {
      "path": "/abc/def",
      "path_list": [
        "abc",
        "def"
      ]
    },
    {
      "path": "/efg/ghi",
      "path_list": [
        "efg",
        "ghi"
      ]
    }
]

LC_ALL=C echo "Thu Dec 21 00:19:35 CET 2023\nThu Dec 21 00:20:35 CET 2023" \
| jc --date --multiline |  jq '[.[] | {minute}]'
[
  {
    "minute": 19
  },
  {
    "minute": 20
  }
]

PS: the long version would be:

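paths=()
# Feed the loop via process substitution so it runs in the current shell;
# piping into `while` would run it in a subshell and lose the array.
while IFS= read -r line; do
    jc_command_output=$(echo "$line" | jc --url)
    paths+=("$jc_command_output")
done < <(echo "/abc/def:/efg/ghi" | tr ":" "\n")

echo "${paths[@]}" | jq -s . | jq '[.[] | {path, path_list}]'

output=()
# printf emits one date per line regardless of how the shell's echo treats "\n".
while IFS= read -r line; do
    jc_command_output=$(echo "$line" | jc --date)
    output+=("$jc_command_output")
done < <(LC_ALL=C printf '%s\n' "Thu Dec 21 00:19:35 CET 2023" "Thu Dec 21 00:20:35 CET 2023")

echo "${output[@]}" | jq -s . | jq '[.[] | {minute}]'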

Possible candidates:

(all parsers with the string tag, but not those with the file or generic tags)

muescha commented 8 months ago

maybe also with input json?

echo "/abc/def:/efg/ghi fgh/mn" | jq -R 'split(":")'
[
  "/abc/def",
  "/efg/ghi fgh/mn"
]

muescha commented 8 months ago

Forget the JSON input. I can easily convert it into multiline input with jq:

echo "/abc/def:/efg/ghi fgh/mn" | jq -R 'split(":")[]'
"/abc/def"
"/efg/ghi fgh/mn"

or without quotes:

echo "/abc/def:/efg/ghi fgh/mn" | jq -R 'split(":")[]' -r
/abc/def
/efg/ghi fgh/mn

kellyjonbrazil commented 8 months ago

I think this is a good idea. The easiest way to implement this in jc would actually be to create additional multi-line parsers for each of those (e.g. url-multi). These parsers would just iterate over the lines and call their parent parsers.

It's a little more difficult to create additional arguments to send to the parsers, unless they are ENV variables, because the parse() function is pretty static and only takes 3 arguments for standard parsers and 4 arguments for streaming parsers.
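The wrapper-parser idea can be sketched roughly as follows. This is only an illustration, not jc's actual code: `parse_url_line` is a hypothetical stand-in for a single-line parser (built on the standard library's `urllib.parse`), and `parse_multi` is a hypothetical wrapper that keeps the same three-argument shape (`data`, `raw`, `quiet`) described above.

```python
from typing import Any, Dict, List
from urllib.parse import urlsplit


def parse_url_line(data: str, raw: bool = False, quiet: bool = False) -> Dict[str, Any]:
    """Hypothetical stand-in for a single-line parser (not jc's url parser)."""
    parts = urlsplit(data.strip())
    path = parts.path or None
    return {
        "path": path,
        "path_list": [p for p in path.split("/") if p] if path else None,
    }


def parse_multi(data: str, raw: bool = False, quiet: bool = False) -> List[Dict[str, Any]]:
    """Hypothetical '-multi' wrapper: call the parent parser once per non-empty line."""
    return [
        parse_url_line(line, raw=raw, quiet=quiet)
        for line in data.splitlines()
        if line.strip()
    ]


result = parse_multi("/abc/def\n/efg/ghi\n")
print(result)
```

Because the wrapper takes no extra arguments, it would fit the existing static parse() signature without any changes to callers.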

muescha commented 8 months ago

And I suppose introducing a 4th/5th argument for a generic options dictionary would be overkill?

muescha commented 8 months ago

But I think this could be done in the CLI, like the slice feature (#341)?

kellyjonbrazil commented 8 months ago

I have thought about that but hadn't had much pressure for more arguments so I haven't spent much time on it. I think it might be interesting to have a kwargs type of argument you can pass to the parsers so they can have their own specific arguments.

It can be done - I just need to make sure it doesn't break anything from a backward compatibility standpoint or with how jc is used as a library (e.g. Ansible)
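One way a kwargs-style argument could stay backward compatible is sketched below. This is a hypothetical signature, not jc's API: the function body and the `strip_input` option name are invented purely for illustration. The point is that existing positional and keyword calls are unaffected when only `**kwargs` is appended.

```python
from typing import Any, Dict


def parse(data: str, raw: bool = False, quiet: bool = False, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical parser entry point; only the trailing **kwargs is new."""
    # A parser-specific option that defaults to the existing behaviour when absent.
    strip_input = kwargs.get("strip_input", True)  # hypothetical option name
    text = data.strip() if strip_input else data
    return {"value": text, "length": len(text)}


# Existing callers keep working unchanged.
old_style = parse("hello\n", False, True)
# New callers can opt in to parser-specific behaviour.
new_style = parse("hello\n", quiet=True, strip_input=False)
print(old_style, new_style)
```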

kellyjonbrazil commented 8 months ago

Ah yes, maybe since this is just iterating over a parser I could set up a jc argument that doesn't need to be passed to the parser and just iterates over it and puts the values into an array. I got a similar request for the proc parser so it could iterate over multiple files when a glob is used in the magic syntax (https://github.com/kellyjonbrazil/jc/issues/389)

muescha commented 8 months ago

Yes, the multiline feature could function as a preprocessor.

Perhaps some processor could possess a boolean attribute, such as 'multiline=true', which could be examined to determine if this option has been provided. This check could help the parser assess the feasibility of incorporating this feature.

kellyjonbrazil commented 8 months ago

I have a working version in the dev branch. I have called this option --slurp or -s.

https://github.com/kellyjonbrazil/jc/tree/55bc91a6e43b32b0268f264b783deaf3271573eb

Here is an example with a list of URLs (one per line)

% cat urls.txt | jc --slurp --url -p
[
  {
    "url": "http://www.google.com",
    "scheme": "http",
    "netloc": "www.google.com",
    "path": null,
    "parent": null,
    "filename": null,
    "stem": null,
    "extension": null,
    "path_list": null,
    "query": null,
    "query_obj": null,
    "fragment": null,
    "username": null,
    "password": null,
    "hostname": "www.google.com",
    "port": null,
    "encoded": {
      "url": "http://www.google.com",
      "scheme": "http",
      "netloc": "www.google.com",
      "path": null,
      "parent": null,
      "filename": null,
      "stem": null,
      "extension": null,
      "path_list": null,
      "query": null,
      "fragment": null,
      "username": null,
      "password": null,
      "hostname": "www.google.com",
      "port": null
    },
    "decoded": {
      "url": "http://www.google.com",
      "scheme": "http",
      "netloc": "www.google.com",
      "path": null,
      "parent": null,
      "filename": null,
      "stem": null,
      "extension": null,
      "path_list": null,
      "query": null,
      "fragment": null,
      "username": null,
      "password": null,
      "hostname": "www.google.com",
      "port": null
    }
  },
  {
    "url": "https://www.kelly.com/testing",
    "scheme": "https",
    "netloc": "www.kelly.com",
    "path": "/testing",
    "parent": "/",
    "filename": "testing",
    "stem": "testing",
    "extension": null,
    "path_list": [
      "testing"
    ],
    "query": null,
    "query_obj": null,
    "fragment": null,
    "username": null,
    "password": null,
    "hostname": "www.kelly.com",
    "port": null,
    "encoded": {
      "url": "https://www.kelly.com/testing",
      "scheme": "https",
      "netloc": "www.kelly.com",
      "path": "/testing",
      "parent": "/",
      "filename": "testing",
      "stem": "testing",
      "extension": null,
      "path_list": [
        "testing"
      ],
      "query": null,
      "fragment": null,
      "username": null,
      "password": null,
      "hostname": "www.kelly.com",
      "port": null
    },
    "decoded": {
      "url": "https://www.kelly.com/testing",
      "scheme": "https",
      "netloc": "www.kelly.com",
      "path": "/testing",
      "parent": "/",
      "filename": "testing",
      "stem": "testing",
      "extension": null,
      "path_list": [
        "testing"
      ],
      "query": null,
      "fragment": null,
      "username": null,
      "password": null,
      "hostname": "www.kelly.com",
      "port": null
    }
  },
  {
    "url": "https://mail.apple.com",
    "scheme": "https",
    "netloc": "mail.apple.com",
    "path": null,
    "parent": null,
    "filename": null,
    "stem": null,
    "extension": null,
    "path_list": null,
    "query": null,
    "query_obj": null,
    "fragment": null,
    "username": null,
    "password": null,
    "hostname": "mail.apple.com",
    "port": null,
    "encoded": {
      "url": "https://mail.apple.com",
      "scheme": "https",
      "netloc": "mail.apple.com",
      "path": null,
      "parent": null,
      "filename": null,
      "stem": null,
      "extension": null,
      "path_list": null,
      "query": null,
      "fragment": null,
      "username": null,
      "password": null,
      "hostname": "mail.apple.com",
      "port": null
    },
    "decoded": {
      "url": "https://mail.apple.com",
      "scheme": "https",
      "netloc": "mail.apple.com",
      "path": null,
      "parent": null,
      "filename": null,
      "stem": null,
      "extension": null,
      "path_list": null,
      "query": null,
      "fragment": null,
      "username": null,
      "password": null,
      "hostname": "mail.apple.com",
      "port": null
    }
  }
]

The documentation has been updated to show which parsers are compatible. Compatible parsers accept a single line of input. They are identified with the "slurpable" tag:

% jc -a | jq '.parsers[] | select(.name == "url")'
{
  "name": "url",
  "argument": "--url",
  "version": "1.2",
  "description": "URL string parser",
  "author": "Kelly Brazil",
  "author_email": "kellyjonbrazil@gmail.com",
  "compatible": [
    "linux",
    "darwin",
    "cygwin",
    "win32",
    "aix",
    "freebsd"
  ],
  "tags": [
    "standard",
    "string",
    "slurpable"
  ]
}
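For library or script users, the same tag check can be done in Python on the `jc -a` JSON. The sketch below assumes the output has been captured as a string; the sample is trimmed to the fields used, and the second entry is made up for contrast. `slurpable_arguments` is a hypothetical helper name.

```python
import json
from typing import List

# Trimmed sample shaped like the `jc -a` output above; the "ls" entry and its
# tags are made up for contrast.
about_json = """
{
  "parsers": [
    {"name": "url", "argument": "--url", "tags": ["standard", "string", "slurpable"]},
    {"name": "ls", "argument": "--ls", "tags": ["command"]}
  ]
}
"""


def slurpable_arguments(about: dict) -> List[str]:
    """Return the CLI arguments of parsers tagged as slurpable."""
    return [
        p["argument"]
        for p in about.get("parsers", [])
        if "slurpable" in p.get("tags", [])
    ]


slurpable = slurpable_arguments(json.loads(about_json))
print(slurpable)
```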

These can also be found with jc -hhh:

% jc -hhh
Generic Parsers:  (5)
--asciitable          ASCII and Unicode table parser
--asciitable-m        multi-line ASCII and Unicode table parser
--kv                  Key/Value file and string parser
<snip>

Slurpable Parsers:  (9)
--date                `date` command parser
--datetime-iso        ISO 8601 Datetime string parser
--email-address       Email Address string parser
--ip-address          IPv4 and IPv6 Address string parser
--jwt                 JWT string parser
--semver              Semantic Version string parser
--timestamp           Unix Epoch Timestamp string parser
--url                 URL string parser
--ver                 Version string parser

Streaming Parsers:  (15)
--cef-s               CEF string streaming parser
--clf-s               Common and Combined Log Format file streaming parser
--csv-s               CSV file streaming parser
<snip>

muescha commented 8 months ago

slurp is working fine :)

echo "/abc/def:/efg/ghi" | tr ":" "\n" | jc --url -s | jq '[.[] | {path, path_list}]'
[
  {
    "path": "/abc/def",
    "path_list": [
      "abc",
      "def"
    ]
  },
  {
    "path": "/efg/ghi",
    "path_list": [
      "efg",
      "ghi"
    ]
  }
]

kellyjonbrazil commented 7 months ago

I'm rethinking the slurp output and it might make sense to use a dictionary for both types of slurp (multiple lines to a slurpable parser and multiple /proc files with magic syntax):

{<identifier>: <parsed-output>}

The <identifier> is the input string when slurping string lines. The <identifier> is the filename when slurping multiple files from /proc magic syntax.

This makes the output more consistent and also ensures you can identify which input corresponds to which output. You can still reference the nth output if you don't care about the key name in jq by using the keys_unsorted[n] syntax. For example, to grab the 3rd object without caring about the key name:

% cat uname.txt | jc --slurp --uname | jq '.[keys_unsorted[2]]'
{
  "machine": "x86_66",
  "kernel_name": "Darwin",
  "node_name": "Kellys-MBP.attlocal.net",
  "kernel_release": "22.6.0",
  "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
}
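The same positional lookup could be done in Python, roughly as below. The sample uses shortened placeholder identifiers rather than real jc output, and relies on Python dicts preserving insertion order (guaranteed since Python 3.7).

```python
import json

# Abbreviated sample in the proposed {<identifier>: <parsed-output>} shape;
# the identifiers are shortened placeholders, not real jc output.
slurped = json.loads("""
{
  "line ending in x86_64": {"machine": "x86_64"},
  "line ending in x86_65": {"machine": "x86_65"},
  "line ending in x86_66": {"machine": "x86_66"}
}
""")

# json.loads builds the dict in document order, so indexing the key list
# mirrors jq's keys_unsorted[n].
keys = list(slurped)
third = slurped[keys[2]]
print(third)
```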

This is what the output looks like without filtering:

Single-line slurpable parsers:

% cat uname.txt | jc --slurp --uname -p
{
  "Darwin Kellys-MBP.attlocal.net 22.6.0 Darwin Kernel Version 22.6.0: Wed Oct  4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64 x86_64": {
    "machine": "x86_64",
    "kernel_name": "Darwin",
    "node_name": "Kellys-MBP.attlocal.net",
    "kernel_release": "22.6.0",
    "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
  },
  "Darwin Kellys-MBP.attlocal.net 22.6.0 Darwin Kernel Version 22.6.0: Wed Oct  4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64 x86_65": {
    "machine": "x86_65",
    "kernel_name": "Darwin",
    "node_name": "Kellys-MBP.attlocal.net",
    "kernel_release": "22.6.0",
    "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
  },
  "Darwin Kellys-MBP.attlocal.net 22.6.0 Darwin Kernel Version 22.6.0: Wed Oct  4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64 x86_66": {
    "machine": "x86_66",
    "kernel_name": "Darwin",
    "node_name": "Kellys-MBP.attlocal.net",
    "kernel_release": "22.6.0",
    "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
  },
  "Darwin Kellys-MBP.attlocal.net 22.6.0 Darwin Kernel Version 22.6.0: Wed Oct  4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64 x86_67": {
    "machine": "x86_67",
    "kernel_name": "Darwin",
    "node_name": "Kellys-MBP.attlocal.net",
    "kernel_release": "22.6.0",
    "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
  },
  "Darwin Kellys-MBP.attlocal.net 22.6.0 Darwin Kernel Version 22.6.0: Wed Oct  4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64 x86_68": {
    "machine": "x86_68",
    "kernel_name": "Darwin",
    "node_name": "Kellys-MBP.attlocal.net",
    "kernel_release": "22.6.0",
    "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
  }
}

Multiple /proc files:

% jc -p /proc/stat /proc/cpuinfo
{
  "/proc/stat": {
    "cpu": {
      "user": 6002,
      "nice": 152,
      "system": 8398,
      "idle": 3444436,
      "iowait": 448,
      "irq": 0,
      "softirq": 1174,
      "steal": 0,
      "guest": 0,
      "guest_nice": 0
    },
    "cpu0": {
      "user": 2784,
      "nice": 137,
      "system": 4367,
      "idle": 1732802,
      "iowait": 225,
      "irq": 0,
      "softirq": 221,
      "steal": 0,
      "guest": 0,
      "guest_nice": 0
    },
    "cpu1": {
      "user": 3218,
      "nice": 15,
      "system": 4031,
      "idle": 1711634,
      "iowait": 223,
      "irq": 0,
      "softirq": 953,
      "steal": 0,
      "guest": 0,
      "guest_nice": 0
    },
    "interrupts": [
      2496709,
      <snip>
      0
    ],
    "context_switches": 4622716,
    "boot_time": 1662154781,
    "processes": 9831,
    "processes_running": 1,
    "processes_blocked": 0,
    "softirq": [
      3478985,
      35230,
      1252057,
      3467,
      128583,
      51014,
      0,
      171199,
      1241297,
      0,
      596138
    ]
  },
  "/proc/cpuinfo": [
    {
      "processor": 0,
      "vendor_id": "GenuineIntel",
      "cpu family": 6,
      "model": 142,
      "model name": "Intel(R) Core(TM) i5-7360U CPU @ 2.30GHz",
      "stepping": 9,
      "cpu MHz": 2303.998,
      "cache size": "4096 KB",
      "physical id": 0,
      "siblings": 1,
      "core id": 0,
      "cpu cores": 1,
      "apicid": 0,
      "initial apicid": 0,
      "fpu": true,
      "fpu_exception": true,
      "cpuid level": 22,
      "wp": true,
      "flags": [
        "fpu",
        "vme",
        "de",
        "pse",
        "tsc",
        "msr",
        "pae",
        "mce",
        "cx8",
        "apic",
        "sep",
        "mtrr",
        "pge",
        "mca",
        "cmov",
        "pat",
        "pse36",
        "clflush",
        "mmx",
        "fxsr",
        "sse",
        "sse2",
        "ht",
        "syscall",
        "nx",
        "rdtscp",
        "lm",
        "constant_tsc",
        "rep_good",
        "nopl",
        "xtopology",
        "nonstop_tsc",
        "eagerfpu",
        "pni",
        "pclmulqdq",
        "monitor",
        "ssse3",
        "cx16",
        "pcid",
        "sse4_1",
        "sse4_2",
        "x2apic",
        "movbe",
        "popcnt",
        "aes",
        "xsave",
        "avx",
        "rdrand",
        "hypervisor",
        "lahf_lm",
        "abm",
        "3dnowprefetch",
        "fsgsbase",
        "avx2",
        "invpcid",
        "rdseed",
        "clflushopt",
        "md_clear",
        "flush_l1d"
      ],
      "bogomips": 4607.99,
      "clflush size": 64,
      "cache_alignment": 64,
      "address sizes": "39 bits physical, 48 bits virtual",
      "power management": null,
      "address_size_physical": 39,
      "address_size_virtual": 48,
      "cache_size_num": 4096,
      "cache_size_unit": "KB"
    }
  ]
}

Potential issue: duplicate input values get deduplicated, so if you are not expecting that and you are just iterating by number, you could be looking at the wrong value. 😦 Potential workaround: ensure your input is already deduplicated via the uniq command or similar.
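The deduplication effect is easy to reproduce in a small sketch. Here `parse_line` is a hypothetical stand-in for a single-line parser and the input lines are shortened; the only point is that keying by input string collapses repeated lines, while a list keeps one output per input.

```python
from typing import Dict, List


def parse_line(line: str) -> Dict[str, str]:
    """Hypothetical stand-in for a single-line parser."""
    return {"machine": line.split()[-1]}


lines: List[str] = ["Darwin x86_64", "Darwin x86_65", "Darwin x86_64"]

# Keyed by input string: the repeated first line collapses into one entry.
as_dict = {line: parse_line(line) for line in lines}
# Plain list: one output per input line, duplicates included.
as_list = [parse_line(line) for line in lines]

print(len(lines), len(as_dict), len(as_list))
```

With the dict form, index 2 no longer exists even though three lines went in, which is the "wrong value" risk when iterating by number.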

muescha commented 7 months ago

Somehow I don't like this case: "The <identifier> is the input string when slurping string lines." I would expect a list rather than a dict here.

I also fear that the order can change when it is a dict and not a list.

I would like to have the "current" behaviour for normal slurp.

muescha commented 7 months ago

Maybe it can be done with an additional option --key or --dict (perhaps with variants such as linenumbers or filename/cmd, where cmd means the command string when a magic command is used).

so to get the /proc output:

jc --dict-cmd -p /proc/stat /proc/cpuinfo
cat uname.txt | jc --dict-linenumber --slurp --uname -p
cat uname.txt | jc --dict-input --slurp --uname -p

jc --dict-cmd -p ls
{
    "ls": [
        {
            "filename": "common.jar"
        },
        {
            "filename": "rider.jar"
        }
    ]
}

jc --key-cmd -p ls -al
{
    "ls -al": [
        {
            "filename": ".",
            "flags": "drwxr-xr-x@",
            "links": 13,
            "owner": "muescha",
            "group": "staff",
            "size": 416,
            "date": "Jan 18 17:26"
        },
        {
            "filename": "..",
            "flags": "drwxr-xr-x@",
            "links": 3,
            "owner": "muescha",
            "group": "staff",
            "size": 96,
            "date": "Jan 18 17:26"
        },
        {
            "filename": "common.jar",
            "flags": "-rw-r--r--@",
            "links": 1,
            "owner": "muescha",
            "group": "staff",
            "size": 24430405,
            "date": "Oct 20 12:55"
        },
        {
            "filename": "rider.jar",
            "flags": "-rw-r--r--@",
            "links": 1,
            "owner": "muescha",
            "group": "staff",
            "size": 9987,
            "date": "Oct 20 12:55"
        }
    ]
}

kellyjonbrazil commented 7 months ago

Yeah, I agree there are some issues with this method. I'm looking into using the original list output but maybe use the --meta-out option to add source information.

kellyjonbrazil commented 7 months ago

What about something like this?

Basically just wrapping in a dict and adding the slurped key that contains the data so that a single _jc_meta object can be attached with the --meta-out option that includes the original list of inputs?

% cat uname.txt  | jc --slurp --uname -p --meta-out
{
  "slurped": [
    {
      "machine": "x86_64",
      "kernel_name": "Darwin",
      "node_name": "Kellys-MBP.attlocal.net",
      "kernel_release": "22.6.0",
      "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
    },
    {
      "machine": "x86_65",
      "kernel_name": "Darwin",
      "node_name": "Kellys-MBP.attlocal.net",
      "kernel_release": "22.6.0",
      "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
    },
    {
      "machine": "x86_66",
      "kernel_name": "Darwin",
      "node_name": "Kellys-MBP.attlocal.net",
      "kernel_release": "22.6.0",
      "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
    },
    {
      "machine": "x86_67",
      "kernel_name": "Darwin",
      "node_name": "Kellys-MBP.attlocal.net",
      "kernel_release": "22.6.0",
      "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
    },
    {
      "machine": "x86_68",
      "kernel_name": "Darwin",
      "node_name": "Kellys-MBP.attlocal.net",
      "kernel_release": "22.6.0",
      "kernel_version": "Darwin Kernel Version 22.6.0: Wed Oct 4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64"
    }
  ],
  "_jc_meta": {
    "parser": "uname",
    "timestamp": 1705953071.42138,
    "slice_start": null,
    "slice_end": null,
    "input_list": [
      "Darwin Kellys-MBP.attlocal.net 22.6.0 Darwin Kernel Version 22.6.0: Wed Oct  4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64 x86_64",
      "Darwin Kellys-MBP.attlocal.net 22.6.0 Darwin Kernel Version 22.6.0: Wed Oct  4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64 x86_65",
      "Darwin Kellys-MBP.attlocal.net 22.6.0 Darwin Kernel Version 22.6.0: Wed Oct  4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64 x86_66",
      "Darwin Kellys-MBP.attlocal.net 22.6.0 Darwin Kernel Version 22.6.0: Wed Oct  4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64 x86_67",
      "Darwin Kellys-MBP.attlocal.net 22.6.0 Darwin Kernel Version 22.6.0: Wed Oct  4 21:25:26 PDT 2023; root:xnu-8796.141.3.701.17~4/RELEASE_X86_64 x86_68"
    ]
  }
}

muescha commented 7 months ago

So the slurped key only becomes visible with --meta-out? Then it will be OK :)

kellyjonbrazil commented 7 months ago

OK, not sure if this will please everyone, but I have finalized a slurp output that hopefully meets everyone's requirements. https://github.com/kellyjonbrazil/jc/commit/6a7f38388359fd2ac05b1b0aa639a94080e665e5

I went back to the original idea of slurping the data into a list. If the data is coming from /proc magic syntax, a _file field will be added to the data.

In addition, if --meta-out is used, then the data is further wrapped in a dictionary that looks like:

{
  "result": [<output_data>],
  "_jc_meta": {
    "parser": "url",
    "timestamp": 1706235558.654576,
    "slice_start": null,
    "slice_end": null,
    "input_list": [
      "http://www.google.com",
      "https://www.kelly.com/testing",
      "https://mail.apple.com"
    ]
  }
}

input_list contains a list of inputs (actual input strings or /proc filenames) so you can identify which item is which. This keeps everything in order and also works with duplicate entries.
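A consumer could pair each input with its output along these lines. This is a sketch that assumes the --meta-out shape shown above and that both lists are in input order, as described; the result entries are trimmed to a single field for brevity.

```python
import json

# Abbreviated sample in the --meta-out shape shown above; each result entry is
# trimmed to a single field for brevity.
wrapped = json.loads("""
{
  "result": [
    {"hostname": "www.google.com"},
    {"hostname": "www.kelly.com"},
    {"hostname": "mail.apple.com"}
  ],
  "_jc_meta": {
    "parser": "url",
    "input_list": [
      "http://www.google.com",
      "https://www.kelly.com/testing",
      "https://mail.apple.com"
    ]
  }
}
""")

# Assuming both lists share input order, zip() lines each input up with its output.
pairs = list(zip(wrapped["_jc_meta"]["input_list"], wrapped["result"]))
for source, parsed in pairs:
    print(source, "->", parsed["hostname"])
```

Because pairing is positional rather than keyed, duplicate input lines would each keep their own entry.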

kellyjonbrazil commented 7 months ago

Added in v1.25.0

v1gnesh commented 7 months ago

Hey @kellyjonbrazil, could you clarify the following please -

Will slurp/multi-line help in adding support in jc for messages like these (slides 9 to 11) - http://dtsc.dfw.ibm.com/MVSDS/'HTTPD2.APPS.ZOSCLASS.PDF(Z05)'

The messages are structured such that the first character indicates whether or not a message is multi-line (reference).

kellyjonbrazil commented 7 months ago

@v1gnesh I don't seem to have access to the first link. The second link seems to reference syslog messages. There are already some syslog parsers in jc:

These all either wrap multiple syslog messages in an array or output JSON Lines in a streaming fashion. The slurp functionality is more for parsers that only expect a single line or string, like an IP address for the --ip-address parser. Since the syslog parsers already expect multiple lines, they don't need the slurp functionality.

v1gnesh commented 7 months ago

@kellyjonbrazil First link - ah, that's probably an HTTPS redirect doing it. That link works with HTTP only. The syslog I've linked to is pretty exotic - from an (IBM Z) mainframe operating system called z/OS.

kellyjonbrazil commented 7 months ago

@v1gnesh I was able to open the presentation (had to add in the single quote at the end as it was being left off). The slurp functionality won't have any effect on this type of data. It looks like a custom parser would need to be created for these types of syslog messages, and the parser would need to automatically account for multiline messages in the parsing logic.