datalust / seq-input-healthcheck

Periodically GET an HTTP resource and write response metrics to Seq

Data extraction expression for outer array instead of root named property #41

Open c0shea opened 1 month ago

c0shea commented 1 month ago

I'm trying to create a health check to alert on network partitions in RabbitMQ. I can successfully call the RabbitMQ API (http://rabbitmq:15672/api/nodes), but I can't figure out a data extraction expression that works. The JSON response looks like this:

[
    {
        "partitions": [],
        "os_pid": "3480",
        "fd_total": 65536,
        "sockets_total": 58893,
        "mem_limit": 6871733043,
        "mem_alarm": false,
        "disk_free_limit": 2000000000,
        "disk_free_alarm": false,
        "proc_total": 1048576,
        "rates_mode": "basic",
        "uptime": 571207655,
        "run_queue": 1,
        "processors": 2,
        "exchange_types": [
            {
                "name": "topic",
                "description": "AMQP topic exchange, as per the AMQP specification",
                "enabled": true
            },
            {
                "name": "headers",
                "description": "AMQP headers exchange, as per the AMQP specification",
                "enabled": true
            },
            {
                "name": "direct",
                "description": "AMQP direct exchange, as per the AMQP specification",
                "enabled": true
            },
            {
                "name": "fanout",
                "description": "AMQP fanout exchange, as per the AMQP specification",
                "enabled": true
            }
        ]
    },
    {
        "partitions": [],
        "os_pid": "3004",
        "fd_total": 65536,
        "sockets_total": 58893,
        "mem_limit": 6871733043,
        "mem_alarm": false,
        "disk_free_limit": 2000000000,
        "disk_free_alarm": false,
        "proc_total": 1048576,
        "rates_mode": "basic",
        "uptime": 42968677,
        "run_queue": 1,
        "processors": 2,
        "exchange_types": [
            {
                "name": "headers",
                "description": "AMQP headers exchange, as per the AMQP specification",
                "enabled": true
            },
            {
                "name": "direct",
                "description": "AMQP direct exchange, as per the AMQP specification",
                "enabled": true
            },
            {
                "name": "topic",
                "description": "AMQP topic exchange, as per the AMQP specification",
                "enabled": true
            },
            {
                "name": "fanout",
                "description": "AMQP fanout exchange, as per the AMQP specification",
                "enabled": true
            }
        ]
    }
]

I'm trying to have the health check fail if the partitions array is not empty in any of the array elements. My cluster contains two nodes, so I tried Length([0].partitions) + Length([1].partitions) but that didn't work. [?].partitions also didn't work. Is there a way to extract a value when the outer element is an array instead of a root named property?
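For reference, the check being asked for can be sketched outside Seq in a few lines of Python. This is only an illustration of the intended logic (healthy when every node reports an empty partitions array), not a Seq expression; the function names and the node name rabbit@node2 are hypothetical.

```python
import json


def total_partitions(nodes_json: str) -> int:
    """Sum the lengths of the `partitions` arrays across all nodes."""
    nodes = json.loads(nodes_json)  # the root value is a JSON array of node objects
    return sum(len(node.get("partitions", [])) for node in nodes)


def is_healthy(nodes_json: str) -> bool:
    """Healthy only when no node reports a partition."""
    return total_partitions(nodes_json) == 0


# Trimmed-down payloads shaped like the /api/nodes response above.
sample = '[{"partitions": []}, {"partitions": []}]'
# "rabbit@node2" is a made-up node name used only for illustration.
partitioned = '[{"partitions": ["rabbit@node2"]}, {"partitions": []}]'

print(is_healthy(sample))       # True
print(is_healthy(partitioned))  # False
```

Summing the lengths avoids hard-coding the number of nodes, which is what Length([0].partitions) + Length([1].partitions) would otherwise require.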

nblumhardt commented 1 month ago

Thanks for raising this. I think the expression should be:

Length(@Properties[?].partitions) = 0

But I'm not sure it will work - first because the app sniffs for { to detect JSON:

https://github.com/datalust/seq-input-healthcheck/blob/dev/src/Seq.Input.HealthCheck/HttpHealthCheck.cs#L187
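To illustrate why a first-character check gets in the way here, a minimal Python sketch follows. It is not the app's actual C# code at the link above; the function name is hypothetical and the check is an assumption based only on the description "sniffs for {".

```python
def looks_like_json_object(body: str) -> bool:
    """Hypothetical sniff: treat the body as JSON only if it starts with '{'."""
    return body.lstrip().startswith("{")


object_body = '{"status": "ok"}'
array_body = '[{"partitions": []}, {"partitions": []}]'

print(looks_like_json_object(object_body))  # True
print(looks_like_json_object(array_body))   # False: a root array is not recognized
```

Under a check of this shape, the RabbitMQ response would never reach the data extraction step, because its root value is an array rather than an object.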

And second because of the way we currently use Seq.Syntax, which assumes that the content can be converted directly into a LogEvent:

https://github.com/datalust/seq-input-healthcheck/blob/dev/src/Seq.Input.HealthCheck/Data/JsonDataExtractor.cs#L51
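One conceivable way around the second obstacle, sketched in Python purely as an illustration: wrap a non-object root value under a named property so the payload becomes an object whose members can act as properties. This is an assumption about a possible approach, not what the app does or will do; wrap_root and the property name Items are hypothetical.

```python
import json


def wrap_root(body: str, name: str = "Items") -> dict:
    """Return the parsed body as an object, wrapping non-object roots under `name`."""
    parsed = json.loads(body)
    if isinstance(parsed, dict):
        return parsed          # already an object; use its members as-is
    return {name: parsed}      # arrays and scalars become a single named property


# "rabbit@node2" is a made-up node name used only for illustration.
body = '[{"partitions": []}, {"partitions": ["rabbit@node2"]}]'
props = wrap_root(body)

partition_counts = [len(node["partitions"]) for node in props["Items"]]
print(partition_counts)  # [0, 1]
```

With the array reachable through a named property, an extraction expression would have something to address, though the exact expression syntax would depend on how the app exposes that property.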

I think both problems can be overcome, but we'll need to exercise some care, so unfortunately it's not a super quick fix right now.

Trying to think up a shorter-term workaround but coming up blank currently; will loop back if one appears.

c0shea commented 1 month ago

@nblumhardt Thanks for looking into it! No worries if it's not an easy fix that can be done soon. Was more curious to get it working, but I have other workarounds I can put in place in the meantime.