grafana / agent

Vendor-neutral programmable observability pipelines.
https://grafana.com/docs/agent/
Apache License 2.0
1.6k stars 487 forks source link

Flow Modules: pointer export incorrectly converted to a value when used as an argument #3565

Closed erikbaranowski closed 1 year ago

erikbaranowski commented 1 year ago

Background: This issue occurs in a very specific set of circumstances. I will show a couple working examples followed by 1 that demonstrates the issue. After that I have some additional technical details on the error and a possible hint on where things are having issues in the code.

Working Example 1

In this example we get the receiver as an argument to the module loader. Working as expected!

parent.river

prometheus.remote_write "default" {
    endpoint {
        url = env("METRICS_URL")

        basic_auth {
            username = env("METRICS_USERNAME")
            password = env("METRICS_PASSWORD")
        }
    }
}

module.file "agent_metrics" {
    filename = env("AGENT_CONFIG_FOLDER") + "/metrics/prometheus_scrape/module.river"
    arguments {
        address  = "http://0.0.0.0:12345/metrics"
        receiver = prometheus.remote_write.default.receiver
    }
}

module.river

argument "address" {
    optional = false
}

argument "receiver" {
    optional = false
}

argument "scrape_interval" {
    optional = true
    default  = "10s"
}

prometheus.scrape "default" {
    targets         = [{"__address__" = argument.address.value}]
    forward_to      = [argument.receiver.value]
    scrape_interval = argument.scrape_interval.value
}

Working Example 2

In this example we get the receiver as the export of another module loader. Working as expected!

parent.river

module.file "agent_metrics" {
    filename = env("AGENT_CONFIG_FOLDER") + "/metrics/prometheus_scrape/module.river"
    arguments {
        address  = "http://0.0.0.0:12345/metrics"
    }
}

prometheus_scrape/module.river

argument "address" {
    optional = false
}

argument "scrape_interval" {
    optional = true
    default  = "10s"
}

prometheus.scrape "default" {
    targets         = [{"__address__" = argument.address.value}]
        forward_to      = [module.file.prometheus_receiver.exports.receiver]
    scrape_interval = argument.scrape_interval.value
}

module.file "prometheus_receiver" {
    filename = env("AGENT_CONFIG_FOLDER") + "/metrics/prometheus_receiver/module.river"
    arguments {
        username = env("METRICS_USERNAME")
        password = env("METRICS_PASSWORD")
        url      = env("METRICS_URL")
    }
}

prometheus_receiver/module.river

argument "username" {
    optional = false
}

argument "password" {
    optional = false
}

argument "url" {
    optional = false
}

export "receiver" {
    value = prometheus.remote_write.default.receiver
}

prometheus.remote_write "default" {
    endpoint {
        url = argument.url.value

        basic_auth {
            username = argument.username.value
            password = argument.password.value
        }
    }
}

Broken Example

In this example we do a combination of the above examples. From the parent config we will get the receiver as an export from a module loader then pass it as an argument to a different module loader.

parent.river

module.file "agent_metrics" {
    filename = env("AGENT_CONFIG_FOLDER") + "/metrics/prometheus_scrape/module.river"
    arguments {
        address  = "http://0.0.0.0:12345/metrics"
        receiver = module.file.prometheus_receiver.exports.receiver
    }
}

module.file "prometheus_receiver" {
    filename = env("AGENT_CONFIG_FOLDER") + "/metrics/prometheus_receiver/module.river"
    arguments {
        username = env("METRICS_USERNAME")
        password = env("METRICS_PASSWORD")
        url      = env("METRICS_URL")
    }
}

prometheus_scrape/module.river

argument "address" {
    optional = false
}

argument "receiver" {
    optional = false
}

argument "scrape_interval" {
    optional = true
    default  = "10s"
}

prometheus.scrape "default" {
    targets         = [{"__address__" = argument.address.value}]
    forward_to      = [argument.receiver.value]
    scrape_interval = argument.scrape_interval.value
}

prometheus_receiver/module.river

argument "username" {
    optional = false
}

argument "password" {
    optional = false
}

argument "url" {
    optional = false
}

export "receiver" {
    value = prometheus.remote_write.default.receiver
}

prometheus.remote_write "default" {
    endpoint {
        url = argument.url.value

        basic_auth {
            username = argument.username.value
            password = argument.password.value
        }
    }
}

Error starting the agent:

Error: module.file.agent_metrics:26:21: argument.receiver.value expected capsule("storage.Appendable"), got capsule("prometheus.Interceptor")

Debugging

Diving in a little deeper into the code, prometheus.remote_write exports receiver as *prometheus.Interceptor not prometheus.Interceptor. Somewhere along the path it is getting translated from a pointer into its value and causing the error.

I stepped through the code in pkg/flowinternal/controller/component.go and confirmed that when evaluate is being called it does in fact incorrectly set the argument to the value instead of the pointer in this case.

image
mattdurham commented 1 year ago

Adding some findings.

makeValue is converting the pointer to a value,

I changed the code in value.go:292 to the below and it works. Though tests fail.

if v.Kind() == reflect.Pointer {
        if v.IsNil() {
            return Null
        }
        _ = v.Elem()
    }

Copying the arguments transforms it from a pointer to a component and not sure why that only happens in certain cases.

mattdurham commented 1 year ago

Specifically any seems to be the cause. The below test fails.

func TestPointerToValue(t *testing.T) {
    type fss struct{}
    f := &fss{}
    mm := make(map[string]any)
    mm["t"] = f
    out := make(map[string]any)
    err := value.Decode(value.Encode(mm), &out)
    require.NoError(t, err)
    require.True(t, mm["t"] == out["t"])
}

the below succeeds


func TestPointerToValueSpecific(t *testing.T) {
    type fss struct{}
    f := &fss{}
    mm := make(map[string]*fss)
    mm["t"] = f
    out := make(map[string]*fss)
    err := value.Decode(value.Encode(mm), &out)
    require.NoError(t, err)
    require.True(t, mm["t"] == out["t"])
}
rfratto commented 1 year ago

I think the issue is here:

https://github.com/grafana/agent/blob/8fd9f6fd0c8ada9b1d7caea6798f28edd5d65833/pkg/river/internal/value/decode.go#L419-L423

I think this should be assigning the pointer value if it's addressable:

    case TypeFunction, TypeCapsule:
        // Functions and capsules must be directly assigned since there's no
        // "generic" representation for either.
        if val.rv.CanAddr() {
            into.Set(val.rv.Addr()) 
        } else {
            into.Set(val.rv)
        }
        return nil

makeValue always fully deferences pointers for consistency in the implementation, so if decoding needs to retain the pointer value (which it should here), the individual decode functions need to be updated.

You can see a few places in decode.go where we use the CanAddr()/Addr() pattern to not lose the pointer during decode.

Note that CanAddr() will only be true if the value was originally a pointer when passed to makeValue, which is the case here.

mattdurham commented 1 year ago

Will look into the above. Following the two examples the containsAny in creates two different code paths in the above. Since map[string]any says it has any then it cant directly assign.

I have a working solution, but its a bit bespokish. It basically handles map[string]any in a more specific use case, basically says that is the value of a map is any then you can assign. Primarily because I wanted to fiddle and limit the scope. Either way its an edge case and we can refine it next week.

mattdurham commented 1 year ago

Created a PR to handle this use case, its might be to narrow of a fix but that also means its impact is more limited than trying to fix other places.

rfratto commented 1 year ago

I opened #3620 to demonstrate what I was talking about in my earlier comment.