vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0

Allow multiple http_server sources to share the same address with different paths #20105

Open tai-nd opened 7 months ago

tai-nd commented 7 months ago

Use Cases

sources:
  journald_logs_http:
    type: http_server
    address: "0.0.0.0:8000"
    path: /journald
    decoding:
      codec: json
  docker_logs_http:
    type: http_server
    address: "0.0.0.0:8000"
    path: /docker
    decoding:
      codec: json

With this config, Vector fails with:

Resource `tcp 0.0.0.0:8000` is claimed by multiple components: {ComponentKey { id: "docker_logs_http" }, ComponentKey { id: "journald_logs_http" }}

Attempted Solutions

I know it is possible to work around this by using different ports or by adding some remap/routing logic, but having Vector handle it would avoid the complexity of that extra routing logic.

Proposal

No response

References

No response

Version

0.36.0

jszwedko commented 7 months ago

Thanks for opening this @tai-nd ! Given Vector's current architecture I think it'd be more likely that we could add support for multiple paths within a single http_server source like:

  journald_logs_http:
    type: http_server
    address: "0.0.0.0:8000"
    paths: [/journald, /docker]
    decoding:
      codec: json

The route transform could then be used to split them later if needed. I suggest this because each source is responsible for binding to resources like ports, and changing that would take some significant refactoring.

hhromic commented 7 months ago

This is an interesting and useful feature request, +1 from us!

@jszwedko instead of a route transform for splitting, how about leveraging the multi-output feature for sources? With an opt-in option similar to the datadog_agent source?

jszwedko commented 7 months ago

That's a good idea!

AdaptiveStep commented 6 months ago

If there are arrays, please make sure to support template syntax for anything related to paths, addresses, and URIs. Maybe all fields in Vector could support template syntax. The http sink especially needs this.

gaby commented 2 months ago

👍 for this. I currently have 10+ http servers on different ports and it's becoming hard to manage.

jszwedko commented 2 months ago

One workaround is to set https://vector.dev/docs/reference/configuration/sources/http_server/#strict_path to false to allow any path and check the path later for routing. This has the disadvantage of not being able to return 404s for paths that shouldn't exist.
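A minimal sketch of that workaround, assuming the default `path_key` (`path`) so that each event carries the request path in `.path`. The component names `logs_http` and `by_path` are made up for illustration, and the exact-match conditions are just one way to write the check:

```yaml
sources:
  logs_http:
    type: http_server
    address: "0.0.0.0:8000"
    path: /
    # Accept any request path that starts with `path` instead of only an exact match.
    strict_path: false
    # Event field that receives the request path ("path" is the default).
    path_key: path
    decoding:
      codec: json

transforms:
  by_path:
    type: route
    inputs: [logs_http]
    route:
      # Each entry is a VRL condition evaluated against the event.
      journald: '.path == "/journald"'
      docker: '.path == "/docker"'
```

Downstream components would then consume `by_path.journald` and `by_path.docker`. Events that match neither condition go to `by_path._unmatched`, which is where requests to unexpected paths end up, since the source can no longer reject them with a 404.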

gaby commented 2 months ago

@jszwedko That works, thank you! It took me 1hr to figure out how to use type = route, but I got it working. 😆

I now have the same problem on the sink side, since I have 1 sink per /path. Unless there's a way of using a variable in the sink URL that I'm not aware of?

[sinks.my_sink_id]
type = "http"
inputs = [ "my-source-or-transform-id" ]
uri = "https://10.22.212.22:9000/{. fieldname}"

jszwedko commented 2 months ago

Unfortunately that is a missing feature. See https://github.com/vectordotdev/vector/issues/1155
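
Until that lands, the fallback is still one sink per path, each fed by one output of a route transform. A rough sketch, assuming a route transform named `by_path` with routes `journald` and `docker` (hypothetical names), and made-up URL paths on the host from the example above:

```yaml
sinks:
  journald_out:
    type: http
    # Consume only the events the (hypothetical) `by_path` transform routed as journald.
    inputs: [by_path.journald]
    uri: "https://10.22.212.22:9000/journald"   # URL path is illustrative
    encoding:
      codec: json
  docker_out:
    type: http
    inputs: [by_path.docker]
    uri: "https://10.22.212.22:9000/docker"     # URL path is illustrative
    encoding:
      codec: json
```

This keeps a single listening port on the source side, but the number of sinks still grows with the number of paths.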

gaby commented 2 months ago

Thanks! I will watch that ticket.