[Help needed] File collector cluster name not provided

OlofHaglund commented 1 month ago

I am stuck in my debugging process and in need of assistance to continue the investigation of the following error I have received.

I have just downloaded KubeHound to investigate its usage in our team.

The debug flag is enabled, but I am not receiving any debug logs. kubernetes-admin is the name of the context used in the dump. I have also tried the cluster name, abcd, which I get when running kubectl config get-clusters.

ehagool @ server @ network ~
└─ ▶ sudo ./kubehound --debug ingest local kubehound_kubernetes-admin_01j5dekd8jsremxa9a39s77fz4.tar.gz --cluster kubernetes-admin
INFO[15:00:53] Loading application from inline command
WARN[15:00:53] No local config file was found (kubehound.yaml)
INFO[15:00:53] Using /home/ehagool/kubehound for default config
INFO[15:00:53] Initializing application telemetry
WARN[15:00:53] Telemetry disabled via configuration
INFO[15:00:53] Starting KubeHound (run_id: 01j5dmdmf8m44ttk0w3p5gpss1)
INFO[15:00:53] Initializing providers (graph, cache, store)
INFO[15:00:53] Loading cache provider
INFO[15:00:53] Loaded memcache cache provider
INFO[15:00:53] Loading store database provider
INFO[15:00:53] Loaded mongodb store provider
INFO[15:00:54] Loading graph database provider
INFO[15:00:54] Loaded janusgraph graph provider
INFO[15:00:54] Running the ingestion pipeline
INFO[15:00:54] Loading Kubernetes data collector client
INFO[15:00:54] Creating file collector from directory /tmp/kh-local-ingest-2266015422
INFO[15:00:54] Loaded local-file-collector collector client
INFO[15:00:54] Starting Kubernetes raw data ingest
INFO[15:00:54] Loading data ingestor
INFO[15:00:54] Running dependency health checks
FATA[15:00:54] ingest build data: raw data ingest: ingestor dependency health check: 1 error occurred:
        * file collector cluster name not provided

edznux-dd commented 1 month ago

Hey, thanks for the report!

I managed to reproduce this locally: sudo loses the kubectx context (I imagine you ran kubectx as your own user instead), which makes the current-context: setting in ~/.kube/config empty.

There's definitely a bug: you provided the --cluster flag, which should have overridden that. I'm looking into it and will draft a PR.
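The expected precedence described above (an explicit --cluster value should win over whatever the kubeconfig says) can be sketched as follows. This is only an illustration: resolveClusterName and errNoClusterName are hypothetical names, not KubeHound's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoClusterName reuses the wording of the failure seen in the log.
var errNoClusterName = errors.New("file collector cluster name not provided")

// resolveClusterName is a hypothetical helper, not KubeHound's code.
// An explicit --cluster value wins, the kubeconfig current-context is
// only a fallback, and an error is returned when both are empty.
func resolveClusterName(flagValue, currentContext string) (string, error) {
	if flagValue != "" {
		return flagValue, nil
	}
	if currentContext != "" {
		return currentContext, nil
	}
	return "", errNoClusterName
}

func main() {
	// Under sudo the current-context was empty, but the flag was set.
	name, err := resolveClusterName("kubernetes-admin", "")
	fmt.Println(name, err)
}
```

With this ordering, an empty current-context only matters when no flag is given.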

As a workaround until it's done and released, you could try to:

- run kubehound without sudo (I imagine Docker isn't accessible from your user, so that might not work)
- set current-context: in your root user's kube config
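To see what a process actually finds when run with and without sudo, a small check like the one below may help. It is a rough sketch under stated assumptions: it looks only at $HOME/.kube/config (ignoring the KUBECONFIG variable), and currentContext is a hypothetical helper that does a naive line scan rather than real YAML parsing.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// currentContext returns the value of a top-level "current-context:" key
// in kubeconfig text, or "" if the key is missing or empty.
// Hypothetical helper: a naive line scan, not a YAML parser.
func currentContext(kubeconfig string) string {
	for _, line := range strings.Split(kubeconfig, "\n") {
		if strings.HasPrefix(line, "current-context:") {
			value := strings.TrimSpace(strings.TrimPrefix(line, "current-context:"))
			return strings.Trim(value, `"'`)
		}
	}
	return ""
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot resolve home directory:", err)
		return
	}
	path := filepath.Join(home, ".kube", "config")
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Println("no readable kubeconfig at", path)
		return
	}
	fmt.Printf("%s: current-context=%q\n", path, currentContext(string(data)))
}
```

Running the built binary once as your user and once with sudo shows whether the two invocations resolve the same file and context.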

edznux-dd commented 1 month ago

I created https://github.com/DataDog/KubeHound/pull/248 as a "quick fix" in case the workarounds don't work for you; let me know if it does :). I don't think I'll have time to test and cut a new release until next week, but if you want to test from that branch, make build && ./build/bin/kubehound --debug ingest local kubehound_kubernetes-admin_01j5dekd8jsremxa9a39s77fz4.tar.gz should do the trick.

One aspect that I didn't realize at first is that we shouldn't even require the --cluster flag in the first place: it's already in the filename from the dump.
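As an illustration of reading the cluster name from the dump's file name, here is a minimal sketch. It assumes the layout kubehound_<cluster>_<run_id>.tar.gz, inferred only from the single file name in this thread; clusterFromDumpName is a hypothetical helper, not KubeHound's parser.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// clusterFromDumpName extracts <cluster> from a file named
// kubehound_<cluster>_<run_id>.tar.gz. The layout is an assumption
// based on one example, and the helper is hypothetical.
func clusterFromDumpName(path string) (string, error) {
	const prefix, suffix = "kubehound_", ".tar.gz"
	base := filepath.Base(path)
	if !strings.HasPrefix(base, prefix) || !strings.HasSuffix(base, suffix) {
		return "", fmt.Errorf("unexpected dump file name %q", base)
	}
	middle := strings.TrimSuffix(strings.TrimPrefix(base, prefix), suffix)
	// The run ID is taken to be everything after the last underscore, so
	// cluster names that themselves contain underscores are kept whole.
	i := strings.LastIndex(middle, "_")
	if i <= 0 || i == len(middle)-1 {
		return "", fmt.Errorf("cannot split cluster and run ID in %q", base)
	}
	return middle[:i], nil
}

func main() {
	name, err := clusterFromDumpName("kubehound_kubernetes-admin_01j5dekd8jsremxa9a39s77fz4.tar.gz")
	fmt.Println(name, err)
}
```

A renamed archive defeats this kind of parsing, which is the fragility of depending on the file name.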

Because relying on the filename isn't that great, I've created a (non-working) PoC in #247 to discuss adding a metadata.json file that contains whatever the ingestor side needs.

It's been on our to-do list for a while, but we never got time to implement it!
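To make the metadata.json idea concrete, here is a rough sketch of what the ingestor side could do with such a file. The field names (cluster_name, run_id), the dumpMetadata type, and parseMetadata are all hypothetical; the thread does not specify the file's schema.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// dumpMetadata is a hypothetical schema for metadata.json.
type dumpMetadata struct {
	ClusterName string `json:"cluster_name"`
	RunID       string `json:"run_id"`
}

// parseMetadata decodes the file contents and rejects a missing cluster
// name, so the ingestor needs neither a flag nor a kubeconfig.
func parseMetadata(data []byte) (*dumpMetadata, error) {
	var md dumpMetadata
	if err := json.Unmarshal(data, &md); err != nil {
		return nil, fmt.Errorf("decode metadata: %w", err)
	}
	if md.ClusterName == "" {
		return nil, errors.New("metadata: cluster_name is empty")
	}
	return &md, nil
}

func main() {
	raw := []byte(`{"cluster_name":"kubernetes-admin","run_id":"01j5dekd8jsremxa9a39s77fz4"}`)
	md, err := parseMetadata(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(md.ClusterName, md.RunID)
}
```

Carrying the cluster name inside the archive keeps the dump self-describing even if the file is renamed.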

OlofHaglund commented 1 month ago

Hi, thanks for your reply. One of your notes pointed to what the issue was :)

But first, your PoC gives a nil pointer error.

ehagool @ kali-n212 (UCC) @ seroics10745 ~
└─ ▶ ./KubeHound/bin/build/kubehound --debug ingest local kubehound_kubernetes-admin_01j5dekd8jsremxa9a39s77fz4.tar.gz
INFO[11:14:45] Loading application from inline command
WARN[11:14:46] No local config file was found (kubehound.yaml)
INFO[11:14:46] Using /home/ehagool/kubehound for default config
INFO[11:14:46] Initializing application telemetry
WARN[11:14:46] Telemetry disabled via configuration
WARN[11:14:46] parsing path failedInvalid path provided: "kubehound_kubernetes-admin_01j5dekd8jsremxa9a39s77fz4.tar.gz"
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x2ae1b54]

goroutine 1 [running]:
github.com/DataDog/KubeHound/pkg/kubehound/core.CoreLocalIngest({0x3dea9b8, 0x5bbf9e0}, 0xc00073f380, {0x7fff35172485, 0x3c})
        /home/ehagool/KubeHound/pkg/kubehound/core/core_ingest_local.go:21 +0xb4
main.init.func14(0x5872660, {0xc000c92720, 0x1, 0x34134dd?})
        /home/ehagool/KubeHound/cmd/kubehound/ingest.go:35 +0x99
github.com/spf13/cobra.(*Command).execute(0x5872660, {0xc000c926e0, 0x2, 0x2})
        /home/ehagool/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:983 +0xaca
github.com/spf13/cobra.(*Command).ExecuteC(0x5872c20)
        /home/ehagool/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115 +0x3ff
github.com/spf13/cobra.(*Command).Execute(...)
        /home/ehagool/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039
main.main()
        /home/ehagool/KubeHound/cmd/kubehound/main.go:10 +0x58
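The output above shows a path-parsing warning immediately followed by a nil dereference in CoreLocalIngest. One common way this pattern arises in Go, offered purely as an illustration and not as a diagnosis of KubeHound's code, is logging a parse error and then using the nil result anyway. The names below (dumpInfo, parseDumpPath, ingest) are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// dumpInfo and parseDumpPath are hypothetical stand-ins for whatever the
// real code derives from the archive path.
type dumpInfo struct{ ClusterName string }

func parseDumpPath(path string) (*dumpInfo, error) {
	if path == "" {
		return nil, errors.New("invalid path provided")
	}
	return &dumpInfo{ClusterName: "example"}, nil
}

// ingest returns as soon as parsing fails. Logging the error as a warning
// and carrying on would leave info nil, and reading info.ClusterName
// would then panic with a nil pointer dereference.
func ingest(path string) error {
	info, err := parseDumpPath(path)
	if err != nil {
		return fmt.Errorf("parse dump path: %w", err)
	}
	fmt.Println("ingesting cluster", info.ClusterName)
	return nil
}

func main() {
	if err := ingest(""); err != nil {
		fmt.Println("error:", err)
	}
}
```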

But the issue was what you mentioned about .kube/config. No config existed on my machine, as I don't have a Kubernetes installation on it. The environment I run against is locked down, so I take a dump locally there with kubehound and then copy the dump over to my local machine.

I think a metadata.json file is a good approach, as a Kubernetes installation shouldn't be a hard requirement for the ingest to work properly :)

Thanks for your assistance.

edznux-dd commented 1 month ago

> But first, your PoC gives a nil pointer error.

Is that for https://github.com/DataDog/KubeHound/pull/247 or https://github.com/DataDog/KubeHound/pull/248? If it's #247, that's expected; if it's #248, I'll take a look.

> The environment I run against is locked down, so I take a dump locally there with kubehound and then copy the dump over to my local machine.

Yep, the dumper/ingester approach was made with this kind of use case in mind :)

> I think a metadata.json file is a good approach, as a Kubernetes installation shouldn't be a hard requirement for the ingest to work properly :)

Thanks for the feedback!

OlofHaglund commented 1 month ago

I think it is for #248; the branch I was on when I built it is edouard/fix-cluster-name-propagation.

DataDog / KubeHound

[Help needed] File collector cluster name not provided #246