kurtosis-tech / kurtosis

A platform for packaging and launching ephemeral backend stacks with a focus on approachability for the average developer.
https://docs.kurtosistech.com/
Apache License 2.0

Kurtosis does not work on Arch Linux #2508

Open hexash42 opened 3 months ago

hexash42 commented 3 months ago

What's your CLI version?

0.90.1

Description & steps to reproduce

I downloaded the amd64 binary on Arch Linux and ran the quickstart example (or basically anything else): kurtosis run github.com/kurtosis-tech/basic-service-package --enclave quickstart

This is the output (it hangs forever):

~ ❱ kurtosis run github.com/kurtosis-tech/basic-service-package --enclave quickstart
INFO[2024-07-02T16:28:31+03:00] No Kurtosis engine was found; attempting to start one...
INFO[2024-07-02T16:28:31+03:00] Starting the centralized logs components...
INFO[2024-07-02T16:28:31+03:00] Centralized logs components started.
INFO[2024-07-02T16:28:32+03:00] Reverse proxy started.
INFO[2024-07-02T16:28:33+03:00] Successfully started Kurtosis engine
INFO[2024-07-02T16:28:33+03:00] Creating a new enclave for Starlark to run inside...

These are the engine logs:

DEBU[2024-07-02T13:28:33Z][backend_creator.go:getLocalDockerKurtosisBackend] Connecting to Docker daemon via unix socket '/var/run/docker.sock' 
DEBU[2024-07-02T13:28:33Z][log_file_manager.go:func1] Scheduling log removal for log retention every '6h0m0s' hours... 
DEBU[2024-07-02T13:28:33Z][log_file_manager.go:RemoveLogsBeyondRetentionPeriod] Removed logs beyond retention period at the following path: '/var/log/kurtosis/2024/26/' 
DEBU[2024-07-02T13:28:33Z][log_file_manager.go:func2] Scheduling log file path creation every '1m0s' minutes... 
INFO[2024-07-02T13:28:33Z][main.go:restApiServer] Running REST API server...                   
DEBU[2024-07-02T13:28:33Z][main.go:func1] Created environment js file with content: 
window.env = {}; 
INFO[2024-07-02T13:28:33Z][main.go:restApiServer] Setting-up CORS policy to accept requests from origins: [*] 
INFO[2024-07-02T13:28:33Z][server.go:RunEnclaveManagerApiServer] Web server running and listening on port 8081 
INFO[2024-07-02T13:28:33Z][enclave_rest_api_handler.go:getGrpcClientConn] No API container info is available for enclave a87bf298522047518c5c75b5245f1472 
WARN[2024-07-02T13:28:33Z][enclave_rest_api_handler.go:refreshEnclaveConnections] Unavailable gRPC connection to enclave 'a87bf298522047518c5c75b5245f1472', skipping it! 

   ____    __
  / __/___/ /  ___
 / _// __/ _ \/ _ \
/___/\__/_//_/\___/ v4.11.3
High performance, minimalist Go web framework
https://echo.labstack.com
____________________________________O/_______
                                    O\
⇨ http server started on [::]:9779
DEBU[2024-07-02T13:28:33Z][engine_connect_server_service.go:CreateEnclave] args: enclave_name:"quickstart" api_container_version_tag:"" api_container_log_level:"debug" mode:TEST should_apic_run_in_debug_mode:false 
DEBU[2024-07-02T13:28:33Z][docker_kurtosis_backend_enclave_functions.go:CreateEnclave] Creating Docker network for enclave '6ae3a6dc6b8b413399694b7f3cd5c0b0'... 
DEBU[2024-07-02T13:28:33Z][docker_kurtosis_backend_enclave_functions.go:CreateEnclave] Docker network for enclave '6ae3a6dc6b8b413399694b7f3cd5c0b0' created successfully with ID '6a4edea41be43da5753bae2b5c9d4a4009e69fa7c29b01f8875e78e3a231dd1e' 
DEBU[2024-07-02T13:28:33Z][shared_helpers.go:getReverseProxyObjectFromContainerInfo] Enclave networks: 'map[5b202c1134cbba6702c4ad3c532963aba47d7100f9689b7eaa93dbe56a7bd212:172.16.0.2]' 
DEBU[2024-07-02T13:28:34Z][create_logs_collector.go:CreateLogsCollectorForEnclave] Creating logs collector for enclave '6ae3a6dc6b8b413399694b7f3cd5c0b0' 
DEBU[2024-07-02T13:28:34Z][docker_manager.go:FetchImage] Fetching image 'alpine:3.17' with image download mode: missing 
DEBU[2024-07-02T13:28:34Z][docker_manager.go:getContainerHostConfig] Binds: [kurtosis-logs-collector-vol--6ae3a6dc6b8b413399694b7f3cd5c0b0:/fluent-bit/etc] 
DEBU[2024-07-02T13:28:34Z][docker_manager.go:CreateAndStartContainer] Created container with ID '83272470daa10a6773c8f4879d669f7e2291151d0b5e53f6ae8f75e8ffd11883' from image 'alpine:3.17' 
DEBU[2024-07-02T13:28:34Z][fluentbit_configuration_creator.go:createFluentbitConfigFileInVolume] The Fluentbit config file with content '
[SERVICE]
    log_level debug
    http_server On
    http_listen 0.0.0.0
    http_port 9712
    storage.path /fluent-bit/etc/storage/
[INPUT]
    name forward
    listen 0.0.0.0
    port 9713
    storage.type  filesystem
[OUTPUT]
    name forward
    match *
    host 172.17.0.2
    port 9714
' was successfully added into the volume 
DEBU[2024-07-02T13:28:34Z][docker_manager.go:FetchImage] Fetching image 'fluent/fluent-bit:1.9.7' with image download mode: missing 
DEBU[2024-07-02T13:28:34Z][docker_manager.go:getContainerHostConfig] Binds: [kurtosis-logs-collector-vol--6ae3a6dc6b8b413399694b7f3cd5c0b0:/fluent-bit/etc] 
DEBU[2024-07-02T13:28:34Z][docker_manager.go:CreateAndStartContainer] Created container with ID '7c75bcc6ec837b9e2de16c009871efe3f516905dd898ceea2de56bd70b2cf618' from image 'fluent/fluent-bit:1.9.7' 
DEBU[2024-07-02T13:28:35Z][create_logs_collector.go:CreateLogsCollectorForEnclave] Checking for logs collector availability in enclave '6ae3a6dc6b8b413399694b7f3cd5c0b0'... 

Enclave log from kurtosis-dump:

Fluent Bit v1.9.7
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2024/07/02 13:28:35] [ info] Configuration:
[2024/07/02 13:28:35] [ info]  flush time     | 1.000000 seconds
[2024/07/02 13:28:35] [ info]  grace          | 5 seconds
[2024/07/02 13:28:35] [ info]  daemon         | 0
[2024/07/02 13:28:35] [ info] ___________
[2024/07/02 13:28:35] [ info]  inputs:
[2024/07/02 13:28:35] [ info]      forward
[2024/07/02 13:28:35] [ info] ___________
[2024/07/02 13:28:35] [ info]  filters:
[2024/07/02 13:28:35] [ info] ___________
[2024/07/02 13:28:35] [ info]  outputs:
[2024/07/02 13:28:35] [ info]      forward.0
[2024/07/02 13:28:35] [ info] ___________
[2024/07/02 13:28:35] [ info]  collectors:
[2024/07/02 13:28:35] [ info] [fluent bit] version=1.9.7, commit=265783ebe9, pid=1
[2024/07/02 13:28:35] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2024/07/02 13:28:35] [ info] [storage] created root path /fluent-bit/etc/storage/
[2024/07/02 13:28:35] [ info] [storage] version=1.2.0, type=memory+filesystem, sync=normal, checksum=disabled, max_chunks_up=128
[2024/07/02 13:28:35] [ info] [storage] backlog input plugin: storage_backlog.1
[2024/07/02 13:28:35] [ info] [cmetrics] version=0.3.5
[2024/07/02 13:28:35] [debug] [forward:forward.0] created event channels: read=21 write=22
[2024/07/02 13:28:35] [debug] [in_fw] Listen='0.0.0.0' TCP_Port=9713
[2024/07/02 13:28:35] [ info] [input:forward:forward.0] listening on 0.0.0.0:9713
[2024/07/02 13:28:35] [debug] [storage_backlog:storage_backlog.1] created event channels: read=24 write=25
[2024/07/02 13:28:35] [ info] [input:storage_backlog:storage_backlog.1] queue memory limit: 95.4M
[2024/07/02 13:28:35] [debug] [forward:forward.0] created event channels: read=26 write=27
[2024/07/02 13:28:35] [debug] [router] match rule forward.0:forward.0
[2024/07/02 13:28:35] [debug] [router] match rule storage_backlog.1:forward.0
[2024/07/02 13:28:35] [ info] [output:forward:forward.0] worker #0 started
[2024/07/02 13:28:35] [ info] [output:forward:forward.0] worker #1 started
[2024/07/02 13:28:35] [ info] [http_server] listen iface=0.0.0.0 tcp_port=9712
[2024/07/02 13:28:35] [ info] [sp] stream processor started
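
The engine's last log line is "Checking for logs collector availability", while the Fluent Bit log reports listeners on ports 9713 (forward input) and 9712 (HTTP server). One way to narrow this down is to check whether those ports accept TCP connections from wherever the probe runs. The following is only a minimal sketch: `port_is_open` is a hypothetical helper, not part of Kurtosis, and it does not reproduce whatever availability check the engine itself performs.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, `port_is_open("<collector-ip>", 9712)` with the logs collector container's address filled in (the placeholder is an assumption; the address has to be looked up, for instance with docker inspect). The probe must run from a place that can route to the enclave's Docker network, since these ports are not necessarily published to the host.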

The engine is 100% stuck and does not respond to commands such as kurtosis enclave ls; it has to be restarted first.
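
To tell "stuck" apart from "slow" in a repeatable way, a command can be run with a deadline. This is a rough sketch, assuming Python 3 is available; `run_with_timeout` is a hypothetical helper and the 10-second limit in the usage note is an arbitrary choice.

```python
import subprocess
from typing import Optional, Sequence


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> Optional[int]:
    """Run `cmd` and return its exit code, or None if it did not finish in time.

    Raises FileNotFoundError if the executable is not on PATH.
    """
    try:
        completed = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,  # the child process is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode
```

A result of `None` from `run_with_timeout(["kurtosis", "enclave", "ls"], 10)` would match the hang described here. Restarting the engine afterwards (presumably with kurtosis engine restart, which is an assumption on my part and not taken from this report) is left as a manual step.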

Desired behavior

I expect this to work just like the quickstart example

What is the severity of this bug?

Critical; I am blocked and Kurtosis is unusable for me because of this bug.

What area of the product does this pertain to?

CLI: the Command Line Interface

skylarmb commented 3 months ago

What is the status of the logs aggregator Docker container (the image it runs is timberio/vector)? I think I ran into something like this before on Fedora, but I'm not sure it was the same issue.
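
A quick way to answer that is to list all containers, including stopped ones, and filter by image name. The sketch below assumes the docker CLI is on PATH and uses its `--format` Go-template placeholders `.Names`, `.Image`, and `.Status`; the helper names are hypothetical and not part of Kurtosis.

```python
import subprocess
from typing import List, NamedTuple


class ContainerRow(NamedTuple):
    name: str
    image: str
    status: str


# Tab-separated columns keep parsing simple, because statuses contain spaces.
PS_FORMAT = "{{.Names}}\t{{.Image}}\t{{.Status}}"


def parse_ps_output(text: str) -> List[ContainerRow]:
    """Parse `docker ps --format PS_FORMAT` output into rows, skipping malformed lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) == 3:
            rows.append(ContainerRow(*parts))
    return rows


def filter_by_image(rows: List[ContainerRow], image_prefix: str) -> List[ContainerRow]:
    """Keep rows whose image starts with `image_prefix`, regardless of tag."""
    return [row for row in rows if row.image.startswith(image_prefix)]


def list_containers(image_prefix: str) -> List[ContainerRow]:
    """Return all containers (running or not) whose image matches `image_prefix`."""
    out = subprocess.run(
        ["docker", "ps", "--all", "--format", PS_FORMAT],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return filter_by_image(parse_ps_output(out), image_prefix)
```

Calling `list_containers("timberio/vector")` returns the aggregator rows, and an empty list or an "Exited" status would be worth posting here. Filtering happens in Python instead of with docker's ancestor filter so that any image tag matches.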

hexash42 commented 3 months ago

Looks like it is up

846a688ecd4f   fluent/fluent-bit:1.9.7         "/fluent-bit/bin/flu…"   37 seconds ago   Up 36 seconds   2020/tcp                                                                           kurtosis-logs-collector--0ce18db33a5d4a28b6216331c96a684f

And here are the logs from docker logs:

Fluent Bit v1.9.7
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2024/07/03 14:56:10] [ info] Configuration:
[2024/07/03 14:56:10] [ info]  flush time     | 1.000000 seconds
[2024/07/03 14:56:10] [ info]  grace          | 5 seconds
[2024/07/03 14:56:10] [ info]  daemon         | 0
[2024/07/03 14:56:10] [ info] ___________
[2024/07/03 14:56:10] [ info]  inputs:
[2024/07/03 14:56:10] [ info]      forward
[2024/07/03 14:56:10] [ info] ___________
[2024/07/03 14:56:10] [ info]  filters:
[2024/07/03 14:56:10] [ info] ___________
[2024/07/03 14:56:10] [ info]  outputs:
[2024/07/03 14:56:10] [ info]      forward.0
[2024/07/03 14:56:10] [ info] ___________
[2024/07/03 14:56:10] [ info]  collectors:
[2024/07/03 14:56:10] [ info] [fluent bit] version=1.9.7, commit=265783ebe9, pid=1
[2024/07/03 14:56:10] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2024/07/03 14:56:10] [ info] [storage] created root path /fluent-bit/etc/storage/
[2024/07/03 14:56:10] [ info] [storage] version=1.2.0, type=memory+filesystem, sync=normal, checksum=disabled, max_chunks_up=128
[2024/07/03 14:56:10] [ info] [storage] backlog input plugin: storage_backlog.1
[2024/07/03 14:56:10] [ info] [cmetrics] version=0.3.5
[2024/07/03 14:56:10] [debug] [forward:forward.0] created event channels: read=21 write=22
[2024/07/03 14:56:10] [debug] [in_fw] Listen='0.0.0.0' TCP_Port=9713
[2024/07/03 14:56:10] [ info] [input:forward:forward.0] listening on 0.0.0.0:9713
[2024/07/03 14:56:10] [debug] [storage_backlog:storage_backlog.1] created event channels: read=24 write=25
[2024/07/03 14:56:10] [ info] [input:storage_backlog:storage_backlog.1] queue memory limit: 95.4M
[2024/07/03 14:56:10] [debug] [forward:forward.0] created event channels: read=26 write=27
[2024/07/03 14:56:10] [debug] [router] match rule forward.0:forward.0
[2024/07/03 14:56:10] [debug] [router] match rule storage_backlog.1:forward.0
[2024/07/03 14:56:10] [ info] [output:forward:forward.0] worker #0 started
[2024/07/03 14:56:10] [ info] [output:forward:forward.0] worker #1 started
[2024/07/03 14:56:10] [ info] [http_server] listen iface=0.0.0.0 tcp_port=9712
[2024/07/03 14:56:10] [ info] [sp] stream processor started

skylarmb commented 2 months ago

Yeah, never mind then. This has similar symptoms to the issue I experienced on aarch64 Fedora, but it is not the same one, so it must be something else. I have an amd64 Manjaro machine that I will test on later today.