redhat-developer / mapt

Multi Architecture Provisioning Tool
Apache License 2.0

[BUG] Failed to create fedora spot instance using mapt #327

Closed: praveenkumar closed this issue 2 weeks ago

praveenkumar commented 3 weeks ago
podman run -d  --name create-fedora \
            -v ${PWD}:/workspace:z \
            -e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
            -e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
            -e AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION} \
            quay.io/redhat-developer/mapt:v0.8.0-dev-linux aws fedora create \
                --project-name fedora \
                --backed-url "file:///workspace" \
                --conn-details-output "/workspace" \
                --tags project=crc,environment=local,user=prkumar \
                --spot \
                --airgap \
                --cpus 6 \
                --nested-virt \
                --memory 16

DEBU running 'mapt aws fedora create'             
DEBU context initialized for mapte6ae042d         
DEBU checking stack spotOption-fedora             
panic: interface conversion: interface {} is nil, not string

goroutine 1 [running]:
github.com/redhat-developer/mapt/pkg/provider/aws/modules/spot.getOutputs(0xc0007e0240)
    /workspace/pkg/provider/aws/modules/spot/stack.go:137 +0x3a5
github.com/redhat-developer/mapt/pkg/provider/aws/modules/spot.SpotOptionRequest.Create({{0x556e642, 0x4}, {0x5577b19, 0xa}, {0xc0005df710, 0x3, 0x3}, {0xc0022de060, 0x26}, {0x557063f, ...}})
    /workspace/pkg/provider/aws/modules/spot/stack.go:69 +0x2b4
github.com/redhat-developer/mapt/pkg/provider/aws/action/fedora.Create(0xc000532140)
    /workspace/pkg/provider/aws/action/fedora/fedora.go:87 +0x718
github.com/redhat-developer/mapt/cmd/mapt/cmd/aws/hosts.getFedoraCreate.func1(0xc000005b00, {0xc0004b43c0, 0x0, 0xf})
    /workspace/cmd/mapt/cmd/aws/hosts/fedora.go:74 +0x77b
github.com/redhat-developer/mapt/cmd/mapt/cmd.executeWithLogging.func1(0xc000005b00, {0xc0004b43c0, 0x0, 0xf})
    /workspace/cmd/mapt/cmd/root.go:93 +0x123
github.com/spf13/cobra.(*Command).execute(0xc000005b00, {0xc0004b42d0, 0xf, 0xf})
    /workspace/vendor/github.com/spf13/cobra/command.go:983 +0xfbb
github.com/spf13/cobra.(*Command).ExecuteC(0x8985f60)
    /workspace/vendor/github.com/spf13/cobra/command.go:1115 +0xa2e
github.com/spf13/cobra.(*Command).Execute(0x8985f60)
    /workspace/vendor/github.com/spf13/cobra/command.go:1039 +0x32
github.com/spf13/cobra.(*Command).ExecuteContext(0x8985f60, {0x5acb390, 0x8aa93a0})
    /workspace/vendor/github.com/spf13/cobra/command.go:1032 +0x73
github.com/redhat-developer/mapt/cmd/mapt/cmd.Execute()
    /workspace/cmd/mapt/cmd/root.go:70 +0x74
main.main()
    /workspace/cmd/mapt/main.go:6 +0xf
anjannath commented 2 weeks ago

I tried the same image to create a Fedora instance. I didn't get the panic, but the process gets stuck for ~15 minutes trying to download the Pulumi AWS provider:

DEBU running 'mapt aws fedora create'
DEBU context initialized for mapt223e28aa
DEBU checking stack spotOption-fedora
DEBU managing stack spotOption-fedora
INFO Updating (spotOption-fedora):
INFO
INFO  +  pulumi:pulumi:Stack fedora-spotOption-fedora creating (0s)
DEBU grouped prices map[{ ap-south-1a 0 0 0 c5.metal}:[{0x4000630df0 c5.metal Linux/UNIX 0x4000630de0 2024-11-07 07:32:29 +0000 UTC {}} {0x4000630f10 c5.metal Linux/UNIX 0x4000630f00 2024-11-06 23:31:36 +0000 UTC {}}] { ap-south-1a 0 0 0 c5d.metal}:[{0x4000630db0 c5d.metal Linux/UNIX 0x4000630da0 2024-11-07 08:01:32 +0000 UTC {}} {0x4000630eb0 c5d.metal Linux/UNIX 0x4000630ea0 2024-11-07 01:46:32 +0000 UTC {}}] { ap-south-1a 0 0 0 c5n.metal}:[{0x4000630ed0 c5n.metal Linux/UNIX 0x4000630ec0 2024-11-07 01:46:25 +0000 UTC {}}] { ap-south-1b 0 0 0 c5.metal}:[{0x4000630e70 c5.metal Linux/UNIX 0x4000630e60 2024-11-07 04:47:21 +0000 UTC {}}] { ap-south-1b 0 0 0 c5d.metal}:[{0x4000630e10 c5d.metal Linux/UNIX 0x4000630e00 2024-11-07 07:16:33 +0000 UTC {}}] { ap-south-1b 0 0 0 c5n.metal}:[{0x4000630e50 c5n.metal Linux/UNIX 0x4000630e40 2024-11-07 05:02:10 +0000 UTC {}}] { ap-south-1c 0 0 0 c5.metal}:[{0x4000630dd0 c5.metal Linux/UNIX 0x4000630dc0 2024-11-07 07:46:37 +0000 UTC {}} {0x4000630ef0 c5.metal Linux/UNIX 0x4000630ee0 2024-11-07 01:32:20 +0000 UTC {}}] 
[........]

DEBU len 38 checking Fedora-Cloud-Base-AmazonEC2.x86_64-40* in ap-south-1
DEBU Based on avg prices for instance types [c5.metal c5d.metal c5n.metal] is az ap-south-1b, current avg price is 0.80 and max price is 0.80 with a score of 9
INFO @ updating..........
INFO  +  rh:qe:aws:bso main-bso-bso creating (0s)
INFO  +  pulumi:pulumi:Stack fedora-spotOption-fedora created (7s)
INFO  +  rh:qe:aws:bso main-bso-bso created
INFO Outputs:
INFO     avg   : 0.8031
INFO     az    : "ap-south-1b"
INFO     max   : 0.8031
INFO     region: "ap-south-1"
INFO     score : 9
INFO
INFO Resources:
INFO     + 2 created
INFO
INFO Duration: 8s
INFO
DEBU managing stack stackFedoraBaremetal-fedora
INFO Updating (stackFedoraBaremetal-fedora):
INFO
INFO  +  pulumi:pulumi:Stack fedora-stackFedoraBaremetal-fedora creating (0s)
INFO @ updating.....
INFO  +  pulumi:pulumi:Stack fedora-stackFedoraBaremetal-fedora creating (2s) Downloading provider: aws

INFO @ updating........................................................................................................................................................................................................................................................................................................................................................................................................
INFO  +  pulumi:pulumi:Stack fedora-stackFedoraBaremetal-fedora creating (391s) warning: error downloading provider: stream error: stream ID 1; PROTOCOL_ERROR; received from peer
INFO @ updating......
INFO  +  pulumi:pulumi:Stack fedora-stackFedoraBaremetal-fedora creating (394s) Downloading provider: aws
adrianriobo commented 2 weeks ago

Did you use the exact same params? nested + airgap + custom memory and cpu?

Yeah, I noticed the aws provider download... I think this is the provider from Terraform, not from Pulumi.

Maybe I can create a separate issue for this, as we have two options now:

1) The quick one: add the provider to the Containerfile so it is not downloaded at runtime.

2) https://github.com/redhat-developer/mapt/issues/315 (but this is a huge one). Last time, not all resources were available on the native provider; now it is GA, so maybe we can fully migrate.

anjannath commented 2 weeks ago

> Did you use exact same params? nested + airgap + custom memory and cpu?

Yes, this is the command I ran:

 % podman run -d  --name create-fedora \
            -v ${PWD}:/workspace:z \
            -e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
            -e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
            -e AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION} \
            quay.io/redhat-developer/mapt:v0.8.0-dev-linux aws fedora create \
                --project-name fedora \
                --backed-url "file:///workspace" \
                --conn-details-output "/workspace" \
                --tags project=user=anath \
                --spot \
                --airgap \
                --cpus 6 \
                --nested-virt \
                --memory 16
Trying to pull quay.io/redhat-developer/mapt:v0.8.0-dev-linux...

From the stack trace, the panic occurs at https://github.com/redhat-developer/mapt/blob/57e23257ac7cf81c5d778b3933cbb2cc6c4bb796/pkg/provider/aws/modules/spot/stack.go#L137-L138

I think we should use the util.IfNillable() helper here.
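
For context, `interface conversion: interface {} is nil, not string` is the message Go produces when a bare type assertion such as `v.(string)` is applied to a nil interface value. Below is a minimal sketch of the comma-ok form that avoids the panic. `asString` is a hypothetical helper written only for illustration (it is not the signature of `util.IfNillable()`), and the plain `outputs` map is an assumed stand-in for whatever stack.go actually reads.

```go
package main

import "fmt"

// asString is a hypothetical helper (not mapt's util.IfNillable):
// it returns fallback when v is nil or not a string, instead of panicking.
func asString(v interface{}, fallback string) string {
	if s, ok := v.(string); ok {
		return s
	}
	return fallback
}

func main() {
	// Assumed stand-in for stack outputs where the "az" key was never written.
	outputs := map[string]interface{}{}

	// A bare assertion, outputs["az"].(string), would panic here with
	// "interface conversion: interface {} is nil, not string".
	fmt.Printf("%q\n", asString(outputs["az"], ""))
	fmt.Printf("%q\n", asString("ap-south-1b", ""))
}
```

A silent fallback hides the fact that the stack has no usable outputs, so returning an error to the caller may be the better choice in practice.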

adrianriobo commented 2 weeks ago

So what would be the reason that there is no error and yet the outputs are nil? It means there were no results for the spot request. So it should print some log (debug or error) beforehand, even if we add the check here.

anjannath commented 2 weeks ago

This occurs when the spot price stack exists but has no outputs, which can happen if the create operation is interrupted. So if we run the following command:

 % ./out/mapt aws fedora create --spot --airgap --nested-virt --cpus 6 --memory 16 --project-name aws-mapt-fedora-test --backed-url file:///Users/anath/workspace --conn-details-output /tmp/fedora --tags user=anath
^C

and stop it mid-execution with Ctrl+C, then re-run the command, we hit the panic described in this issue. We need to add a check in getOutputs() to see whether the returned outputs map is empty.