SneaksAndData / snd-cli-go

Command-line interface for Sneaks & Data

[BUG] Non-graceful handling of cancelling non-existing jobs #63

Open matt035343 opened 6 months ago

matt035343 commented 6 months ago

Description

snd algorithm cancel --algorithm store-auto-replenishment-crystal-orchestrator --id 3cf44c20-61e3-4f45-9121-0949378bd090

Error: failed to cancel run for algorithm store-auto-replenishment-crystal-orchestrator with run id 3cf44c20-61e3-4f45-9121-0949378bd090: error making request to https://crystal.test.sneaksanddata.com/algorithm/v1.2/cancel/store-auto-replenishment-crystal-orchestrator/requests/3cf44c20-61e3-4f45-9121-0949378bd090: HTTP request failed with status code: 500
Usage:
  snd algorithm cancel [flags]

Flags:
  -h, --help               help for cancel
  -i, --id string          Specify the Crystal Job ID
      --initiator string   Provide name or work email of the person cancelling the run
      --reason string      Specify reason for cancelling the job

Global Flags:
      --algorithm string            Specify the algorithm name
  -a, --auth-provider string        Specify the OAuth provider name (default "azuread")
      --custom-service-url string   Specify the service url (default "https://crystal.%s.sneaksanddata.com")
  -e, --env string                  Target environment (default "test")

Error:  failed to cancel run for algorithm store-auto-replenishment-crystal-orchestrator with run id 3cf44c20-61e3-4f45-9121-0949378bd090: error making request to https://crystal.test.sneaksanddata.com/algorithm/v1.2/cancel/store-auto-replenishment-crystal-orchestrator/requests/3cf44c20-61e3-4f45-9121-0949378bd090: HTTP request failed with status code: 500

Steps to reproduce the issue

1. 2. 3.

Describe the results you expected

No response

SnD CLI version you are using

v1.1.1

adelinag08 commented 6 months ago

@matt035343 it seems that this run id does not exist in test. If it is a production run, you need to use the -e production flag.

I will improve the error message so that it shows a descriptive message instead of only a status code.
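As a rough illustration of what such a message could look like, here is a minimal Go sketch that turns a status code into a hint for the user. Everything in it is an assumption made for illustration: the `cancelError` type, the `describeStatus` helper, and the wording of the hints are hypothetical and are not taken from the snd-cli-go code base. The hint for 5xx responses reflects what this thread shows, namely that the service answered 500 when the run id was looked up in the wrong environment.

```go
package main

import (
	"fmt"
	"net/http"
)

// cancelError is a hypothetical error type (not part of snd-cli-go).
// It carries the context needed to build a readable message.
type cancelError struct {
	Algorithm  string
	RunID      string
	Env        string
	StatusCode int
}

// Error renders the context plus a human-readable hint instead of a bare code.
func (e *cancelError) Error() string {
	return fmt.Sprintf("failed to cancel run %s for algorithm %s in %q: %s",
		e.RunID, e.Algorithm, e.Env, describeStatus(e.StatusCode))
}

// describeStatus maps an HTTP status code to a hint for the user.
// The mapping and wording are assumptions, chosen for illustration only.
func describeStatus(code int) string {
	switch {
	case code == http.StatusNotFound:
		return "run not found; check the run id and the -e/--env flag"
	case code == http.StatusUnauthorized || code == http.StatusForbidden:
		return "not authorized; check the auth provider and your permissions"
	case code >= 500:
		return fmt.Sprintf("the service failed with %d %s; the run may not exist in this environment",
			code, http.StatusText(code))
	default:
		return fmt.Sprintf("unexpected response: %d %s", code, http.StatusText(code))
	}
}

func main() {
	// Example values taken from the report above.
	err := &cancelError{
		Algorithm:  "store-auto-replenishment-crystal-orchestrator",
		RunID:      "3cf44c20-61e3-4f45-9121-0949378bd090",
		Env:        "test",
		StatusCode: http.StatusInternalServerError,
	}
	fmt.Println("Error:", err)
}
```

Keeping the status code inside a typed error means the command layer could still branch on it, for example to decide whether printing the usage text is helpful for that failure.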

matt035343 commented 6 months ago

@adelinag08 ah yes, it is a production run.

But it seems like the cancellation does not change anything. What part of the checkpoint entry is changed after calling cancel?

adelinag08 commented 6 months ago

@matt035343

The cancel command deletes the job and changes the status to CANCELLED. Do you see different behaviour?

matt035343 commented 6 months ago

@adelinag08 I have tried cancelling 3cf44c20-61e3-4f45-9121-0949378bd090 multiple times, but nothing happens in the checkpoints table.

select * from crystal.checkpoints where id = '3cf44c20-61e3-4f45-9121-0949378bd090' and algorithm = 'store-auto-replenishment-crystal-orchestrator'

Also, in IAR I typically need to reset the tag somehow. I usually do this:

update crystal.checkpoints set tag = concat('mata-restart-',tag) where id = '3cf44c20-61e3-4f45-9121-0949378bd090' and algorithm = 'store-auto-replenishment-crystal-orchestrator'

Otherwise I cannot restart the algorithm. How do I do that from the snd client? From my conversations with George, I understood that resetting the tag is part of the cancel feature. I see now that this may not be a good idea, but I still need a way to reset the tag. Should that be a new issue?

matt035343 commented 6 months ago

@adelinag08 Since I have opened two new issues, https://github.com/SneaksAndData/snd-cli-go/issues/67 and https://github.com/SneaksAndData/snd-cli-go/issues/68, we can boil this issue down to providing a better error message.