influxdata / chronograf

Open source monitoring and visualization UI for the TICK stack
https://www.influxdata.com/time-series-platform/chronograf/

Saving TICK scripts in Chronograf makes human task name disappear #4304

Closed stefanhorning closed 3 years ago

stefanhorning commented 6 years ago

I am using Chronograf 1.6.1. When I edit existing TICK scripts (initially created by the task GUI of Chronograf), the human-friendly name is lost, and the list view shows only the internal TICK ID, in this kind of format: chronograf-v1-30ac70ed-e070-4919-b21b-e375ebf81a1b.

Steps to reproduce:

  1. Click "Build alert rule" in the "Manage Tasks" view, give the rule a nice name, and save it.
  2. Back on the "Manage Tasks" page, scroll down to the "TICKscripts" list.
  3. Open that same rule in the TICK editor, make a change to it, and save again.
  4. Back in the list view, the rule has a) disappeared from the "Alert Rules" list and b) lost its human-readable name in the "TICKscripts" list.

If I remember correctly, this wasn't always the case: previously I was able to make small adjustments (like changing the query conditions) to rules in the TICK editor without changing the listing or the names.

stefanhorning commented 6 years ago

Apparently this doesn't happen on every change. However, I could reproduce it reliably when changing the tag conditions in a query.

stefanhorning commented 6 years ago

I can only reproduce the error when I change this line

var whereFilter = lambda: ("some_tag" == 'foo')

to

var whereFilter = lambda: ("some_tag" == 'foo') AND ("server" != 'https://some-domain.com/health_check')

I am working with tags here to exclude one server from the http_response checks in the alert.

russorat commented 6 years ago

@stefanhorning Behind the scenes, we have to store the human name outside of Kapacitor due to some limitations in the API. We will investigate this.

jregovic commented 5 years ago

Any updates on this?

We still see this in 1.7.7

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

fomenta commented 5 years ago

Any updates on this?

stale[bot] commented 4 years ago

This issue has been automatically closed because it has not had recent activity. Feel free to reopen if this issue is still important to you. Thank you for your contributions.

jpm-izinto commented 4 years ago

I'm also experiencing this. It makes managing tasks in Chronograf very difficult as I don't know from the name what the script does.

HeikoOnnebrink commented 4 years ago

Same here with the latest version. Once we add a string function in the TICK editor, the name swaps to the v1-UUID style and the script disappears from the alert list. Having no human-readable script names is a no-go, as we lose control.

russorat commented 4 years ago

The Chronograf UI pulls the display name from the var name variable in the TICKscript, and defaults to the UUID if it is not present.
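
To make the described behavior concrete, here is a minimal Python sketch of that lookup: use the script's name variable if it is declared, otherwise fall back to the task ID. This is not Chronograf's actual code; the display_name helper and the regular expression are assumptions made purely for illustration.

```python
import re

# Hypothetical pattern, not taken from Chronograf: matches a
# `var name = '...'` declaration at the start of a line in a TICKscript.
NAME_DECL = re.compile(r"^\s*var\s+name\s*=\s*'((?:[^'\\]|\\.)*)'", re.MULTILINE)


def display_name(tickscript: str, task_id: str) -> str:
    """Return the script's `name` variable, or the task ID if it is absent or empty."""
    match = NAME_DECL.search(tickscript)
    if match and match.group(1).strip():
        return match.group(1)
    return task_id


task_id = "chronograf-v1-30ac70ed-e070-4919-b21b-e375ebf81a1b"

# Script that declares a name: the human-readable name is shown.
print(display_name("var db = 'postgresql'\nvar name = 'WAL SHIPPING DELAY'\n", task_id))

# Script without a name declaration: the task ID is shown instead.
print(display_name("var db = 'postgresql'\n", task_id))
```

If the lookup really worked like this, a script that declares var name should never show up under its UUID, which is why the reports in this thread point to a problem elsewhere in the save path.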


HeikoOnnebrink commented 4 years ago

@russorat Thx for reopening this issue 👍

We always set var name in our TICK scripts; in this example we set the name 'WAL SHIPPING DELAY'.


But the script shows up in v1-UUID style and is not present in the alert rules list.


Once I remove the substring function from the where condition, the script is saved with the name defined in var name:

change:

var whereFilter = lambda: (strContains("directory", 'pg_wal') OR strContains("directory", 'pg_xlog'))

to:

var whereFilter = lambda: ("directory" == '/some/folder/pg_wal' OR "directory" == 'some/folder/pg_xlog')

HeikoOnnebrink commented 4 years ago

Here is a copy of the full TICK script in case it is needed to analyse the issue:

var db = 'postgresql'

var rp = 'default'

var measurement = 'filecount'

var groupBy = ['ID', 'directory']

var whereFilter = lambda: (strContains("directory", 'pg_wal') OR strContains("directory", 'pg_xlog'))

var name = 'WAL SHIPPING DELAY'

var idVar = name + '-{{.Group}}'

var message = '$db-uuid$ {{ index .Tags "ID" }}
$message$ THE WAL SHIPPING DELAY IS MORE THAN {{ index .Fields "value"}}B
'

var messageok = '$db-uuid$ {{ index .Tags "ID" }}
$message$ THE WAL SHIPPING DELAY WENT BACK TO NORMAL
'

var idTag = 'alertID'

var levelTag = 'level'

var messageField = 'message'

var durationField = 'duration'

var outputDB = 'chronograf'

var outputRP = 'autogen'

var outputMeasurement = 'alerts'

var triggerType = 'threshold'

var warn = 4000000000

var crit = 6000000000

var data = stream
    |from()
        .database(db)
        .retentionPolicy(rp)
        .measurement(measurement)
        .groupBy(groupBy)
        .where(whereFilter)
    |eval(lambda: "size_bytes")
        .as('value')

var trigger = data
    |alert()
        .warn(lambda: "value" > warn)
        .crit(lambda: "value" > crit)
        .message('{{ if eq .Level "OK" }} ' + messageok + ' {{ else }} ' + message + ' {{ end }}')
        .id(idVar)
        .idTag(idTag)
        .levelTag(levelTag)
        .messageField(messageField)
        .durationField(durationField)
        .stateChangesOnly()
        .post()
        .endpoint('rdbalert')

trigger
    |eval(lambda: "value")
        .as('value')
        .keep()
    |influxDBOut()
        .create()
        .database(outputDB)
        .retentionPolicy(outputRP)
        .measurement(outputMeasurement)
        .tag('alertName', name)
        .tag('triggerType', triggerType)

trigger
    |httpOut('output')

HeikoOnnebrink commented 4 years ago

And here are the exact versions we are using for all components (all deployed as Docker containers):

kapacitor:1.5.6 chronograf:1.8.2 influxdb:1.8.2

afausti commented 4 years ago

@russorat we are seeing this issue here too (Chronograf 1.8.4).

I cannot reproduce it consistently, but I noticed that if you create a new TICKscript using the "+ Write TICKscript" button (without going through the "Alert Rule builder"), the name that is displayed is the name you typed in the text box, which is referred to as the Task ID (it does not use the name variable).

And the Task ID must follow these rules:

Task ID must contain only letters, numbers, '-', '.' and '_'. "THIS IS THE NAME I TYPED" 
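
For reference, the rule quoted above can be sketched in Python as follows. This only mirrors the wording of the message, read as ASCII letters and digits with at least one character; it is an assumption, not Chronograf's or Kapacitor's actual validation code, and is_valid_task_id is a hypothetical helper.

```python
import re

# Assumption: "letters, numbers, '-', '.' and '_'" means ASCII letters and
# digits plus those three characters, and the ID must not be empty.
TASK_ID = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_task_id(task_id: str) -> bool:
    """Return True if every character of task_id is allowed by the quoted rule."""
    return TASK_ID.fullmatch(task_id) is not None


# Spaces are not allowed, so the name from the message above is rejected.
print(is_valid_task_id("THIS IS THE NAME I TYPED"))

# Replacing the spaces with allowed characters gives an acceptable ID.
print(is_valid_task_id("THIS_IS_THE_NAME_I_TYPED"))
```

Under this reading, a readable name such as 'WAL SHIPPING DELAY' cannot be used as a Task ID as-is, which fits the builder generating a separate internal ID.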

However, if you use the Alert Rule builder, it sets the name variable and uses that as the display name. At this point I think a task ID like chronograf-v1-30ac70ed-e070-4919-b21b-e375ebf81a1b is automatically generated. That internal name happens to be used as the display name sometimes (e.g. after you edit the script and save it), but not always.

Hope this provides more evidence about where the problem is...
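
A small Python sketch for spotting affected tasks, i.e. entries whose displayed name is just an internal ID of this shape. The pattern is inferred only from the example ID in this thread (the prefix chronograf-v1- followed by a lowercase hyphenated UUID), so treat it as an assumption; is_generated_id is a hypothetical helper.

```python
import re

# Assumption: generated IDs look like the example in this thread, i.e.
# "chronograf-v1-" followed by a lowercase UUID in 8-4-4-4-12 form.
GENERATED_ID = re.compile(
    r"chronograf-v1-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def is_generated_id(displayed_name: str) -> bool:
    """Return True if the displayed name looks like an auto-generated task ID."""
    return GENERATED_ID.fullmatch(displayed_name) is not None


names = [
    "WAL SHIPPING DELAY",
    "chronograf-v1-30ac70ed-e070-4919-b21b-e375ebf81a1b",
]

# Keep only the entries that have lost their human-readable name.
print([n for n in names if is_generated_id(n)])
```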

DarthSidious-DBA commented 4 years ago

@russorat I am also facing the same problem (1.8.2).

It is impossible to tell the different alerts (more than 20) apart when they have this kind of TICK name. That's frustrating. I guess the name changed from the readable form to the other notation when I used a string function in the WHERE clause.

Hope it will be fixed soon. Thanks in advance.

tryadelion commented 4 years ago

Still reproducible.

russorat commented 4 years ago

We have this scheduled for the next maintenance release, which should be out by the end of Nov, maybe sooner.

stefanhorning commented 3 years ago

After not running into this for quite some time, I can now reproduce it again with Chronograf version 1.8.5.

I was somehow stuck on this version, but just realized that's because the newer versions no longer make it to the Ubuntu apt repo.

sranka commented 3 years ago

@stefanhorning ... this has been fixed since 1.8.8

a-vogel-tappert commented 3 years ago

Now works fine for me with version 1.8.8 of Chronograf