nyurik / kibana-vega-vis

This Kibana plugin allows any data visualizations from Elastic Search and other data sources using Vega grammar. You can even create a visualization on top of an interactive map.
Apache License 2.0

[Question]: Dynamic Query DSL formatting using kibanaAddFilter() #96

Open tforest opened 5 years ago

tforest commented 5 years ago

Hello, and thank you for your great work bringing Vega into Kibana.

I am trying to filter data dynamically based on a brush selection on a heat map. I am using Vega 4.3.0 with Kibana 7.1.1.

As requested for Kibana/Vega-related issues, here is my Vega code: vega_heatmap.txt. The full dataset contains 10,989 entries, so here is a short data sample that is still fairly representative of the overall structure: data_sample.txt.

Here is what it looks like (screenshot: Screenshot_2019-06-26 heatmap_vega_interactif_experimental - Kibana).

I can successfully apply a filter with kibanaAddFilter() on ordinal qualitative values using match_phrase queries, like so:

    {
      name: brushFilter
      update: {
        bool: {
          should: [
            {
              match_phrase: {gene_id: "1563032_PM_at"}
            }
            {
              match_phrase: {gene_id: "1556670_PM_at"}
            }
            {
              match_phrase: {gene_id: "1563108_PM_at"}
            }
          ]
          minimum_should_match: 1
        }
      }
    }

This gives the following behavior in Kibana (screenshot: Capture_1561560814).

But as you can see, for now I used a static query expression (for testing). What I want is to build the query dynamically from the selection and pass it to kibanaAddFilter().

I can access the lists of selected values along the X and Y axes using _data("brushstore")[0].intervals[0].extent and _data("brushstore")[0].intervals[1].extent respectively.
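For reference, the target query can be built mechanically from a list of selected values. The following is a minimal Python sketch, outside of Vega, meant only to show the structure that would have to reach kibanaAddFilter(). The function name build_should_query and the sample selection are illustrative assumptions.

```python
from __future__ import annotations

from typing import Any


def build_should_query(field: str, values: list[str]) -> dict[str, Any]:
    """Build a bool/should query with one match_phrase clause per value."""
    clauses = [{"match_phrase": {field: value}} for value in values]
    return {"bool": {"should": clauses, "minimum_should_match": 1}}


# Hypothetical selection, as it might come from the brush extent on the X axis.
selected = ["1563032_PM_at", "1556670_PM_at", "1563108_PM_at"]
query = build_should_query("gene_id", selected)
```

The result has the same shape as the static query above, with one clause per selected value.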

At first, I thought of using a flatten transformation to reshape the array into the proper Elasticsearch query structure, but I don't think Vega transformations are meant to be used that way. Aren't they supposed to be applied when data is loaded (under the data[ ] structure), or after all data is loaded, through a separate transform[ ] section?

If this is not possible, I also thought of performing a transform on all the data and then reducing the list by matching on the elements of the two arrays I mentioned earlier. However, this may be quite burdensome just to reduce the selection afterwards.
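That second idea amounts to a membership test against the two extent arrays. A rough Python sketch of the reduction, assuming each row carries the gene_id field from the query above plus a hypothetical sample_id field for the Y axis (the real Y field name is not given here):

```python
from __future__ import annotations


def reduce_to_selection(
    rows: list[dict], x_selected: list[str], y_selected: list[str]
) -> list[dict]:
    """Keep only rows whose X and Y values both fall inside the brush selection."""
    x_set, y_set = set(x_selected), set(y_selected)  # sets give constant-time lookups
    return [r for r in rows if r["gene_id"] in x_set and r["sample_id"] in y_set]


# Made-up rows; "sample_id" is an assumed name for the Y-axis field.
rows = [
    {"gene_id": "1563032_PM_at", "sample_id": "S1"},
    {"gene_id": "1556670_PM_at", "sample_id": "S2"},
    {"gene_id": "1563108_PM_at", "sample_id": "S1"},
]
kept = reduce_to_selection(rows, ["1563032_PM_at", "1563108_PM_at"], ["S1"])
```

With sets, the reduction is a single linear pass over the rows.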

So, the purpose of this issue is to ask for some suggestions to deal with this situation.

Thanks a lot!

ibizaman commented 2 years ago

Here's how I made it work. I tried to adapt what I did to your data set but there could be some mismatch. For example, I could be wrong on how to get to the correct field in a datum. But anyway, the general idea is good.

Btw I'm using https://vega.github.io/schema/vega/v5.json.

  1. Make the x axis labels clickable
 axes: [
   {
     scale: x
     orient: bottom
     grid: false
     title: Gene ID
     labelOverlap: true
     encode: {
       labels: {
+        name: "geneid_axis_label"
+        interactive: true
+        enter: {
+          cursor: {value: "pointer"}
+        }
         update: {
           angle: {value: 270}
           align: {value: "right"}
           baseline: {value: "middle"}
         }
       }
     }
     zindex: 1
   }
 ]

The name is needed because we'll reference these labels from a signal later on. The enter.cursor field is optional, but IMO it's a nice visual hint that the label is clickable.

  2. Create a signal for when a label is clicked
 signals: [
+  {
+    name: geneid_clicked
+    value: ""
+    on: [
+      {
+        events: @geneid_axis_label:click
+        update: datum.value
+      }
+    ]
+  }
 ]
  3. Store the clicked labels in an array
 data: [
+  {
+    name: geneid_selected
+    on: [
+      {
+        trigger: geneid_clicked
+        insert: geneid_clicked
+      }
+    ]
+  }
 ]

Note that you can end up with duplicates in the array named geneid_selected. It's not too bad but it would be nice if entries could be deduplicated. Also, there's probably a way to "unclick" a label but I didn't spend too much time on this. Instead, I'm doing the next step.
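Both limitations come down to the same rule: clicking a label should add it if it is absent and remove it if it is present. As an illustration of that rule in plain Python rather than a Vega spec (the function name toggle is an assumption of this sketch):

```python
from __future__ import annotations


def toggle(selected: list[str], clicked: str) -> list[str]:
    """Return a new selection with `clicked` added if absent, removed if present."""
    if clicked in selected:
        return [value for value in selected if value != clicked]
    return selected + [clicked]


# Simulate three clicks; the first label is clicked twice.
selection: list[str] = []
for label in ["1563032_PM_at", "1556670_PM_at", "1563032_PM_at"]:
    selection = toggle(selection, label)
```

Vega's data triggers also document a toggle property alongside insert and remove, which might express this directly in the spec. That is untested here.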

  4. Allow clearing the selected labels by clicking on an empty area
 signals: [
+  {
+    name: selection_cleared
+    value: true
+    on: [
+      {
+        events: mouseup[!event.item]
+        update: "true"
+        force: true
+      }
+    ]
+  }
 ]
 data: [
   {
     name: geneid_selected
     on: [
+      {
+        trigger: selection_cleared
+        remove: true
+      }
       {
         trigger: geneid_clicked
         insert: geneid_clicked
       }
     ]
   }
 ]

selection_cleared fires when you release the mouse over an empty area, that is, anywhere that is not a mark item such as a label.

  5. Show which labels are selected by coloring them
 axes: [
   {
     scale: x
     orient: bottom
     grid: false
     title: Gene ID
     labelOverlap: true
     encode: {
       labels: {
         name: "geneid_axis_label"
         interactive: true
         enter: {
           cursor: {value: "pointer"}
         }
         update: {
           angle: {value: 270}
           align: {value: "right"}
           baseline: {value: "middle"}
+          fill: [
+            {
+              test: length(data('geneid_selected')) > 0 && indata('geneid_selected', 'data', datum.value)
+              value: "blue"
+            }
+            {
+              value: "black"
+            }
+          ]
         }
+        hover: {
+          fill: {value: "blue"}
+        }
       }
     }
     zindex: 1
   }
 ]
  6. Prepare a second array for the Kibana filter

I didn't find a way to transform the existing geneid_selected array in the way needed for ES queries, so instead I prepare a second one.

 data: [
+  {
+    name: geneid_selected_ES_query
+    on: [
+      {
+        trigger: selection_cleared
+        remove: true
+      }
+      {
+        trigger: geneid_clicked
+        insert: '''
+        {
+          "match_phrase": {
+            "gene_id": geneid_clicked
+          }
+        }
+        '''
+      }
+    ]
+  }
 ]
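The two arrays hold the same information in different shapes, so each entry of the second is a per-element mapping of the first. A small Python model of the click and clear triggers, only to illustrate how both arrays stay in sync (the class and method names are made up for this sketch):

```python
from __future__ import annotations


class Selection:
    """Mirror of the two Vega data sets: raw values and ES query clauses."""

    def __init__(self) -> None:
        self.geneid_selected: list[str] = []
        self.geneid_selected_es_query: list[dict] = []

    def click(self, gene_id: str) -> None:
        # Corresponds to the geneid_clicked trigger: insert into both arrays.
        self.geneid_selected.append(gene_id)
        self.geneid_selected_es_query.append({"match_phrase": {"gene_id": gene_id}})

    def clear(self) -> None:
        # Corresponds to the selection_cleared trigger: empty both arrays.
        self.geneid_selected.clear()
        self.geneid_selected_es_query.clear()


state = Selection()
state.click("1563032_PM_at")
state.click("1556670_PM_at")
```

Because both arrays react to the same two triggers, they always have the same length and order.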
  7. Add a text mark that will be used as a button to apply the Kibana filters
 marks: [
+  {
+    name: apply_filter
+    type: text
+    interactive: true
+    encode: {
+      enter: {
+        cursor: {value: "pointer"}
+        x: {value: -110}
+        dx: {value: -3}
+        y: {value: 13}
+        align: {value: "right"}
+        text: {value: "Apply Filter"}
+      }
+    }
+  }
 ]

Of course, the location of the text will need to be adjusted to your specific case. I ended up needing to add some padding to the root view.

  8. React to clicks on the apply_filter mark and call the kibanaAddFilter function
signals: [
+  {
+    name: apply_engine_filter_clicked
+    on: [
+      {
+        events: @apply_filter:click
+        update: '''
+        length(data('geneid_selected_ES_query')) == 0 ? null :
+        kibanaAddFilter({
+          'bool': {
+            'should': data('geneid_selected_ES_query'),
+            'minimum_should_match': 1
+          }
+        })
+        '''
+      }
+    ]
+  }
 ]

To recap: Make what you want to filter clickable, store the values in an array, add a button to set the Kibana filter.

This is easily extendable if you want to filter by the Y axis too. Or you can even make the datapoints selectable and filter by that if that makes sense for your dataset. You need to make a copy of nearly everything but it's doable.
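For the two-axis case, one way to combine the filters is to require a match in each axis group. A hedged Python sketch of the resulting query shape, assuming a hypothetical sample_id field for the Y axis. The bool, must, should and match_phrase keys are standard Elasticsearch Query DSL, while the helper names any_of and both_axes are illustrative:

```python
from __future__ import annotations

from typing import Any


def any_of(field: str, values: list[str]) -> dict[str, Any]:
    """One bool/should group: the document must match at least one value."""
    clauses = [{"match_phrase": {field: v}} for v in values]
    return {"bool": {"should": clauses, "minimum_should_match": 1}}


def both_axes(x_values: list[str], y_values: list[str]) -> dict[str, Any]:
    """Require a hit in the X group AND a hit in the Y group."""
    return {
        "bool": {
            "must": [any_of("gene_id", x_values), any_of("sample_id", y_values)]
        }
    }


# "sample_id" and the S1/S2 values are assumptions for the Y axis.
query = both_axes(["1563032_PM_at"], ["S1", "S2"])
```

In the Vega spec this would mean a second pair of arrays for the Y axis and a kibanaAddFilter call that nests both should groups under must.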

One debugging trick is the warn() function. You can wrap just about any expression in it and Kibana will display the value. It's super hacky and not nice to work with, but it's sometimes the easiest way to debug.