splunk / splunk-tableau-wdc

Splunk Tableau Web Data Connector (WDC) Example
Apache License 2.0

Allow incremental extract on user specified field #7

Closed hangrymuppet closed 4 years ago

hangrymuppet commented 5 years ago

The connector does not currently support Tableau incremental extracts from Splunk. As I see it, this is the only functionality splunk-tableau-wdc is missing before the web data connector can cover every user scenario as a replacement for the Splunk ODBC driver.

To use incremental extracts, however, the connector needs to know which field to increment on; in most use cases that is _time.

I tried hard-coding it in a fork of the project, but it did not work. Not being very familiar with JavaScript, I am not sure whether I am doing something wrong.

My modifications were made in src/splunk-wdc.js:

var tableInfo = {
        id: "splunkFeed",
        alias: cName, // "Splunk Feed Test",
        incrementColumnId: "_time", // added line
        columns: cols
};

mayurah commented 5 years ago

@bajajh - Thanks for bringing this up!

Wanted to know your thoughts on these:

Should I also expose this as an option in the UI, so the user can choose whether to enable incremental updates? Do you think this feature is useful for all Tableau users or only a few?

I was also wondering whether we could use a unique field other than _time, since more than one event in Splunk can share the same _time value (at seconds precision).

hangrymuppet commented 5 years ago

Thanks @mayurah for following up.

I think it should definitely be a user option in the UI, with a form field to specify which Splunk field to increment on. If it were implemented smartly, since Splunk already knows which fields are available in the query, the tool could potentially query the list of fields and let the user select one from a drop-down.

Tableau doesn't care which field is used as long as the type can be monotonically increasing. It performs no validation that it truly is.
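Since Tableau does no validation of its own, the connector could at least check that the chosen field exists before putting it in the schema. A minimal sketch, assuming the schema object from src/splunk-wdc.js shown earlier; `buildTableInfo` and its `incrementField` parameter are hypothetical names, not code from the project:

```javascript
// Hypothetical helper: builds the WDC table schema and adds incrementColumnId
// only when the user supplied a field that is one of the result columns.
function buildTableInfo(alias, cols, incrementField) {
  var tableInfo = {
    id: "splunkFeed",
    alias: alias,
    columns: cols
  };
  var field = (incrementField || "").trim();
  if (field) {
    // Tableau will not check this for us, so fail early on an unknown field.
    var known = cols.some(function (c) { return c.id === field; });
    if (!known) {
      throw new Error("Increment field '" + field + "' is not in the result columns");
    }
    tableInfo.incrementColumnId = field;
  }
  return tableInfo;
}
```

Leaving the form field empty keeps the current behavior (a full extract), so the option stays opt-in.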

From observation, I believe that Tableau modifies the query to append | where _time > max(_time). The result of max(_time) is obtained from the extract and substituted before the query is sent to Splunk. And I am using _time here as an example... it can be any field that satisfies the requirement.
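For a web data connector, that filtering step may have to happen in the connector itself rather than in Tableau. A minimal sketch of the query rewrite described above, assuming the last extracted value is handed to the connector (in the WDC 2.x API this is, as far as I know, `table.incrementValue` inside `getData`). `buildIncrementalSearch` is a hypothetical helper, and it assumes the value is in a form Splunk can compare directly, for example an epoch number for _time:

```javascript
// Hypothetical helper: appends an incremental filter to a Splunk search.
// lastValue stands in for the highest value already present in the extract.
function buildIncrementalSearch(baseSearch, incrementField, lastValue) {
  if (!incrementField || lastValue === undefined || lastValue === null || lastValue === "") {
    return baseSearch; // nothing extracted yet: run the full search
  }
  var text = String(lastValue);
  var isNumeric = typeof lastValue === "number" || /^-?\d+(\.\d+)?$/.test(text);
  // Quote and escape non-numeric values so they form a valid string literal.
  var literal = isNumeric
    ? text
    : '"' + text.replace(/\\/g, "\\\\").replace(/"/g, '\\"') + '"';
  // Single quotes mark a field name in a where expression.
  return baseSearch + " | where '" + incrementField + "' > " + literal;
}
```

With a strict `>` comparison, events that share the last extracted value but arrive later would be skipped, which is the concern raised earlier about _time having only seconds precision.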

hangrymuppet commented 5 years ago

Hi @mayurah,

Do you have an update on this request?

Thanks!

actionpotato commented 5 years ago

I too would be extremely interested in this request.

Thanks!

mayurah commented 5 years ago

We will look into it sometime this weekend, feel free to PR!

mayurah commented 5 years ago

Commit c7c2e07 lets the user define a custom field for incremental refresh.

I will mark this as closed after a week, unless there are any issues or concerns.

hangrymuppet commented 5 years ago

Thanks @mayurah

Did you get a chance to test incremental refresh with Tableau Desktop / Server on an extract?

I'll have a go at it this week and post my results.

mayurah commented 5 years ago

Not really; this change only adds the field to the UI and to the tableInfo schema. Validation for different use cases and scenarios is yet to be done.

Hope to see tests for this from you folks!

actionpotato commented 5 years ago

This is looking really good so far. Still having our users test it. One caveat (not really an issue) is that I found that I needed to rename fields to have the WDC accept them.

Here is the test search I was using:

index=_audit action=search search=* user=* NOT user=splunk-system-user  earliest=-1h 
| rex field=search "index\s*=\s*\"*(?<indexname>[^\s\"]+)"  
| search indexname="*" 
| stats count by indexname user 
| rename count as searches
| stats list(indexname) by user searches

Because "stats list(indexname)" returns a field name containing parentheses, I needed to modify the search as follows:

  | rename list(indexname) as indexname

Not something I would consider an issue, but something for people who are migrating current searches to be aware of.
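The rename could in principle be done by the connector. A minimal sketch, assuming WDC column ids may contain only letters, digits and underscores (which matches my reading of the WDC schema documentation); `toColumnId` is a hypothetical helper, and unlike the manual rename above it would produce `list_indexname` rather than `indexname`:

```javascript
// Hypothetical helper: derives a WDC-safe column id from a Splunk field name
// and keeps the original name as the human-readable alias.
function toColumnId(fieldName) {
  var name = String(fieldName);
  // Collapse every run of disallowed characters into a single underscore.
  var id = name.replace(/[^A-Za-z0-9_]+/g, "_");
  // Drop the underscore produced by a trailing disallowed character, e.g. ")".
  if (/[^A-Za-z0-9_]$/.test(name)) {
    id = id.slice(0, -1);
  }
  return { id: id || "field", alias: name };
}
```

Two different Splunk fields could map to the same id this way, so a real implementation would also need to de-duplicate the results.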

mayurah commented 4 years ago

Thanks @actionpotato, I will create an issue tagged as a feature request to add auto-rename!

I will close this, as the incremental refresh issue is fixed.