Bre77 / TA-googlebigquery

https://splunkbase.splunk.com/app/5692/

Improve README. #25

Closed by daaain 2 years ago

daaain commented 2 years ago

Fixes some small typos and adds a few clarifying bits I had trouble with.

Bre77 commented 2 years ago

So I thought TIMESTAMP fields would get handled correctly by converting them to strings:

checkpoint_next = max(checkpoint_next, str(row[checkpoint_field]))

But obviously you have an experience where that's not true, so I'll take your word for it.

Thanks for your contribution.
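For readers following along, the string-based checkpoint logic quoted above can be sketched roughly as follows. This is a minimal illustration, not the add-on's actual code: the function name `next_checkpoint`, the sample rows, and the assumption that TIMESTAMP values arrive as timezone-aware `datetime` objects are all hypothetical.

```python
from datetime import datetime, timezone


def next_checkpoint(rows, checkpoint_field, checkpoint="0"):
    """Return the largest checkpoint seen, comparing values as strings.

    Hypothetical helper: mirrors the max(..., str(...)) line quoted above,
    starting from the default checkpoint "0".
    """
    checkpoint_next = checkpoint
    for row in rows:
        # str() on a datetime gives e.g. "2021-05-02 08:30:00+00:00", which
        # sorts chronologically as text when all values share one format.
        checkpoint_next = max(checkpoint_next, str(row[checkpoint_field]))
    return checkpoint_next


# Sample rows (made up for illustration).
rows = [
    {"ts": datetime(2021, 5, 1, 12, 0, tzinfo=timezone.utc)},
    {"ts": datetime(2021, 5, 2, 8, 30, tzinfo=timezone.utc)},
]
latest = next_checkpoint(rows, "ts")
print(latest)
```

The comparison itself works; the trouble discussed in this thread is with the initial "0", which is not a valid TIMESTAMP once it is substituted into a query.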

daaain commented 2 years ago

> So I thought TIMESTAMP fields would get handled correctly by converting them to strings: checkpoint_next = max(checkpoint_next, str(row[checkpoint_field]))

Yes, it can be made to work, but the query needs a fair bit of defensive coding to replace the default "0":

SELECT * FROM `table`
WHERE timestamp_column > IFNULL(SAFE_CAST("%checkpoint%" as TIMESTAMP), TIMESTAMP_MILLIS(0)) 

See https://github.com/Bre77/TA-googlebigquery/issues/24
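To make the failure mode concrete, here is a rough sketch of what the query text looks like on the first run and on later runs. It assumes the `%checkpoint%` placeholder is filled in by plain text substitution and that the default value is "0", as described above; `render_query` is a hypothetical helper, not a function from the add-on.

```python
QUERY_TEMPLATE = (
    "SELECT * FROM `table`\n"
    'WHERE timestamp_column > IFNULL(SAFE_CAST("%checkpoint%" as TIMESTAMP), '
    "TIMESTAMP_MILLIS(0))"
)


def render_query(template, checkpoint="0"):
    """Substitute the checkpoint into the query text (assumed behaviour)."""
    return template.replace("%checkpoint%", checkpoint)


# First run: no saved checkpoint, so the default "0" is substituted.
# SAFE_CAST("0" as TIMESTAMP) yields NULL instead of an error, and IFNULL
# then falls back to the epoch via TIMESTAMP_MILLIS(0).
first_run = render_query(QUERY_TEMPLATE)

# Later runs: the saved string casts cleanly and is used as the lower bound.
later_run = render_query(QUERY_TEMPLATE, "2021-05-02 08:30:00+00:00")

print(first_run)
print(later_run)
```

Without the `SAFE_CAST` and `IFNULL` wrapper, the first-run query would compare a TIMESTAMP column against the string "0".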

daaain commented 2 years ago

Actually, maybe a better option would be not applying the checkpoint in the first run when there isn't a file yet? And then saving whatever value came back from the first successful query. I think that could be much more robust and prevent these errors where the default checkpoint value fails the query (which happened to me a lot).

I can submit a PR for that if you agree?
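A rough sketch of that idea, under stated assumptions: the checkpoint lives in a plain text file, a missing file means "first run", and the filter is appended to a base query only when a checkpoint exists. All names here (`load_checkpoint`, `build_query`, `save_checkpoint`) are hypothetical and only illustrate the proposal; how the add-on would actually drop the checkpoint from a user-supplied query is not settled in this thread.

```python
import os


def load_checkpoint(path):
    """Return the saved checkpoint string, or None when no file exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip() or None


def build_query(base_query, checkpoint_column, checkpoint):
    """Append the checkpoint filter only when a checkpoint is available."""
    if checkpoint is None:
        # First run: no filter at all, so no placeholder value can break it.
        return base_query
    return f'{base_query} WHERE {checkpoint_column} > TIMESTAMP("{checkpoint}")'


def save_checkpoint(path, rows, checkpoint_field, previous=None):
    """Persist the largest checkpoint value seen, compared as strings."""
    values = [str(row[checkpoint_field]) for row in rows]
    if previous is not None:
        values.append(previous)
    if not values:
        # Nothing came back and nothing was saved before: stay in "first run".
        return previous
    newest = max(values)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(newest)
    return newest
```

With this shape, the first successful query writes a real value to the file, and every later run filters on a string that is known to be a valid timestamp.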

Bre77 commented 2 years ago

Oh right, so rather than defaulting to 0, remove the checkpoint from the query. That makes sense. I'm cool if you want to do a PR, otherwise I'll try to do it today.