snowplow / snowplow-golang-analytics-sdk

Golang Analytics SDK for working with Snowplow enriched events in cloud functions and other Go applications.
https://snowplowanalytics.com/
Apache License 2.0
5 stars 1 forks source link

Add analytics sdk 0.1.0 #2

Closed colmsnowplow closed 3 years ago

colmsnowplow commented 3 years ago

Some to-do's left as inline comments. I can move them to issues before release if we want to do that, or if we're happy to leave them in-line for early releases I'm happy with that also.

Design decisions documented in comments to make reviewing easier

A couple of things I'm still undecided upon:

I'll still be tinkering with things in a separate branch, will merge it into this PR if I have improvements ready to add, but otherwise I think this is good enough for a beta/prototype, and we can release without including those experiments (with a view to making improvements in a subsequent release).

paulboocock commented 3 years ago

Flicking through the recent changes on my phone, what happens when a Timestamp is missing in the output (or any other non-required field that could be null in tp2)?

As an example, let's say I send an event in which has no true_timestamp. Does the resulting json contain the key with the value of nil or is the entire key value pair missing in the output json?

What do the other analytics sdks do in this scenario? The two options are always have 131 json keys but some have null/nil values or only have the number of json keys which have values. I quite like the consistency of always have 131 keys so you can always guarantee you can read it but then that'll be a lot of nulls for many events.

colmsnowplow commented 3 years ago

I will confirm that this is actually what this sdk does (esp after changing the timestamp output), but the desired behaviour is that the key doesn't exist at all. This is what the other sdks produce. At least, this is the behaviour when the input value is empty.

IMO it's better to output a smaller object of like 20 keys than a large 131 key one with loads of empty values, but I take the point on the consistency. It's pretty easy to check for a key in Go at least with

value, ok := object["key"]

Perhaps including null fields should be an optional feature we consider, rather than a change to default behaviour?

paulboocock commented 3 years ago

That makes sense, I prefer the smaller output particularly when writing to other systems and thinking about storage and transfer. It's always easy to check for key existance. This was just a thought that popped into my head whilst reading through the changes as it differs from how you'd typically build a web API where you'd usually get a consistent response from a server but that's not what we're building here 😅

paulboocock commented 3 years ago

And one final comment, it'd be good to add a GitHub action which runs tests on pushes. You can probably steal most (all?) of the sql-runner one: https://github.com/snowplow/sql-runner/blob/master/.github/workflows/build.yml

colmsnowplow commented 3 years ago

OK GH action to test added. No makefile or dockerisation, that all feels like overkill but if we need it I can add it.

Also we now have a ParseEvent() method and the Get/To functions are now methods on the ParsedEvent type, which ParseEvent returns.

My only niggling uncertainty is that I was hoping to abstract away all the length checks, but since technically ParseEvent doesn't have to be called, there's no codified guarantee that the ParsedEvent object has a length of 131.

So, the Get* methods all check length manually, and mapifyGoodEvent() (which is called by all the others) does so too.

colmsnowplow commented 3 years ago

Additionally: Feels like code-wise we're close to looking good, so I will find some time to create a README and clean this up before release. Might want to hold off on a final pass/second reviewer until I've done that to save us some effort. :)

colmsnowplow commented 3 years ago

Yeah for sure. TBH I'd be more than happy to leave it released as beta until we're close to ready to release the stream-replicator integration.

colmsnowplow commented 3 years ago

Closing in favour of https://github.com/snowplow/snowplow-golang-analytics-sdk/pull/3