BeeswaxIO / beeswax-api

Beeswax API for custom bidders
Apache License 2.0

Beeswax batch Logs for wins, bids, and auctions do not reflect currently shipped logs. #31

Closed jfratzke closed 6 years ago

jfratzke commented 6 years ago

The field "matched_user_groups" was tacked onto each of these log models a few weeks ago with no PR or notification to downstream customers. Please ensure that a PR is created and that master accurately reflects the shipped model, as we have no way of validating the incoming data and applying types due to the CSV format the logs are shipped in.

pswaminathan commented 6 years ago

Apologies! I'm sending a PR with updated headers shortly.

Note as well that for data type information, we keep a public spreadsheet of header definitions and data types here. This should help you identify the data types.

Additionally, our policy for log files has been that adding a new column is not considered a breaking change. You should configure your ingest processes to discard extra columns.
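As a rough illustration of discarding extra columns at ingest, here is a minimal Python sketch. It rests on assumptions not stated in this thread: that the log files carry no header row, that new columns are appended at the end, and that the column names in `EXPECTED_COLUMNS` are hypothetical stand-ins for the header files in this repository.

```python
import csv
import io

# Hypothetical subset of column names, for illustration only; the real
# list would come from the header files in the repository.
EXPECTED_COLUMNS = ["auction_id", "bid_time", "domain"]


def read_known_columns(text, expected=EXPECTED_COLUMNS):
    """Yield one dict per row, keeping only the first len(expected) fields.

    Assumes new columns are appended at the end of each row, so any
    trailing fields beyond the expected ones are silently dropped.
    """
    for row in csv.reader(io.StringIO(text)):
        # zip stops at the shorter sequence, which discards extra fields.
        yield dict(zip(expected, row))


# A row with one extra trailing column (e.g. a newly added field).
sample = "auction1,2018-03-15 18:59:47.513,www.spanishdict.com,extra_value\n"
rows = list(read_known_columns(sample))
```

Here `rows` holds a single dict with only the three expected keys; the fourth field is ignored rather than causing the row to be rejected.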

jfratzke commented 6 years ago

@pswaminathan, it's only a breaking change because the columns in the provided CSVs are not properly escaped. Here's an example row we received that caused the pipeline to break:

,,,,auction1,2018-03-15 18:59:47.513,IAB5,,,,www.spanishdict.com,WEB,6090785,CAN,124933,CAN/BC,V7L,,PUBMATIC,,someip2,someip1,pm/800422,,Chrome,64,,,,,,,,-1,Chrome - Windows,NA,PC,true,Windows,7,,,pm/156687,,6899,beeswaxid1,,,,,,false,BANNER,49.3484993,-123.098,,,,,300,250,,,,,,IP,1521154787513276,"http://www.spanishdict.com/translate/""por supuesto, de nada""",,,"EXPANDABLE_AUTOMATIC,AUDIO_USER_INITIATED,ANNOYING,AUDIO_AUTO_PLAY,VIDEO_IN_BANNER_USER_INITIATED,VIDEO_IN_BANNER_AUTO_PLAY,WINDOWS_DIALOG_OR_ALERT_STYLE,PROVOCATIVE_OR_SUGGESTIVE,POP_UP,AD_CAN_EXPAND_LEFT,AD_CAN_EXPAND_RIGHT,AD_CAN_EXPAND_UP,ADCAN",B9A45442-5841-40D0-866E-CA2FBCEB8160,,"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.186 Safari/537.36",0,USD,,,,,BEESWAX,,pm/155959,AUTH_DIRECT,

If you look at the URL, there are extra quotes that aren't escaped properly. This row is therefore malformed. We can alleviate this by setting DROPMALFORMED to true; however, when additional columns are added, all rows then qualify as malformed.
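Whether a parser accepts this row depends on the quoting convention it expects. The URL field escapes each embedded quote by doubling it, which is the convention described in RFC 4180; a parser configured for a different escape character (a backslash, for example) may reject or misread the same field. The sketch below uses Python's standard `csv` module, which handles doubled quotes by default, on a shortened excerpt of the row above. The excerpt and the variable names are for illustration only.

```python
import csv
import io

# Shortened excerpt of the example row: two plain fields, the quoted URL
# field with doubled embedded quotes, then two empty fields.
line = (
    'IP,1521154787513276,'
    '"http://www.spanishdict.com/translate/""por supuesto, de nada""",,\n'
)

# csv.reader uses doublequote=True by default, so "" inside a quoted
# field is read as one literal quote character.
fields = next(csv.reader(io.StringIO(line)))
url = fields[2]
```

With this convention the excerpt yields five fields, and `url` keeps both the embedded quotes and the embedded comma as part of a single value.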