Another example
where tmp_raw_request_json != '{\"date\":\"20180301\",\"dm\":\"Unknown\",\"bf\":\"Asynchronous HTTP Client\",\"pr1qt\":\"1\",\"ua\":\"AHC\/2.1\",\"tid\":\"UA-98835143-2\",\"cd96\":\"2018\",\"uid\":\";;;null;null;null;null;;;null;null;null\",\"cd94\":\"Thursday\",\"pr1ca\":\"Smartlink\",\"cd95\":\"03\",\"pr1id\":\"235\",\"hour\":\"02\",\"cd92\":\"02:13:17.918\",\"cd93\":\"01\",\"cd91\":\"Europe\/Madrid\",\"bv\":\"0.0\",\"uip\":\"199.80.53.212\",\"of\":\"Unknown\",\"ea\":\"Conversion\",\"ec\":\"Conversion\",\"pr1br\":\"Mobidea (AD)\",\"hitId\":\"18878b34-b7b7-48dd-9da9-b52f87796dec\",\"cm\":\"Blind\",\"cn\":\"Mondia|Smartlink (Mainstream Adult)\",\"ov\":\"Unknown\",\"ta\":\"Mobidea (AD)\",\"pr1pr\":\"0.022966538573532\",\"minute\":\"13\",\"pr1nm\":\"All inclusive Mainstream_CPI\",\"cs\":\"Mondia\",\"pa\":\"purchase\",\"istc\":false,\"db\":\"Unknown\",\"dc\":\"Unknown\",\"t\":\"event\",\"ti\":\"eg9df2mwiss9\",\"v\":\"1\",\"time\":\"1519870397918\",\"tr\":\"0.022966538573532\",\"isb\":true}'
Can you please help us understand whether you are adding additional parameters to the payload before inserting it into BigQuery?
The general answer is yes. We add some parameters, mostly related to time and user agent. If you are interested, I can provide a detailed explanation, but I need time for it.
But in the provided examples I don't see any value in your calls that would uniquely identify the matching records in the BigQuery table, so I am not sure that you found the correct matching records. You can try adding any parameter that is not used yet to your call to the /collect URL and then find the matching record by this parameter (it will be copied to the tmp_raw_request_json column as is).
I can do it for you tomorrow and provide the results with more explanations. Today I need to do some urgent things, sorry.
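A minimal sketch of the marker suggestion above, in Java. The parameter name `debug_marker` and the helper names are hypothetical; any parameter name not already used in the call would do. The filter is shown only as a `WHERE` condition on `tmp_raw_request_json`, because the table name is not given in this thread. It matches on the marker value alone, so it does not depend on how the JSON is escaped in the column.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

final class CollectMarker {
    // Hypothetical parameter name (an assumption, not part of the /collect API).
    static final String MARKER_PARAM = "debug_marker";

    /** Appends the marker parameter to a /collect URL, with or without an existing query string. */
    static String withMarker(String collectUrl, String marker) {
        String separator = collectUrl.contains("?") ? "&" : "?";
        return collectUrl + separator + MARKER_PARAM + "="
                + URLEncoder.encode(marker, StandardCharsets.UTF_8);
    }

    /** Builds a WHERE condition that matches rows whose raw JSON contains the marker value. */
    static String whereClause(String marker) {
        // The marker is expected to be a generated UUID, so it contains no quote characters.
        return "tmp_raw_request_json LIKE '%" + marker + "%'";
    }

    public static void main(String[] args) {
        String marker = UUID.randomUUID().toString();
        String url = withMarker(
                "https://sherlock-184721.appspot.com/collect?v=1&t=event&ec=Conversion&ea=Conversion",
                marker);
        System.out.println(url);                 // the call to send
        System.out.println(whereClause(marker)); // the condition to search with afterwards
    }
}
```

Because the marker is generated per call, each test request can be traced to exactly one stored record.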
Hi Alexander, thank you for your feedback. The reason for my question is that we found that some of the data points you are filling in are incorrect (e.g. ua). As mentioned before, we agreed some time ago that, instead of filling in the parameters missing from the conversion events by looking them up in the AdClick event, we would leave it to the other teams to do so.
Can you please confirm whether you still suggest not looking up the data points missing from the Conversion event in the previous AdClick event? If that is the case, then we need to remove data points such as the user agent from the conversion event.
Thanks, Adrian
Hello Adrian,
Can you please confirm whether you still suggest not looking up the data points missing from the Conversion event in the previous AdClick event?
No, we don't look up any values from the previous Hit records.
Nevertheless, if the "ua" parameter is omitted from the request, we fill it with the "User-Agent" header value from the HTTP request. I suppose that this makes sense for most requests, and we discussed it.
// --- userAgent & IP
tryToPutOnce(reqJson, "ua", req.getHeader("User-Agent"));
I don't know how it can confuse your other teams; they can just ignore these values if they are not relevant for such requests.
Another workaround is to pass the "ua" parameter to the /collect URL with some fake or stub value; we don't replace it if it exists.
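To illustrate the "fill only if absent" behaviour described above, here is a minimal Java sketch. The real `tryToPutOnce` implementation is not shown in this thread, so the body below is an assumption. It uses a plain `Map<String, String>` as a stand-in for `reqJson`, and skipping `null` values is also an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PutOnceSketch {
    /**
     * Stores value under key only when the key is not present yet.
     * Null values are skipped (assumed behaviour). Returns true if the value was stored.
     */
    static boolean tryToPutOnce(Map<String, String> reqJson, String key, String value) {
        if (value == null || reqJson.containsKey(key)) {
            return false;
        }
        reqJson.put(key, value);
        return true;
    }

    public static void main(String[] args) {
        // Case 1: the caller did not send "ua", so the header value is used.
        Map<String, String> withoutUa = new LinkedHashMap<>();
        tryToPutOnce(withoutUa, "ua", "AHC/2.1");
        System.out.println(withoutUa.get("ua")); // AHC/2.1

        // Case 2: the caller sent a stub value, so the header value is ignored.
        Map<String, String> withStub = new LinkedHashMap<>();
        withStub.put("ua", "stub");
        tryToPutOnce(withStub, "ua", "AHC/2.1");
        System.out.println(withStub.get("ua")); // stub
    }
}
```

Under these assumptions, sending any explicit "ua" value in the /collect call is enough to keep the server-side header value out of the stored record.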
Please also keep in mind that the logic (and thus the processing time) of the /collect servlet should be as simple as possible, because it is in some sense the bottleneck of the pipeline. So I prefer not to add additional checks and more complex logic here if possible.
The reason for my question is that we found that some of the data points you are filling in are incorrect (e.g. ua).
It would be more efficient if you provided some specific examples with explanations of which values seem incorrect and why your team thinks so.
In your example, practically all passed parameters are filled with placeholders like "$$...$$". Is it the real call to the /collect URL or just a template of it?
I mean that in order to actually find the matching Hit record in BigQuery and investigate it, I need to know the real values from your call to the /collect URL.
I need some specific examples with the real values passed in the parameters.
Please provide more details and we will find some suitable solution.
Thank you, Alexander Kolchin
We have some questions regarding the Conversion event.
When calling the endpoint, we call it like this (example):
https://sherlock-184721.appspot.com/collect?v=1&t=event&tid=UA-109728168-5&ec=Conversion&ea=Conversion&ta=$$ADVERTISER$$&tr=$$GROSS$$&pa=purchase&pr1id=$$CAMPAIGN_ID$$&pr1nm=$$CAMPAIGN$$&pr1ca=$$BANNER_NAME$$&pr1br=$$ADVERTISER$$&pr1pr=$$GROSS$$&pr1qt=1&cn=$$PUBLISHER_NAME$$|$$PLACEMENT_NAME$$&cs=$$PUBLISHER_NAME$$&cm=$$ZONE_NAME$$&cp.Event_ID=$$CUSTOM_PARAM(Event_ID)$$&uid=$$REMOTE_COUNTRY_NAME$$;$$REMOTE_REGION$$;$$REMOTE_CITY$$;$$MOBILE_CARRIER$$;$$DEVICE_VENDOR$$;$$OS_NAME$$;$$OS_VERSION$$;$$CLIENT_USER_AGENT$$;$$REMOTE_IP$$;$$CUSTOM_PARAM(cd11)$$;$$CUSTOM_PARAM(cd7)$$;$$CUSTOM_PARAM(cd10)$$&ti=$$TRANSACTION_ID$$
However, we have seen that in BigQuery there are multiple other parameters that we did not send in the payload:
tmp_raw_request_json != '{\"date\":\"20180307\",\"dm\":\"Unknown\",\"bf\":\"Asynchronous HTTP Client\",\"pr1qt\":\"1\",\"ua\":\"AHC\/2.1\",\"tid\":\"UA-98835143-2\",\"cd96\":\"2018\",\"uid\":\";;;null;null;null;null;;;null;null;null\",\"cd94\":\"Wednesday\",\"pr1ca\":\"Mobile Content\",\"cd95\":\"03\",\"pr1id\":\"321\",\"hour\":\"03\",\"cd92\":\"03:55:19.746\",\"cd93\":\"07\",\"cd91\":\"Europe\/Madrid\",\"bv\":\"0.0\",\"uip\":\"199.80.53.212\",\"of\":\"Unknown\",\"ea\":\"Conversion\",\"ec\":\"Conversion\",\"pr1br\":\"Kimia\",\"hitId\":\"81552b34-8739-4363-8181-091345a90eaa\",\"cm\":\"Blind\",\"cn\":\"Mondia|Smartlink (Mainstream Adult)\",\"ov\":\"Unknown\",\"ta\":\"Kimia\",\"pr1pr\":\"0.0312\",\"minute\":\"55\",\"pr1nm\":\"97885 - Tu Horoscopo\",\"cs\":\"Mondia\",\"pa\":\"purchase\",\"istc\":false,\"db\":\"Unknown\",\"dc\":\"Unknown\",\"t\":\"event\",\"ti\":\"h0mg7uqw0fby\",\"v\":\"1\",\"time\":\"1520394919746\",\"tr\":\"0.0312\",\"isb\":true}'
Can you please help us understand whether you are adding additional parameters to the payload before inserting it into BigQuery?
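One way to see which parameters were added on the server side is to compare the parameter names in the /collect call with the keys of the stored record. The Java sketch below assumes the stored keys have already been extracted from `tmp_raw_request_json` by some other means. The class and method names are hypothetical, and the query keys are not URL-decoded, which is sufficient for plain names like the ones in this thread.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

final class AddedKeys {
    /** Returns the parameter names found in the query string of the given URL. */
    static Set<String> queryKeys(String url) {
        Set<String> keys = new LinkedHashSet<>();
        int queryStart = url.indexOf('?');
        if (queryStart < 0) {
            return keys;
        }
        for (String pair : url.substring(queryStart + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            keys.add(eq < 0 ? pair : pair.substring(0, eq));
        }
        return keys;
    }

    /** Returns the stored keys that were not sent in the call, sorted alphabetically. */
    static Set<String> addedKeys(Set<String> storedKeys, String url) {
        Set<String> added = new TreeSet<>(storedKeys);
        added.removeAll(queryKeys(url));
        return added;
    }

    public static void main(String[] args) {
        Set<String> stored = Set.of("v", "t", "ua", "date", "hitId");
        String call = "https://sherlock-184721.appspot.com/collect?v=1&t=event";
        System.out.println(addedKeys(stored, call)); // [date, hitId, ua]
    }
}
```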