ngulam-ai / Sherlock


Conversion Events #28

Closed ngulamai closed 6 years ago

ngulamai commented 6 years ago

We have some questions regarding the Conversion event.

When calling the endpoint, we call it like this (example):

https://sherlock-184721.appspot.com/collect?v=1&t=event&tid=UA-109728168-5&ec=Conversion&ea=Conversion&ta=$$ADVERTISER$$&tr=$$GROSS$$&pa=purchase&pr1id=$$CAMPAIGN_ID$$&pr1nm=$$CAMPAIGN$$&pr1ca=$$BANNER_NAME$$&pr1br=$$ADVERTISER$$&pr1pr=$$GROSS$$&pr1qt=1&cn=$$PUBLISHER_NAME$$|$$PLACEMENT_NAME$$&cs=$$PUBLISHER_NAME$$&cm=$$ZONE_NAME$$&cp.Event_ID=$$CUSTOM_PARAM(Event_ID)$$&uid=$$REMOTE_COUNTRY_NAME$$;$$REMOTE_REGION$$;$$REMOTE_CITY$$;$$MOBILE_CARRIER$$;$$DEVICE_VENDOR$$;$$OS_NAME$$;$$OS_VERSION$$;$$CLIENT_USER_AGENT$$;$$REMOTE_IP$$;$$CUSTOM_PARAM(cd11)$$;$$CUSTOM_PARAM(cd7)$$;$$CUSTOM_PARAM(cd10)$$&ti=$$TRANSACTION_ID$$

However, we have seen that the records in BigQuery contain several parameters that we did not send in the payload:

tmp_raw_request_json != '{\"date\":\"20180307\",\"dm\":\"Unknown\",\"bf\":\"Asynchronous HTTP Client\",\"pr1qt\":\"1\",\"ua\":\"AHC\/2.1\",\"tid\":\"UA-98835143-2\",\"cd96\":\"2018\",\"uid\":\";;;null;null;null;null;;;null;null;null\",\"cd94\":\"Wednesday\",\"pr1ca\":\"Mobile Content\",\"cd95\":\"03\",\"pr1id\":\"321\",\"hour\":\"03\",\"cd92\":\"03:55:19.746\",\"cd93\":\"07\",\"cd91\":\"Europe\/Madrid\",\"bv\":\"0.0\",\"uip\":\"199.80.53.212\",\"of\":\"Unknown\",\"ea\":\"Conversion\",\"ec\":\"Conversion\",\"pr1br\":\"Kimia\",\"hitId\":\"81552b34-8739-4363-8181-091345a90eaa\",\"cm\":\"Blind\",\"cn\":\"Mondia|Smartlink (Mainstream Adult)\",\"ov\":\"Unknown\",\"ta\":\"Kimia\",\"pr1pr\":\"0.0312\",\"minute\":\"55\",\"pr1nm\":\"97885 - Tu Horoscopo\",\"cs\":\"Mondia\",\"pa\":\"purchase\",\"istc\":false,\"db\":\"Unknown\",\"dc\":\"Unknown\",\"t\":\"event\",\"ti\":\"h0mg7uqw0fby\",\"v\":\"1\",\"time\":\"1520394919746\",\"tr\":\"0.0312\",\"isb\":true}'

Can you please help us understand whether you are adding additional parameters to the payload before inserting it into BigQuery?

ngulamai commented 6 years ago

Another example

where tmp_raw_request_json != '{\"date\":\"20180301\",\"dm\":\"Unknown\",\"bf\":\"Asynchronous HTTP Client\",\"pr1qt\":\"1\",\"ua\":\"AHC\/2.1\",\"tid\":\"UA-98835143-2\",\"cd96\":\"2018\",\"uid\":\";;;null;null;null;null;;;null;null;null\",\"cd94\":\"Thursday\",\"pr1ca\":\"Smartlink\",\"cd95\":\"03\",\"pr1id\":\"235\",\"hour\":\"02\",\"cd92\":\"02:13:17.918\",\"cd93\":\"01\",\"cd91\":\"Europe\/Madrid\",\"bv\":\"0.0\",\"uip\":\"199.80.53.212\",\"of\":\"Unknown\",\"ea\":\"Conversion\",\"ec\":\"Conversion\",\"pr1br\":\"Mobidea (AD)\",\"hitId\":\"18878b34-b7b7-48dd-9da9-b52f87796dec\",\"cm\":\"Blind\",\"cn\":\"Mondia|Smartlink (Mainstream Adult)\",\"ov\":\"Unknown\",\"ta\":\"Mobidea (AD)\",\"pr1pr\":\"0.022966538573532\",\"minute\":\"13\",\"pr1nm\":\"All inclusive Mainstream_CPI\",\"cs\":\"Mondia\",\"pa\":\"purchase\",\"istc\":false,\"db\":\"Unknown\",\"dc\":\"Unknown\",\"t\":\"event\",\"ti\":\"eg9df2mwiss9\",\"v\":\"1\",\"time\":\"1519870397918\",\"tr\":\"0.022966538573532\",\"isb\":true}'

https://sherlock-184721.appspot.com/collect?v=1&t=event&tid=UA-98835143-2&ec=Conversion&ea=Conversion&ta=$$ADVERTISER$$&tr=$$GROSS$$&pa=purchase&pr1id=$$CAMPAIGN_ID$$&pr1nm=$$CAMPAIGN$$&pr1ca=$$BANNER_NAME$$&pr1br=$$ADVERTISER$$&pr1pr=$$GROSS$$&pr1qt=1&cn=$$PUBLISHER_NAME$$|$$PLACEMENT_NAME$$&cs=$$PUBLISHER_NAME$$&cm=$$ZONE_NAME$$&cp.Event_ID=$$CUSTOM_PARAM(Event_ID)$$&uid=$$REMOTE_COUNTRY_NAME$$;$$REMOTE_REGION$$;$$REMOTE_CITY$$;$$MOBILE_CARRIER$$;$$DEVICE_VENDOR$$;$$OS_NAME$$;$$OS_VERSION$$;$$CLIENT_USER_AGENT$$;$$REMOTE_IP$$;$$CUSTOM_PARAM(cd11)$$;$$CUSTOM_PARAM(cd7)$$;$$CUSTOM_PARAM(cd10)$$&ti=$$TRANSACTION_ID$$

akolchin-MM commented 6 years ago

Can you please help us understand whether you are adding additional parameters to the payload before inserting it into BigQuery?

The general answer is yes. We add some parameters, mostly related to time and user agent. If you are interested, I can provide a detailed explanation, but I need time for it.
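To make "parameters mostly related to time" concrete: in both sample records quoted in this thread, the fields date, hour, minute and cd92 to cd96 are consistent with the time value (epoch milliseconds) rendered in UTC. The Java sketch below reproduces them under that assumption. The meaning assigned to each cdNN field is inferred from the sample values only, not taken from the Sherlock source, and the class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.TextStyle;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class TimeFieldsSketch {

    // Derives the time-related fields seen in the sample records from an
    // epoch-millisecond timestamp. UTC is an assumption that matches the
    // samples; the real servlet may compute these values differently.
    static Map<String, String> derive(long epochMillis) {
        ZonedDateTime t = Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC);
        Map<String, String> out = new LinkedHashMap<>();
        out.put("time", Long.toString(epochMillis));
        out.put("date", t.format(DateTimeFormatter.ofPattern("yyyyMMdd")));
        out.put("hour", t.format(DateTimeFormatter.ofPattern("HH")));
        out.put("minute", t.format(DateTimeFormatter.ofPattern("mm")));
        out.put("cd92", t.format(DateTimeFormatter.ofPattern("HH:mm:ss.SSS"))); // time of day
        out.put("cd93", t.format(DateTimeFormatter.ofPattern("dd")));           // day of month
        out.put("cd94", t.getDayOfWeek().getDisplayName(TextStyle.FULL, Locale.ENGLISH));
        out.put("cd95", t.format(DateTimeFormatter.ofPattern("MM")));           // month
        out.put("cd96", t.format(DateTimeFormatter.ofPattern("yyyy")));         // year
        return out;
    }

    public static void main(String[] args) {
        // "time" value taken from the first sample record in this thread.
        derive(1520394919746L).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Comparing this output with a stored record is a quick way to separate the fields the client sent from the fields added on the server side.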

But in the provided examples I don't see any value in your calls that would uniquely identify the matching records in the BigQuery table, so I am not sure that you found the correct records. You can try adding any parameter that is not yet in use to your call to the /collect URL and then find the matching record by this parameter (it will be copied to the tmp_raw_request_json column as is).
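A minimal Java sketch of that idea: append one extra parameter carrying a unique value to the /collect call, then search tmp_raw_request_json for that value. The parameter name debug_marker and the class and method names are hypothetical; any key not already used in the call would serve. The sketch only builds the URL and does not send a request. URLEncoder.encode(String, Charset) requires Java 10 or later.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

final class MarkerParamSketch {

    // Hypothetical parameter name; pick any key the call does not use yet.
    static final String MARKER_KEY = "debug_marker";

    // Builds baseUrl?k1=v1&...&debug_marker=<marker> with URL-encoded values.
    static String withMarker(String baseUrl, Map<String, String> params, String marker) {
        Map<String, String> all = new LinkedHashMap<>(params);
        all.put(MARKER_KEY, marker);
        StringBuilder sb = new StringBuilder(baseUrl).append('?');
        boolean first = true;
        for (Map.Entry<String, String> e : all.entrySet()) {
            if (!first) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
            first = false;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("v", "1");
        params.put("t", "event");
        params.put("ec", "Conversion");
        params.put("ea", "Conversion");
        String marker = UUID.randomUUID().toString();
        System.out.println(withMarker("https://sherlock-184721.appspot.com/collect", params, marker));
        // Afterwards, filter the BigQuery table on tmp_raw_request_json
        // containing this marker value to locate the matching record.
        System.out.println("marker to search for: " + marker);
    }
}
```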

I can do it for you tomorrow and provide the results with more explanation. Today I need to deal with some urgent things, sorry.

ngulamai commented 6 years ago

Hi Alexander,

Thank you for your feedback. The reason for my question is that we found that some of the datapoints you are filling in are incorrect (e.g. ua). As mentioned before, we agreed some time ago that, instead of filling in the parameters missing from the Conversion events by looking up the AdClick event and taking the values from there, we would leave that to the other teams.

Can you please confirm whether you still suggest not looking up the datapoints missing from the Conversion event in the previous AdClick event? If that is the case, then we need to remove datapoints such as the user agent from the Conversion event.

Thanks, Adrian

(Sent by email in reply to akolchin-MM's comment of 12 Mar 2018, 13:38, quoted in full above.)

akolchin-MM commented 6 years ago

Hello Adrian,

Can you please confirm whether you still suggest not looking up the datapoints missing from the Conversion event in the previous AdClick event?

No, we don't look up any values from the previous Hit records.

Nevertheless, if the "ua" parameter is omitted from the request, we fill it with the "User-Agent" header value from the HTTP request. I believe this makes sense for most requests, and we discussed it.

// --- userAgent & IP
        tryToPutOnce(reqJson, "ua", req.getHeader("User-Agent"));

source

I don't know how this can confuse your other teams; they can simply ignore these values if they are not relevant for such requests.

Another workaround is to pass the "ua" parameter to the /collect URL with some fake or stub value; we don't replace it if it exists.
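The fill-only-if-missing behaviour described above can be sketched in Java as follows. This is not the actual tryToPutOnce implementation, which is not shown in this thread; putOnce, the Map-based request and the stub value are hypothetical stand-ins used only to illustrate the semantics.

```java
import java.util.HashMap;
import java.util.Map;

final class PutOnceSketch {

    // Hypothetical stand-in for tryToPutOnce: stores the value only when the
    // request does not already contain the key; an existing value is kept.
    static void putOnce(Map<String, String> req, String key, String value) {
        if (value != null) {
            req.putIfAbsent(key, value);
        }
    }

    public static void main(String[] args) {
        String userAgentHeader = "AHC/2.1"; // header value seen in the sample records

        // Case 1: "ua" omitted from the call, so the header value is filled in.
        Map<String, String> withoutUa = new HashMap<>();
        putOnce(withoutUa, "ua", userAgentHeader);
        System.out.println(withoutUa.get("ua"));

        // Case 2: caller passes a stub "ua", which is left untouched.
        Map<String, String> withStub = new HashMap<>();
        withStub.put("ua", "stub");
        putOnce(withStub, "ua", userAgentHeader);
        System.out.println(withStub.get("ua"));
    }
}
```

With this behaviour, a server-to-server caller that passes its own "ua" value keeps that value in the stored record, and the HTTP client's header is used only when the parameter is absent.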

Please also keep in mind that the logic (and thus the processing time) of the /collect servlet should be as simple as possible, because it is in some sense the bottleneck of the pipeline. So I prefer not to add additional checks and more complex logic here if possible.

The reason for my question was the fact that we found that some of the datapoints that you are filling up are incorrect (e.g.: ua)

It would be more efficient if you provided some specific examples, explaining which values seem incorrect and why your team thinks so.

In your example:

https://sherlock-184721.appspot.com/collect?v=1&t=event&tid=UA-98835143-2&ec=Conversion&ea=Conversion&ta=$$ADVERTISER$$&tr=$$GROSS$$&pa=purchase&pr1id=$$CAMPAIGN_ID$$&pr1nm=$$CAMPAIGN$$&pr1ca=$$BANNER_NAME$$&pr1br=$$ADVERTISER$$&pr1pr=$$GROSS$$&pr1qt=1&cn=$$PUBLISHER_NAME$$|$$PLACEMENT_NAME$$&cs=$$PUBLISHER_NAME$$&cm=$$ZONE_NAME$$&cp.Event_ID=$$CUSTOM_PARAM(Event_ID)$$&uid=$$REMOTE_COUNTRY_NAME$$;$$REMOTE_REGION$$;$$REMOTE_CITY$$;$$MOBILE_CARRIER$$;$$DEVICE_VENDOR$$;$$OS_NAME$$;$$OS_VERSION$$;$$CLIENT_USER_AGENT$$;$$REMOTE_IP$$;$$CUSTOM_PARAM(cd11)$$;$$CUSTOM_PARAM(cd7)$$;$$CUSTOM_PARAM(cd10)$$&ti=$$TRANSACTION_ID$$

In your example, practically all of the passed parameters are filled with placeholders like "$$...$$". Is this the real call to the /collect URL or just a template of it?

I mean that, in order to actually find the matching Hit record in BigQuery and investigate it, I need to know the real values from your call to the /collect URL. I need some specific examples with the real values passed in the parameters.
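One way to tell a template from a real call is to scan the URL for macros that were never substituted. The Java sketch below assumes the macros always have the form $$NAME$$ or $$NAME(arg)$$, as in the example URLs in this thread; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PlaceholderCheckSketch {

    // Two dollar signs, one or more non-dollar characters, two dollar signs.
    private static final Pattern MACRO = Pattern.compile("\\$\\$[^$]+\\$\\$");

    // Returns every unsubstituted macro found in the URL, in order of appearance.
    static List<String> unresolved(String url) {
        List<String> found = new ArrayList<>();
        Matcher m = MACRO.matcher(url);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String template = "https://sherlock-184721.appspot.com/collect?v=1&t=event"
                + "&ta=$$ADVERTISER$$&tr=$$GROSS$$&cp.Event_ID=$$CUSTOM_PARAM(Event_ID)$$";
        // An empty list means the URL carries real values and can be matched
        // against a stored record.
        System.out.println(unresolved(template));
    }
}
```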

Please provide more details and we will find a suitable solution.

Thank you, Alexander Kolchin