Closed vfedoseev closed 1 year ago
PBS-Java sets site.domain and site.publisher.domain:
site.domain: full domain, e.g. www.example.com or sports.usatoday.com
site.publisher.domain: 'rounded off' site.domain, e.g. example.com or usatoday.com
We'll review how these values are set and make sure they're in sync with Go.
PBS-Go current logic:
if site.page is not set {
    if http.referer is set and http.referer is a valid URL {
        site.page = http.referer
    }
}
if site.domain is not set {
    if http.referer is set and http.referer is a valid URL {
        site.domain = http.referer.host (i.e. the example.com portion of http://cool.example.com)
    }
}
Note: we never set site.publisher.domain. If it is specified in the original request, though, it will be passed through.
PBS-Go logic with proposed change:
if site.page is not set {
    if http.referer is set and http.referer is a valid URL {
        site.page = http.referer
    }
}
if site.domain is not set {
    if http.referer is set and http.referer is a valid URL {
        site.domain = http.referer.host (i.e. the example.com portion of http://cool.example.com)
    } else if http.referer is not set and site.page is set and site.page is a valid URL {
        site.domain = site.page.host (i.e. the example.com portion of http://cool.example.com)
    }
}
Note: for http.referer or site.page to be considered valid when trying to set domain, they must be either an absolute ([scheme]://) or relative (//) path.
Also, site.page is required to be set while site.domain is not.
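The proposed order can be sketched in Go as follows. This is a minimal illustration, not the actual PBS-Go code: the `site` struct, `hostOf`, and `fillSite` are hypothetical names, and the sketch assumes a URL counts as valid only when `net/url` parses it with a non-empty host, which holds for absolute (`scheme://`) and relative (`//`) forms. It stores the full parsed host and does not show any further trimming of the domain.

```go
package main

import (
	"fmt"
	"net/url"
)

// site is a hypothetical, trimmed-down stand-in for the OpenRTB Site object.
type site struct {
	Page   string
	Domain string
}

// hostOf returns the host of raw when raw is an absolute ("scheme://...")
// or relative ("//...") URL, and "" otherwise.
// Assumption: "valid" here means "parses and has a host".
func hostOf(raw string) string {
	if raw == "" {
		return ""
	}
	u, err := url.Parse(raw)
	if err != nil {
		return ""
	}
	return u.Hostname()
}

// fillSite applies the proposed order: use the Referer header first, and
// fall back to site.page only when the header is not set at all.
func fillSite(s *site, referer string) {
	refHost := hostOf(referer)
	if s.Page == "" && refHost != "" {
		s.Page = referer
	}
	if s.Domain != "" {
		return // a domain supplied in the request is left untouched
	}
	if refHost != "" {
		s.Domain = refHost
	} else if referer == "" {
		s.Domain = hostOf(s.Page)
	}
}

func main() {
	// No Referer header, but site.page is present in the request.
	s := site{Page: "https://cool.example.com/article"}
	fillSite(&s, "")
	fmt.Println(s.Domain)
}
```

A value such as `cool.example.com/article` (no scheme and no leading `//`) parses as a path with no host, so the sketch leaves the domain unset in that case.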
I've opened an internal ticket to have the PBS-Java team investigate and align Java to this approach.
FWIW, happened to run a test comparing PBS-Go and PBS-Java output. Here's the outcome regarding domains.
Referer: https://www.britannica.com/event/Seven-Years-War/Preliminary-negotiations-and-hostilities-in-the-colonies
Incoming site object:
{
  "site": {
    "publisher": {
      "id": "9262"
    },
    "page": "https://www.britannica.com/event/Seven-Years-War/Preliminary-negotiations-and-hostilities-in-the-colonies"
  }
}
I believe the PBS-Java results are correct:
"site": {
  "domain": "www.britannica.com",
  "page": "https://www.britannica.com/event/Seven-Years-War/Preliminary-negotiations-and-hostilities-in-the-colonies",
  "publisher": {
    "id": "9262",
    "domain": "britannica.com"
  },
  "ext": {
    "amp": 0
  }
},
PBS-Go hasn't been updated:
"site": {
  "domain": "britannica.com",
  "page": "https://www.britannica.com/event/Seven-Years-War/Preliminary-negotiations-and-hostilities-in-the-colonies",
  "publisher": {
    "id": "9262"
  },
  "ext": {
    "amp": 0
  }
},
The original problem description has been implemented, but PBS-Go still needs to set site.publisher.domain if it's not set.
@AlexBVolcy Thanks for working on this. I noticed that PBS Go currently sets Site.Domain to the highest level domain, i.e. if the host is foo.bar.baz.com, it will set it to baz.com. However, based on the openrtb2 documentation, Site.Domain is supposed to be the full domain of the website, i.e. foo.bar.baz.com in the above example, and Site.Publisher.Domain is supposed to be the highest level domain, i.e. just baz.com in the above example. This is how PBS Java works as well, as you can see from the example @bretg posted above.
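To make the distinction concrete, here is a rough Go sketch that derives both values from one host. The helper `naivePublisherDomain` is hypothetical and deliberately simplistic: it assumes the publisher domain is the last two labels of the host, which is wrong for multi-label public suffixes such as co.uk. A real implementation would need a public suffix list, and nothing here is taken from the PBS-Go or PBS-Java code.

```go
package main

import (
	"fmt"
	"strings"
)

// naivePublisherDomain keeps only the last two labels of host.
// ASSUMPTION: illustration only. This breaks for hosts under multi-label
// public suffixes (e.g. "news.example.co.uk"), where a public suffix
// list lookup is required instead.
func naivePublisherDomain(host string) string {
	labels := strings.Split(host, ".")
	if len(labels) <= 2 {
		return host
	}
	return strings.Join(labels[len(labels)-2:], ".")
}

func main() {
	host := "foo.bar.baz.com"
	// Site.Domain keeps the full host; Site.Publisher.Domain is rounded off.
	fmt.Println("site.domain:", host)
	fmt.Println("site.publisher.domain:", naivePublisherDomain(host))
}
```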
Hi,
While checking the Prebid Server integrations on several sites, we've noticed an issue with the 'no-referrer' meta tag.
The /auction endpoint currently checks the following values in the bid request:
If any of them is missing, the Http.Referer header is parsed and the results are written to the 'Site' object: https://github.com/prebid/prebid-server/blob/master/endpoints/openrtb2/auction.go#L1525
Publishers sometimes use the following meta tag on their sites, which prevents browsers from sending the Http.Referer header with any request:
<meta name="referrer" content="no-referrer">
If 'site.domain' is also missing from the bid request payload, then no domain is parsed or passed to the adapters. Example request below:
So, the final request to the /auction endpoint:
We suggest parsing Site.Page to extract the Site.Domain value when Http.Referer is empty and Site.Page is available.