divolte / divolte-collector

Divolte Collector
https://divolte.io/
Apache License 2.0

Using 204 instead of 200 #107

Open cbsmith opened 8 years ago

cbsmith commented 8 years ago

To save bandwidth, server load, etc., and to be more semantically correct in general, divolte should return a 204 response code rather than a 1x1 pixel GIF. At the very least, this should be a configurable option.
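To make the bandwidth argument concrete, here is a minimal sketch comparing the raw bytes of the two responses. It is not taken from Divolte: the function names are hypothetical, the GIF is a commonly used 1x1 transparent pixel, and a real server would add further headers (date, caching, cookies) to both responses.

```python
import base64

# A commonly used 1x1 transparent GIF, base64-encoded (assumption: any
# valid 1x1 GIF would serve the same purpose here).
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def pixel_response() -> bytes:
    """HTTP 200 carrying the GIF body plus the headers that describe it."""
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: image/gif\r\n"
        f"Content-Length: {len(PIXEL_GIF)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + PIXEL_GIF


def no_content_response() -> bytes:
    """HTTP 204: a status line only, with no body and no content headers."""
    return b"HTTP/1.1 204 No Content\r\n\r\n"


def bytes_saved() -> int:
    """Difference in size between the two simplified responses."""
    return len(pixel_response()) - len(no_content_response())
```

Calling `bytes_saved()` gives the per-request difference for these simplified responses only; the saving in a real deployment depends on which other headers the server sends.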

friso commented 8 years ago

I find it hard to estimate the possible effects of changing the response. I'd expect browsers not to have any issues with this, but I'm not sure. At this point it seems that all major trackers (including GA) use an HTTP 200 1x1 pixel response instead of HTTP 204, but I haven't been able to figure out why.

cbsmith commented 8 years ago

Google does do 204's, both with DoubleClick and just on their search home page (when I go to the Google home page it actually calls out to https://www.google.com/gen_204 ;-), as does ScoreCard Research, Krux, Addoox, NavEgg, LiveRail, etc.

It's actually pretty broadly employed, at least in the DMP/HTTP data platform industry. When it isn't used, it is generally because a) someone didn't think about what the right response code would be, b) 204's didn't make as much sense back in the old days when the response actually had to be rendered, or c) there were misguided concerns about REST client frameworks that expect a 200 and content/content-type.

In divolte's case, it is controlling both sides of the protocol, so I would think it'd make sense for 204 to be the default behaviour, but I'll go for just having it as an option.

friso commented 8 years ago

When trying a site with GA, I am seeing this: [image]

I'm not against using a 204, but I'd like to research the situation a bit more. It wouldn't surprise me if the client side actually makes the decision on which endpoint to call. For example, some trackers will use a POST request through XHR if available and GET otherwise.

Forcing it all onto a 204 as a configurable option would be a trivial enough change, but it would be a shame to have to deprecate it at some point if it turns out to be not entirely right.

Is the lack of this option a pressing issue for you or blocking production deployment?

cbsmith commented 8 years ago

GA is a different story, because they've got backwards compatibility needs from hell (all those "utm" parameters date back to when it was Urchin!). They still support no-JavaScript data collection with a simple img tag embedded in your page, and there are enough of those deployed out there that they are in no hurry to remove it.

This isn't blocking deployment. It's just a reason to prefer a different solution.

I wasn't sure if the divolte implementation had some need to get content back or if it expected a content-type or some such. If it doesn't, I'll just patch it to do the 204 and be done with it.

friso commented 8 years ago

Right, got it.

We're currently working on a major overhaul of both the way configuration is implemented and the internal threading + event passing (on dev/many-to-many). The idea is that we'll support multiple sources and sink locations in one collector instance for use cases like multi-tenancy and eventually server-side events. Picking this change up right now would probably make it quite painful to merge at a later stage. Would you mind if we push it back a bit?

cbsmith commented 8 years ago

Makes sense to me. I was going to implement it as another handler that was available and let the path from the client side dictate the response.
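A rough sketch of that idea, where the request path picks the response, could look like the following. This is Python's standard `http.server` rather than Divolte's actual server code, and the paths `/event-204` and `/pixel.gif`, along with all function and class names, are hypothetical.

```python
import base64
from http.server import BaseHTTPRequestHandler, HTTPServer

# A commonly used 1x1 transparent GIF (assumption: any valid pixel works).
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def choose_response(raw_path: str):
    """Map a request path to (status, headers, body).

    The query string carries the event data and is ignored when routing.
    """
    path = raw_path.split("?", 1)[0]
    if path == "/event-204":  # hypothetical path for the 204 variant
        return 204, {}, b""
    if path == "/pixel.gif":  # hypothetical path for the classic pixel
        headers = {
            "Content-Type": "image/gif",
            "Content-Length": str(len(PIXEL_GIF)),
        }
        return 200, headers, PIXEL_GIF
    return 404, {"Content-Length": "0"}, b""


class TrackingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, headers, body = choose_response(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        if body:
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real collector would log the event


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), TrackingHandler)
```

With this shape the client-side script decides which behaviour it gets simply by choosing which path to request, so both variants can coexist without a server-wide switch.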

asnare commented 7 years ago

I just ran some tests with a 204 response against all the browsers we test against, and it turns out this basically works fine. The main sharp edge is that this typically triggers the 'onerror' handler instead of 'onload' for images.