snowplow-referer-parser / referer-parser

Library for extracting marketing attribution data from referrer URLs
http://snowplowanalytics.com

Contributors guide for modifying the data file #132

Open kingo55 opened 8 years ago

kingo55 commented 8 years ago

For new additions to the referrer YAML, it would be helpful if there were some guidelines on how to name / group sites.

Based on what I see in the YAML now, I don't even agree with past contributions I've made. E.g.:

  1. Given that email is just one of the services Naver provides, "Naver Mail" should just be "Naver", just like how Google is represented.
  2. It's also not clear how we handle different sites of the same company, or the same brand in multiple countries.
  3. Sometimes the medium "email" makes sense because we can see traffic arriving through Cheetah Mail or Responsys servers, but we wouldn't call them email providers.

Thoughts?

alexanderdean commented 8 years ago

Agree, we need an initiative to write a contributors guide for the data file...

alexanderdean commented 8 years ago

More than just naming conventions - also rules like:

Any given referrer URI should only be found in the database once. If the same URI is used for two different mediums, like search and paid, then we should give the traffic the benefit of the doubt and make it search (i.e. don't assume paid).

(https://github.com/snowplow/referer-parser/issues/130#issuecomment-234183125)
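
A rule like this could be checked mechanically before a contribution is merged. The following is only a rough sketch, not part of the project: it assumes the data file has already been loaded into a nested dict of the form medium -> source name -> `domains` list, and the names `find_duplicates`, `preferred_entry` and `MEDIUM_PRECEDENCE` are hypothetical. Only "search before paid" comes from the rule above; treating every other medium as lowest priority is an assumption made for illustration.

```python
from collections import defaultdict

# Hypothetical precedence: an earlier medium wins when a domain is listed
# under several mediums. Only "search before paid" comes from the rule
# above; ranking every other medium last is an assumption.
MEDIUM_PRECEDENCE = ["search", "paid"]


def find_duplicates(db):
    """Map each domain listed more than once to its (medium, source) entries."""
    seen = defaultdict(list)
    for medium, sources in db.items():
        for source, config in sources.items():
            for domain in config.get("domains", []):
                seen[domain].append((medium, source))
    return {d: entries for d, entries in seen.items() if len(entries) > 1}


def preferred_entry(entries):
    """Pick the entry to keep, giving the traffic the benefit of the doubt."""
    def rank(entry):
        medium = entry[0]
        if medium in MEDIUM_PRECEDENCE:
            return MEDIUM_PRECEDENCE.index(medium)
        return len(MEDIUM_PRECEDENCE)
    return min(entries, key=rank)


# Toy data in the assumed layout of the data file (not real entries).
example_db = {
    "search": {"ExampleSearch": {"domains": ["search.example.com", "example.com"]}},
    "paid": {"ExampleAds": {"domains": ["search.example.com"]}},
}

duplicates = find_duplicates(example_db)
for domain, entries in duplicates.items():
    print(domain, "->", preferred_entry(entries))
```

A check along these lines would flag the duplicate and suggest keeping the search entry; whether the duplicate is then removed by hand or rejected in review is a separate decision.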

kingo55 commented 8 years ago

Also useful:

@alexanderdean - happy to work on something like this too. Can a pull request be made for GH wikis?

christoph-buente commented 7 years ago

We also assembled more ESP domain names and would like to see them mentioned in the referer.yml file. Is there a process by now?

alexanderdean commented 7 years ago

The process is not yet finished, but we are working on it. Please open a PR and we will get a new version of the referer.yml database published.

The unfinished work is around making the Java client read an external file, and updating the Snowplow enrichment to support that external file.

christoph-buente commented 7 years ago

Well, I don't have a proper referer.yml file, just a list of 3600+ ESP domain names. Would that be of any interest to you?

alexanderdean commented 7 years ago

Wow - that does sound interesting!