peaksandpies / universal-analytics

A node module for Google's Universal Analytics and Measurement Protocol

Google Analytics assumes it is a new user at each request #124

Open Vadorequest opened 5 years ago

Vadorequest commented 5 years ago

GA counts each request sent from the backend as a new user.

I'm building a chatbot, and each message sent by the user counts as a new user. I don't understand why: I set both the uid (user id) and the cid (client id) from a variable holding the device id, so everything should be working fine.

Here is some source code:

const visitor = ua(this.googleAnalyticsVisitor.tid, this.deviceId, {
  uid: this.deviceId,
});

visitor.event(
  category,
  action,
  label,
  value,
  { p: path }, // See "page path" at https://github.com/peaksandpies/universal-analytics/#event-tracking
  (err) => {
    if (err) {
      this.logger.error(err, 'sendGoogleAnalyticsStatistics:error');
      Raven.captureMessage(err, { level: 'error' });
    }
  },
);

this.logger.debug(JSON.stringify(this.googleAnalyticsVisitor, null, 2), 'sendGoogleAnalyticsStatistics:googleAnalyticsVisitor');
this.logger.info(JSON.stringify(visitor, null, 2), 'sendGoogleAnalyticsStatistics:visitor');
this.logger.info(JSON.stringify({
  category,
  action,
  label,
  value,
  path,
}, null, 2), 'sendGoogleAnalyticsStatistics:event');

Here are the logs

server 2019-02-12T19:59:55.310Z [StudentSolutionsChatbotAI] debug: [sendGoogleAnalyticsStatistics:googleAnalyticsVisitor] {
server   "_queue": [],
server   "options": {
server     "cookieName": "_ga"
server   },
server   "_context": {},
server   "_persistentParams": {},
server   "tid": "UA-89785688-3",
server   "cid": "64ae7bcc-ff52-46d5-bfa7-5b6ffc9b264d"
server }
server 2019-02-12T19:59:55.311Z [StudentSolutionsChatbotAI] info: [sendGoogleAnalyticsStatistics:visitor] {
server   "_queue": [],
server   "options": {
server     "uid": "311adf00-11dd-11e9-8b1a-fbc5613ee3c7"
server   },
server   "_context": {},
server   "_persistentParams": {
server     "uid": "311adf00-11dd-11e9-8b1a-fbc5613ee3c7"
server   },
server   "tid": "UA-89785688-3",
server   "cid": "64ae7bcc-ff52-46d5-bfa7-5b6ffc9b264d",
server   "uid": "311adf00-11dd-11e9-8b1a-fbc5613ee3c7"
server }

---

server 2019-02-12T19:59:49.586Z [StudentSolutionsChatbotAI] debug: [sendGoogleAnalyticsStatistics:googleAnalyticsVisitor] {
server   "_queue": [],
server   "options": {
server     "cookieName": "_ga"
server   },
server   "_context": {},
server   "_persistentParams": {},
server   "tid": "UA-89785688-3",
server   "cid": "7de72c3c-cea6-4bf9-8e9f-90976850deed"
server }
server 2019-02-12T19:59:49.587Z [StudentSolutionsChatbotAI] info: [sendGoogleAnalyticsStatistics:visitor] {
server   "_queue": [],
server   "options": {
server     "uid": "311adf00-11dd-11e9-8b1a-fbc5613ee3c7"
server   },
server   "_context": {},
server   "_persistentParams": {
server     "uid": "311adf00-11dd-11e9-8b1a-fbc5613ee3c7"
server   },
server   "tid": "UA-89785688-3",
server   "cid": "7de72c3c-cea6-4bf9-8e9f-90976850deed",
server   "uid": "311adf00-11dd-11e9-8b1a-fbc5613ee3c7"
server }

I don't get why it doesn't work properly.

burtonator commented 5 years ago

You have to save the cid and re-use it.

This REALLY screwed me over, as I assumed it would be automatic, and I had months of tracking data that were useless!
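
A minimal sketch of the "save the cid and re-use it" advice, assuming an in-memory `Map` stands in for whatever persistent store the application already has. `getOrCreateCid` and `cidStore` are hypothetical names and not part of universal-analytics; `crypto.randomUUID()` needs a reasonably recent Node.js (14.17 or later) and produces a version-4 UUID.

```javascript
const { randomUUID } = require('crypto');

// Hypothetical store: maps a stable device id to the cid first issued for it.
// A real deployment would use a database or cache instead of process memory.
const cidStore = new Map();

// Returns the cid previously saved for this device, or creates and saves one.
function getOrCreateCid(store, deviceId) {
  let cid = store.get(deviceId);
  if (!cid) {
    cid = randomUUID(); // version-4 UUID
    store.set(deviceId, cid);
  }
  return cid;
}
```

The returned value would then be passed as the second argument to `ua(...)`, in the same position the thread's code passes `this.deviceId`.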

Vadorequest commented 5 years ago

@burtonator yeah, a new one is generated at every request, but even when I provided a front-end (cached) uuid it would consider it as a new user when using this uuid from the server.

My current code is this:

const visitor = ua(this.googleAnalyticsVisitor.tid, this.deviceId, {
  uid: this.deviceId,
});

It still doesn't work as expected. If you managed to get it working, I'm very much interested, but please double-check it, because I've tried for hours and haven't found a proper way yet.

burtonator commented 5 years ago

I'm setting cid, not uid. Try that? I know that uid is supposed to work, though. Maybe try with cid as deviceId?

Vadorequest commented 5 years ago

I've tried everything I can think of. If you have working code, I'd be interested to take a peek!

burtonator commented 5 years ago

Take a look at this:

https://github.com/burtonator/polar-bookshelf/blob/master/web/js/ga/RendererAnalytics.ts

burtonator commented 5 years ago

One thing I noticed is that the User Agent isn't actually tracked by default so I added that as well.

This is from the browser and not node. In Electron it's called the 'renderer' context.

Vadorequest commented 5 years ago

I tried following your way, went from this:

const category = this.getGoogleAnalyticsCategory();
const action = this.getGoogleAnalyticsAction();
const label = this.getGoogleAnalyticsLabel();
const value = this.getGoogleAnalyticsValue();
const path = this.getGoogleAnalyticsPath();
const visitor = ua(this.googleAnalyticsVisitor.tid, this.deviceId, {
  uid: this.deviceId,
});

visitor.event(
  category,
  action,
  label,
  value,
  { p: path }, // See "page path" at https://github.com/peaksandpies/universal-analytics/#event-tracking
  (err) => {
    if (err) {
      this.logger.error(err, 'sendGoogleAnalyticsStatistics:error');
      Raven.captureMessage(err, { level: 'error' });
    }
  },
);

to this:

const category = this.getGoogleAnalyticsCategory();
const action = this.getGoogleAnalyticsAction();
const label = this.getGoogleAnalyticsLabel();
const value = this.getGoogleAnalyticsValue();
const path = this.getGoogleAnalyticsPath();
const visitor = ua(this.googleAnalyticsVisitor.tid, {
  cid: this.deviceId,
  uid: this.deviceId,
  headers: {},
}).debug(process.env.NODE_ENV !== 'production');

const eventParams = {
  eventCategory: category,
  eventAction: action,
  eventLabel: label,
  eventValue: value,
  documentPath: path,
  // userAgentOverride: userAgent,
  applicationVersion: `${process.env.GIT_COMMIT_VERSION || ''}`
};

visitor.event(eventParams).send((err) => {
  if (err) {
    this.logger.error(err, 'sendGoogleAnalyticsStatistics:error');
    Raven.captureMessage(err, { level: 'error' });
  }
});

The first version creates a new user on every event hit, but the second doesn't work at all: GA receives no events whatsoever.

Vadorequest commented 5 years ago

Finally I figured it out; the lib really doesn't help to pinpoint issues:

const visitor = ua(this.googleAnalyticsVisitor.tid, this.deviceId, {
  uid: this.deviceId,
  strictCidFormat: false, // Using non-strict for compatibility with devices using UUID v1 (used at the beginning, can be safely removed after May 2019)
}).debug(process.env.NODE_ENV !== 'production');

visitor.event(
  category,
  action,
  label,
  value,
  { p: path }, // See "page path" at https://github.com/peaksandpies/universal-analytics/#event-tracking
  (err) => {
    if (err) {
      this.logger.error(err, 'sendGoogleAnalyticsStatistics:error');
      Raven.captureMessage(err, { level: 'error' });
    } else {
      this.logger.info(`Response received from GA`, 'sendGoogleAnalyticsStatistics:success');
      this.logger.debug(err);
    }
  },
);

My errors were:

burtonator commented 5 years ago

Glad you found out about the UUID v4 issue. I was actually not even going to use UUIDs, so maybe I got lucky.
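
The v1 versus v4 mismatch mentioned here can be checked directly against the ids in the logs above. The sketch below is an illustration only and is not the library's own validation code; `isUuidV4` is a hypothetical helper. It shows that the device id used as `uid` is a version-1 UUID, while the cids that changed on every request are version 4, which is consistent with the thread's conclusion that a non-v4 cid is rejected under the strict format check.

```javascript
// Canonical hyphenated version-4 UUID: the third group starts with "4"
// and the fourth group starts with 8, 9, a or b.
const UUID_V4 = /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

function isUuidV4(value) {
  return UUID_V4.test(String(value));
}

// Values copied from the logs earlier in this thread.
const deviceId = '311adf00-11dd-11e9-8b1a-fbc5613ee3c7';     // third group "11e9": version 1
const generatedCid = '64ae7bcc-ff52-46d5-bfa7-5b6ffc9b264d'; // third group "46d5": version 4

console.log(isUuidV4(deviceId));     // false
console.log(isUuidV4(generatedCid)); // true
```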

Vadorequest commented 5 years ago

@burtonator There is one last thing I haven't figured out yet, and you likely have the same issue. How do you keep GA from duplicating the user between the client and the server? I generated a UUID on the frontend and provided it to the backend, but this UUID doesn't match GA's internal ID for the user. So I end up with 2 users (1 client, 1 server) for each real user. Did you find a workaround for this?
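
Nobody in this thread confirms a fix for the duplicate. One approach that is often suggested is for the server to reuse the client id that the browser-side GA script stores in the `_ga` cookie, instead of a separately generated UUID. The sketch below assumes the cookie value has the usual `GA1.<n>.<random>.<timestamp>` shape, where the last two fields form the client id; `cidFromGaCookie` is a hypothetical helper and the approach is untested against the setup described here.

```javascript
// Extracts the client id from a `_ga` cookie value such as
// "GA1.2.1033501218.1368477899": the last two dot-separated fields.
// Returns null when the value does not look like a `_ga` cookie.
function cidFromGaCookie(cookieValue) {
  if (typeof cookieValue !== 'string') return null;
  const parts = cookieValue.split('.');
  if (parts.length < 4) return null;
  return parts.slice(-2).join('.');
}
```

A cid obtained this way is not a UUID, so with this library it would presumably have to be passed together with `strictCidFormat: false`.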

burtonator commented 5 years ago

@Vadorequest I didn't resolve this. I just don't use that feature. I'm sure I will encounter this at some point.

I also realized that my code doesn't actually appear to work. I think I'm seeing the problem you're seeing where the stat data is being discarded. I made the changes I discussed above and basically nothing is making it into the GA console even though the agent is saying it's sending the stats.

So I think it's silently discarding them (which is pretty evil) ...

I don't necessarily think this is universal-analytics' fault but I'm not certain yet.

I enable debug and I can see it's sending the values but they're not turning up in the GA console.

This is maddening as I've been aggressively trying to analyze my metrics only for them to become completely invalidated.

I think the only real solution moving forward is to use two analytics platforms. This way if one breaks I can compare it to the other.

Vadorequest commented 5 years ago

Well, I have been very cautious about it and checked that the data was saved properly, and it currently is. But it isn't when the UUID has the wrong format, for instance, and in that case it fails silently.

It took me a while to figure everything out and debug and such, and I wish I had enabled debug much sooner to be honest!

slidenerd commented 5 years ago

I have the same issue: I am seeing double events from the client and the server. @Vadorequest, I don't want to track on the server side if the client is already doing it. Have you managed to figure out how to track conditionally?

Vadorequest commented 5 years ago

No, I haven't. My analytics therefore suck, and I want to stop using GA; I'm looking for a better alternative that is more flexible and reliable.

IMHO, GA is a very old tool that has tons of features and is way too complex and not nearly flexible enough for real use cases. Not only does the JS API suck, but you can't even manipulate your data as easily as you should be able to. It's too complicated for its own good, probably because it's decades-old and loads tons of useless stuff.

slidenerd commented 5 years ago

And not to mention, despite all this you got regex parsing referrer spam bots everywhere in your dashboard which has no automated filtering mechanism

kirosc commented 4 years ago

cid is meant to track a user's client device and is stored as a cookie; uid is meant to track a unique user.

Even if you provide a uid to UA, it will still generate a UUID for the cid and attach it.

If you want to set a non-UUID value as the cid, you can do this:

const visitor = ua(TID, userid, {strictCidFormat: false});

Then, the userid becomes the cid

console.log(visitor);
---
{
  _queue: [],
  options: { strictCidFormat: false },
  _context: {},
  _persistentParams: {},
  tid: 'UA-14523290-4',
  cid: 224581513
}

But in the case of a chatbot, cid is not required. I think UA should add an option to use uid instead of cid since, according to the GA docs, only one of cid or uid is required.

https://developers.google.com/analytics/devguides/collection/protocol/v1/parameters#user
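
To make the "either cid or uid" point concrete, here is a sketch that builds a raw Measurement Protocol v1 event payload without going through universal-analytics. The short parameter names (`v`, `tid`, `cid`, `uid`, `t`, `ec`, `ea`, `el`, `ev`, `dp`) come from the Measurement Protocol parameter reference linked above; `buildEventPayload` and the example category and action values are hypothetical, and whether uid-only hits are reported the way a chatbot needs is not verified here.

```javascript
// Builds a form-encoded Measurement Protocol v1 event payload.
// Throws unless at least one of cid / uid is supplied.
function buildEventPayload({ tid, cid, uid, category, action, label, value, path }) {
  if (!cid && !uid) {
    throw new Error('A Measurement Protocol hit needs a cid or a uid');
  }
  const params = new URLSearchParams({ v: '1', tid, t: 'event', ec: category, ea: action });
  if (cid) params.set('cid', String(cid));
  if (uid) params.set('uid', String(uid));
  if (label !== undefined) params.set('el', String(label));
  if (value !== undefined) params.set('ev', String(value));
  if (path !== undefined) params.set('dp', path);
  return params.toString();
}

// Example with a placeholder tracking id and made-up event names.
const body = buildEventPayload({
  tid: 'UA-XXXXX-Y',
  uid: '311adf00-11dd-11e9-8b1a-fbc5613ee3c7',
  category: 'chat',
  action: 'message',
});
console.log(body);
```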

leohxj commented 3 years ago

If you set a uid (set('uid')), Analytics may be reporting the user count from the User-ID report.