matomo-org / plugin-GoogleAnalyticsImporter

Google Analytics to Matomo importer

Importing into an existing website in Matomo (not creating a new one) #2

Open mattab opened 5 years ago

mattab commented 5 years ago

Currently each GA data import creates a new website in Matomo. Instead, it would be great to let users choose, while preparing to import GA data, whether:

  • to create a new website to import the GA property
  • to import the data into an existing website ID in Matomo (e.g. by showing our website selector)

This would be valuable because people often want to import their Google Analytics web data after migrating from GA to Matomo, so they're already tracking data in both GA and Matomo. Having the GA historical data in a separate website makes analysis complicated (row evolution does not work, nor does the evolution graph, etc.).

The challenge with letting users import data into an existing site is deciding what to do with:

  • Goals
  • Custom dimensions

The goals/custom dimensions may be present in both GA and Matomo, or just in GA. When present in both, we need a way to map the GA goals to Matomo Goals, maybe by

  • letting users choose from a list what to do with each goal: "Create a new goal in Matomo" or "Track this GA goal data into this existing Matomo Goal"
  • or matching them by name, and showing users "This goal has the same name in GA and Matomo so the data will be imported in the existing Goal" or "A goal with the same name does not yet exist in Matomo, please rename the goal in GA (or Matomo) so they match and try again. Or proceed and a new Goal will be created in Matomo".

Note: one might think roll-up reporting could be used to aggregate the old GA site and the new Matomo site together, but unfortunately that wouldn't work because Roll-Up reporting aggregates the data from the logs.
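
To make the "matching them by name" option for goals more concrete, here is a minimal Python sketch. It is not the plugin's code: the `Goal` type and `plan_goal_mapping` function are hypothetical names, and the case-insensitive, whitespace-trimmed comparison is an assumption (the proposal only says "same name").

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Goal:
    """Hypothetical minimal goal record: an ID and a display name."""
    id: int
    name: str


def _key(name: str) -> str:
    # Assumption: names are compared ignoring case and surrounding whitespace.
    return name.strip().lower()


def plan_goal_mapping(
    ga_goals: List[Goal], matomo_goals: List[Goal]
) -> Tuple[Dict[int, int], List[Goal]]:
    """Return (mapped, to_create).

    mapped:    GA goal ID -> existing Matomo goal ID with the same name.
    to_create: GA goals with no same-named Matomo goal; these would need a
               new Matomo goal (or a rename followed by a retry).
    """
    matomo_by_name = {_key(g.name): g for g in matomo_goals}
    mapped: Dict[int, int] = {}
    to_create: List[Goal] = []
    for ga_goal in ga_goals:
        match = matomo_by_name.get(_key(ga_goal.name))
        if match is not None:
            mapped[ga_goal.id] = match.id
        else:
            to_create.append(ga_goal)
    return mapped, to_create
```

The two return values correspond to the two messages proposed above: entries in `mapped` would get "the data will be imported in the existing Goal", and entries in `to_create` would get the "rename or proceed" prompt.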

mattab commented 4 years ago

Note: importing into an existing website will cause problems if the existing site already has tracked data. It's common for people to use both Matomo and GA for a while, and maybe delete the GA code later. So we'd want to make sure to import the data only up to the point where the site starts to have data in Matomo (VisitsSummary.get > 0)...
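
As a rough illustration of the cutoff idea, the Python sketch below clamps the import end date to the day before Matomo first recorded visits. The function name `import_end_date` is hypothetical, and the input is assumed to be a per-day visit count already fetched from somewhere (for example from `VisitsSummary.get` with a daily period); how the importer would actually obtain it is not specified here.

```python
from datetime import date, timedelta
from typing import Dict


def import_end_date(daily_visits: Dict[date, int], requested_end: date) -> date:
    """Return the last day that is safe to import GA data for.

    daily_visits:  day -> number of visits tracked by Matomo on that day
                   (assumed input, e.g. derived from VisitsSummary.get).
    requested_end: the end date the user asked to import up to.
    """
    # Days on or before the requested end where Matomo already has visits.
    tracked_days = [
        day for day, visits in daily_visits.items()
        if visits > 0 and day <= requested_end
    ]
    if not tracked_days:
        # No overlap: the full requested range can be imported.
        return requested_end
    # Stop the import on the day before the first tracked day.
    return min(tracked_days) - timedelta(days=1)
```

With tracking that starts on 2018-05-22 and a requested end of 2018-05-31, this returns 2018-05-21, so imported and tracked days never overlap.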

tsteur commented 4 years ago

It would also be a problem when periods are invalidated as part of a regular invalidation, e.g. yearly archives.

diosmosis commented 4 years ago

@tsteur That could be problematic, we'd need a way to avoid invalidation for imported data... that should be fixed soon I think.

tsteur commented 4 years ago

@diosmosis if this is related to Matomo for WordPress we could just create a new site there and import it into that site. We could create a site by eg directly calling addSite on the API class etc. The site selector would be still shown so it should all work.

I reckon this feature could create quite a few problems and some confusion (e.g. creating a segment would show only partial data), so it might be a lot easier/better not to work on it. It might also end up more confusing because Matomo and GA track different numbers.

diosmosis commented 4 years ago

@tsteur The invalidation issue should really affect everyone: if a user invalidates an old archive that was imported and archiving then runs, the imported data will be lost. There probably needs to be a way to prevent that.

Other than that only today will be invalidated and re-archived from logs. The other reports should just re-aggregate day archives ... except for unique visitors, that could be a problem too.

tsteur commented 4 years ago

Maybe once the data has been archived, we simply stop the archiving for that site and always skip it?

diosmosis commented 4 years ago

Right now users can track into an imported site so the imported and new stats are together. Is there a mechanism to stop archives for purged data from being invalidated?

tsteur commented 4 years ago

I see. I would say we maybe shouldn't support tracking into an imported site; that would be easiest.

tsteur commented 4 years ago

It is always possible to use RollUp Reporting if someone wanted to see data aggregated I suppose.

diosmosis commented 4 years ago

That would also mean closing this issue altogether then. Guess it's not an issue w/ roll-up reporting, though perhaps a bit inconvenient... Might be more inconvenient for wordpress. CC @mattab

We do also have the start/end date that was imported. We could also just make sure invalidation doesn't work for those dates (somehow).

mattab commented 4 years ago

> That could be problematic, we'd need a way to avoid invalidation for imported data... that should be fixed soon I think.

FYI, there is already logic in the invalidateReports API which would not invalidate the reports if there is no RAW data for this day. @diosmosis

> It is always possible to use RollUp Reporting if someone wanted to see data aggregated I suppose.

Unfortunately that wouldn't work, because Roll-Up reporting aggregates the data from the logs and we don't have logs for GA imported data. @tsteur

Let's not work on this issue for now, but keep it open as we may want to work on it later.

diosmosis commented 4 years ago

> Fyi there is already logic in the invalidateReports API which would not invalidate the reports if there is no RAW data for this day.

That's good, though there's still an issue w/ unique visitors and periods involving today, ie, this week, this month, this year. If aggregating unique visitors from logs is enabled, Matomo may end up erasing the imported unique visitors.

mattab commented 4 years ago

> Fyi there is already logic in the invalidateReports API which would not invalidate the reports if there is no RAW data for this day.

Actually I was wrong, checked and the logic in removeDatesThatHaveBeenPurged / findOlderDateWithLogs does not actually check for RAW data but checks for the setting / feature "Delete logs older than N days" and uses this to assume the RAW data is available in the last N days. So to make it work with GA data, we'd need some new logic there to not invalidate dates that don't have any RAW Data...
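
The difference between the behaviour described here and the proposed extra check can be sketched in Python. This is not Matomo's PHP code: `dates_to_invalidate` and its parameters are hypothetical, and the exact boundary handling (whether the N-th day itself counts as purged) is an assumption.

```python
from datetime import date, timedelta
from typing import Iterable, List, Optional, Set


def dates_to_invalidate(
    requested: Iterable[date],
    today: date,
    delete_logs_older_than_days: Optional[int] = None,
    days_with_raw_data: Optional[Set[date]] = None,
) -> List[date]:
    """Filter the dates a user asked to invalidate.

    delete_logs_older_than_days: N from the "Delete logs older than N days"
        setting, or None when log deletion is disabled.
    days_with_raw_data: when given, enables the *proposed* check that skips
        any day without raw data (such as GA-imported days).
    """
    kept = []
    for day in sorted(requested):
        # Described behaviour: rely on the setting only, and assume raw data
        # exists for the last N days.
        if (
            delete_logs_older_than_days is not None
            and day < today - timedelta(days=delete_logs_older_than_days)
        ):
            continue
        # Proposed behaviour: never invalidate a day that has no raw data.
        if days_with_raw_data is not None and day not in days_with_raw_data:
            continue
        kept.append(day)
    return kept
```

With log deletion disabled and no raw-data set passed, every requested date is kept, which is the case where an imported archive could be invalidated and then lost on the next archiving run.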

tsteur commented 4 years ago

It maybe wouldn't work for bigger periods like month and year, though.

> Unfortunately wouldn't work because Roll-Up reporting aggregates the data from the logs and we don't have logs for GA imported data @tsteur

I know, but it would be easy (2 lines or so) to get the archiver to aggregate the archives even for day periods if a roll-up contains such a site. It would maybe need some extra setting though, as sometimes you would just want to ignore that site and aggregate raw data anyway.

mackaaij commented 4 years ago

I switched from Google Analytics to Matomo in 2018 and would like to merge the site created by the importer with my existing Matomo site. I'm currently in the process of importing step-by-step due to the Google Analytics API rate limit.

As I didn't run Matomo side by side with Google Analytics, there is very little overlap:

I don't know much about the technical difficulties here but is it somehow possible to merge the two sites?

I read about the Site Migration plugin but that seems to fit another use case. In the forums people talk about the Migration plugin for cases that seem similar to mine, would that work?

diosmosis commented 4 years ago

@mackaaij I did a test using the RollUpReporting plugin (https://plugins.matomo.org/RollUpReporting) with a site with imported data and a site with tracked data and it seemed to work out. It's not a free solution, but it should work to provide a combined view. Note: you'll have to create a new Roll-Up site and include the two other sites, then run archiving for the new Roll-Up.

mackaaij commented 4 years ago

Thanks but for me it's the same single site, not three. And an additional plugin. And additional bills (yearly) to pay. I'd be OK with choosing an exact date to take away issues of overlap, in my case May 22, 2018 is Matomo.

ampaze commented 4 years ago

I have the same problem as @mackaaij.

Isn't the whole point of the Google Analytics Importer to be able to compare old (Google Analytics) and new (Matomo) data for the same website?

Is that possible with data spread over two siteIds?

diosmosis commented 4 years ago

@ampaze the only solution at present is to use RollUpReporting.

mattab commented 3 years ago

> @ampaze the only solution at present is to use RollUpReporting.

FYI, unfortunately that wouldn't work because Roll-Up reporting aggregates the data from the logs and we don't have logs for GA imported data.

@diosmosis as this issue is confusing many people and causing support requests, would you be able to estimate the effort of implementing this feature, so we know whether we can schedule the work in the future? Thanks!

diosmosis commented 3 years ago

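@mattab the work involved would be:

  • create new form sections for mapping goals/custom dimensions (which includes a checkbox that would trigger an api request to do GA queries, the UI for mapping goals/custom dimensions)
  • handling the new input and marking an import as into an existing site (handling means sending the mapping information and loading it into existing option data before starting an import)
  • validating the mappings (just in case custom dimension values in GA can't be imported into a matomo custom dimension, etc.)
  • manual testing
  • a refactor to ImportStatus since it's getting very complicated and this will just add more complexity

I'd say 2-4 days if I spent all my time on it.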

tsteur commented 3 years ago

btw there might be also some things to look at whether the site already had data for those imported periods etc.

I suppose invalidation/ report deletion is fine and doesn't change anything

diosmosis commented 3 years ago

> btw there might be also some things to look at whether the site already had data for those imported periods etc.

In this case I think we'd just import over everything, as is done when re-importing data. If they have tracked data, perhaps we'll have to report it somehow, or just fail and ask the user to delete the visits if they want to replace them with imported data. This could actually be done before importing, just by looking for a single visit within the import date range.
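
A minimal Python sketch of that pre-import check follows. The names `has_tracked_visit_in_range` and `preflight_import`, the `overwrite` flag, and the returned strings are all hypothetical; the visit dates are assumed to be supplied by the caller.

```python
from datetime import date
from typing import Iterable


def has_tracked_visit_in_range(
    visit_dates: Iterable[date], start: date, end: date
) -> bool:
    # any() stops at the first match, so a single visit in the range is enough.
    return any(start <= day <= end for day in visit_dates)


def preflight_import(
    visit_dates: Iterable[date], start: date, end: date, overwrite: bool = False
) -> str:
    """Decide what to do before importing GA data for [start, end].

    Returns one of:
      "import"                   - no tracked visits in the range
      "import-over-tracked-data" - visits exist, but the user chose to overwrite
      "abort"                    - visits exist; ask the user to delete them first
    """
    if not has_tracked_visit_in_range(visit_dates, start, end):
        return "import"
    return "import-over-tracked-data" if overwrite else "abort"
```

Returning a decision rather than raising keeps both options from the comment open: the caller can either report the overlap and continue, or fail and ask the user to delete the visits.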

anton-lava commented 3 years ago

I'm running into this specific problem, and I can confirm that using the RollUp plugin does not work: it just ignores the imported data.

Would love to see a fix for it 😄

miiimooo commented 2 years ago

> @mattab the work involved would be:
>
>   • create new form sections for mapping goals/custom dimensions (which includes a checkbox that would trigger an api request to do GA queries, the UI for mapping goals/custom dimensions)
>   • handling the new input and marking an import as into an existing site (handling means sending the mapping information and loading it into existing option data before starting an import)
>   • validating the mappings (just in case custom dimension values in GA can't be imported into a matomo custom dimension, etc.)
>   • manual testing
>   • a refactor to ImportStatus since it's getting very complicated and this will just add more complexity
>
> I'd say 2-4 days if I spent all my time on it.

@diosmosis could we discuss this in a direct message?

atom-box commented 2 years ago

One user contacted us this week: "My organisation has recently started using Matomo and we're busy importing old data." The user offers to help, because he would like to see this feature realized.

AltamashShaikh commented 2 years ago

@atom-box We have this issue in our pipeline, but unfortunately we don't have an ETA for it. Once we start working on it, we can give you a better ETA.

CG-White commented 2 years ago

Hi, we also have this issue and would like to import data into an existing Matomo site/ID. Would it be possible to indicate some sort of planning ? Cheers!

AltamashShaikh commented 2 years ago

@CG-White We are still evaluating and hopefully will have a plan soon and we can answer your question post that.

CG-White commented 2 years ago

@AltamashShaikh Thanks for the reply. Since so many people are moving away from Google, at least in Europe, I think this functionality would be useful to a lot of customers.

MatomoForumNotifications commented 1 year ago

This issue has been mentioned on Matomo forums. There might be relevant details there:

https://forum.matomo.org/t/google-analytics-importer-import-ga-data-to-an-existing-website-in-matomo/50802/2

MatomoForumNotifications commented 1 year ago

This issue has been mentioned on Matomo forums. There might be relevant details there:

https://forum.matomo.org/t/google-analytics-importer-tracking-new-data-while-still-importing-old-data/51158/1

Birkaransingh94 commented 1 year ago

Another customer has raised this request. Comments below:

When doing the import, if you had set a cutoff-date and later decide you want to import more historical data, that doesn't seem to be possible without it setting up a new website in Matomo. @AltamashShaikh

48design commented 1 year ago

This would indeed be very helpful. Since Google Analytics Imports are very time consuming because of the time limits, we stopped the import for our main website and switched over to Matomo completely. Now there is a gap of about a year that is missing in our statistics and it would be really great to fill it up with Google Analytics data of the past...