airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

New Source: Matomo Analytics #4046

Closed jwest75674 closed 11 months ago

jwest75674 commented 3 years ago

Tell us about the new connector you’d like to have

Our (Canadian) agency leverages Matomo for clients who are restricted from storing data out of country, or clients who require their data to be accessible on-prem.

Describe the context around this new connector

The use case is as diverse as the use case for Google Analytics and its related connector.

Describe the alternative you are considering or using

What are you considering doing if you don’t have this integration through Airbyte?

We'll be forced to write SQL queries against on-premise hosted installations, and/or work through data exports. Neither is super fun, nor friendly to our junior staff.

sherifnada commented 3 years ago

@jwest75674 thanks for the request! which of the endpoints here would you be interested in using?

jwest75674 commented 3 years ago

Absolutely, happy to help!

The Matomo 4.x Reporting API plays a very similar role to the Google Analytics API of the same name. From the perspective of Airbyte, it is a data source, and it is what I have in mind for my own use case.

Next in line would be one of the three Tracking APIs, likely the HTTP Tracking API. I consider this lower priority; however, the Tracking API could allow Matomo to be used as a data destination. This would benefit organizations using Matomo for the reasons mentioned earlier, as well as organizations that use Matomo as a data lake/warehouse.

cheyura commented 3 years ago

Webhook-based? (no/partially/yes) No

Available authentication modes (API key/OAuth/other): The token_auth acts as your password and is used to authenticate API requests.

Security considerations: The token_auth is secret and should be handled very carefully: do not share it with anyone. Each Matomo user has a different token_auth.

Matomo 4 and newer: to generate a token_auth, follow these steps:

  1. Log in to Matomo.

  2. Go to the Matomo Admin through the top menu.

  3. Click on Personal -> Security.

  4. At the bottom of the page, click on “Create new token”.

  5. Confirm your account password.

  6. Enter the purpose for this plugin as a description.

  7. Click on “Create new token”.
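To illustrate how a token_auth generated this way might be used, here is a minimal Python sketch that builds a Reporting API request. The host `matomo.example.org`, the placeholder token, and the helper name `build_reporting_request` are hypothetical; the query parameters (`module=API`, `method`, `idSite`, `period`, `date`, `format`) follow the Reporting API reference linked in this thread. The sketch only builds the request and does not send it. It puts the token in the POST body instead of the URL, on the assumption that this keeps the secret out of access logs.

```python
from urllib.parse import urlencode


def build_reporting_request(base_url, token_auth, method, id_site, period, date):
    """Return (url, post_body) for a Matomo Reporting API call.

    Hypothetical helper for illustration; it performs no network I/O.
    """
    query = urlencode({
        "module": "API",
        "method": method,      # e.g. "VisitsSummary.get"
        "idSite": id_site,
        "period": period,      # e.g. "day", "week", "month"
        "date": date,          # e.g. "yesterday" or "2021-07-01"
        "format": "JSON",
    })
    url = f"{base_url.rstrip('/')}/index.php?{query}"
    # Keep the secret out of the URL: send token_auth as a POST parameter.
    post_body = urlencode({"token_auth": token_auth})
    return url, post_body


# Placeholder host and token; replace with your own instance and secret.
url, body = build_reporting_request(
    "https://matomo.example.org", "PLACEHOLDER_TOKEN",
    "VisitsSummary.get", 1, "day", "yesterday",
)
print(url)
```

The returned pair can be passed to any HTTP client as the URL and form-encoded body of a POST request.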

Sandbox account? No

How to populate the account with data? API, UI

Available streams for sync:

Tracking API: https://developer.matomo.org/api-reference/tracking-api

Reporting API: https://developer.matomo.org/api-reference/reporting-api

Integration supports incremental sync? No

Other information/blockers?

cheyura commented 3 years ago

@sherifnada This connector is something like Google Analytics for websites. The easiest way is to gather analytics from Airbyte.io. The JavaScript tracking code (https://airbyte.matomo.cloud/index.php?module=CoreAdminHome&action=trackingCodeGenerator&idSite=1&period=day&date=2021-07-01&showtitle=1&random=2941) should be inserted into the web pages of Airbyte.io. Then we can call the Tracking API on visits, clicks and so on.

cheyura commented 3 years ago

We have access through the API to the demo site. An API token has been generated for integration-test@airbyte.io, but it is not needed for the demo (public) site.

florent-martineau commented 2 years ago

Any update on this connector?

Dynnammo commented 2 years ago

@florent-martineau Please add a :+1: on the first post: the more there are, the more likely it is to be handled by Airbyte or volunteer contributors :wink:.

mickaelandrieu commented 2 years ago

Hi,

This connector is labeled as "Done" in the roadmap, but I can't find it in the list of sources or destinations.

Do we have any ETA?

Regards

sherifnada commented 2 years ago

cc @YowanR as the person managing connector dev backlog

YowanR commented 2 years ago

@mickaelandrieu Could you point me to where you saw this connector marked as done, please? AFAIK, this connector has not been tackled yet so it's probably a typo from our side :)

norbertorok92 commented 1 year ago

Hi @YowanR, do you have a rough estimate of when you are planning to have a connector for Matomo?

mickaelandrieu commented 1 year ago

They don't have an ETA as no one is working on it AFAIK :(

vchallier commented 1 year ago

Looking forward to this connector!

firehist commented 1 year ago

Hey lovely team, is there any news regarding the ETA of this fabulous connector?

Thanks a lot!! 🚀

mattab commented 1 year ago

Great to see Matomo is the 3rd most popular new connector request!

Question to all: could you maybe give some example use cases of what you'd like to achieve with the Airbyte Matomo source?

As the founder of Matomo, I just wanted to provide some basic information: we have a few different APIs, but these are probably the most useful ones:

  1. Matomo provides a simple way to fetch reports, called the Metadata Reporting API (see api reference). This is relevant when people using Airbyte want to fetch their analytics reports from Matomo. It's basically 1-2 simple API methods that can be used to get any of Matomo's 100+ reports.

  2. There is also the Live API, which lets you request the Matomo RAW data (see faq with details). This is relevant if Airbyte users want to fetch the raw web analytics data (the detailed list of visits and actions, with all the metadata columns).

  3. If Airbyte can connect to databases, including the Matomo database, then Airbyte users might benefit from our Database Schema documentation here: https://developer.matomo.org/guides/database-schema
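As a rough illustration of option 2, a source connector could page through the Live API using the `Live.getLastVisitsDetails` method with the `filter_limit` and `filter_offset` parameters. The Python sketch below is an assumption about how such a loop might look, not an existing Airbyte implementation: `live_visits_params`, `iter_visits`, and the `fetch_page` callback are hypothetical names, and the HTTP call itself is left to the caller so that the example runs without a Matomo instance.

```python
def live_visits_params(id_site, date, limit=100, offset=0):
    """Query parameters for one page of raw visits from the Live API."""
    return {
        "module": "API",
        "method": "Live.getLastVisitsDetails",
        "idSite": id_site,
        "period": "day",
        "date": date,
        "format": "JSON",
        "filter_limit": limit,    # page size
        "filter_offset": offset,  # number of visits to skip
    }


def iter_visits(fetch_page, id_site, date, limit=100):
    """Yield visits page by page until a short (or empty) page is returned.

    fetch_page is a caller-supplied function (hypothetical) that takes the
    parameter dict, performs the HTTP request, and returns a list of visits.
    """
    offset = 0
    while True:
        page = fetch_page(live_visits_params(id_site, date, limit, offset))
        yield from page
        if len(page) < limit:
            break
        offset += limit


# Stand-in for a real HTTP call: serves slices of an in-memory list.
FAKE_VISITS = [{"idVisit": i} for i in range(1, 6)]


def fake_fetch(params):
    start = params["filter_offset"]
    return FAKE_VISITS[start:start + params["filter_limit"]]


visits = list(iter_visits(fake_fetch, id_site=1, date="2021-07-01", limit=2))
print(len(visits))
```

In a real connector, `fake_fetch` would be replaced by a function that sends the parameters (plus the token_auth) to the Matomo instance and decodes the JSON response.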

Happy to help with any question and hope that Matomo can be added as an Airbyte connector :+1:

mickaelandrieu commented 1 year ago

Hi @mattab,

I've decided to consider Matomo just like any other MySQL database (so option 3) and rely on the documented queries and relevant tables as a start.

For now, and if buying a module is not possible, I'd be against a REST API connector because the data available from it is too limited for my needs.

We should document option 3 in both the Matomo and Airbyte docs: I can give a hand :)

Regards

alxsbn commented 1 year ago

Any news about the connector?

atharva47dev commented 1 year ago

Please provide the YAML file if not the connector.

mickaelandrieu commented 1 year ago

Hi, I don't use any specific connector: I self-host Matomo and ingest the data just like any other MySQL database.

Soon, I will release a complete tutorial on how to self host Matomo and configure it for performance and scalability. But I can't share any dbt models as it's not open source stuff :/

atharva47dev commented 1 year ago

Thanks Mickael, it would be super helpful. We also self-host Matomo, but since I am not familiar with its schema, I am having difficulty understanding how to join the various tables to build the report.

mickaelandrieu commented 1 year ago

Hi @atharva47dev,

We started from this page in the official documentation, in case you haven't seen it yet: https://matomo.org/faq/how-to/how-do-i-write-sql-queries-to-select-visitors-list-of-pageviews-searches-events-in-the-matomo-database/
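To give an idea of how the tables fit together, here is a small Python sketch of the kind of join involved: visits (`log_visit`) are linked to actions (`log_action`) through `log_link_visit_action`. It is an illustration under assumptions, not one of our models: it assumes the default `matomo_` table prefix, uses a heavily reduced set of columns, and runs against an in-memory SQLite database with made-up rows so that it works without a Matomo instance. A real installation runs on MySQL/MariaDB, where the driver's placeholder syntax may differ from SQLite's `?`.

```python
import sqlite3

# List the page URLs viewed in each visit of one site, oldest first.
PAGEVIEWS_SQL = """
SELECT v.idvisit, a.name AS page_url, l.server_time
FROM matomo_log_visit AS v
JOIN matomo_log_link_visit_action AS l ON l.idvisit = v.idvisit
JOIN matomo_log_action AS a ON a.idaction = l.idaction_url
WHERE v.idsite = ?
ORDER BY l.server_time
"""


def demo_rows(id_site=1):
    """Run the join on a tiny, simplified copy of the schema (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE matomo_log_visit (idvisit INTEGER PRIMARY KEY, idsite INTEGER);
        CREATE TABLE matomo_log_action (idaction INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE matomo_log_link_visit_action (
            idlink_va INTEGER PRIMARY KEY,
            idvisit INTEGER,
            idaction_url INTEGER,
            server_time TEXT
        );
        INSERT INTO matomo_log_visit VALUES (1, 1), (2, 2);
        INSERT INTO matomo_log_action VALUES
            (10, 'example.org/'), (11, 'example.org/pricing');
        INSERT INTO matomo_log_link_visit_action VALUES
            (100, 1, 10, '2021-07-01 10:00:00'),
            (101, 1, 11, '2021-07-01 10:01:00'),
            (102, 2, 10, '2021-07-01 11:00:00');
    """)
    rows = conn.execute(PAGEVIEWS_SQL, (id_site,)).fetchall()
    conn.close()
    return rows


for row in demo_rows():
    print(row)
```

The same three-table pattern extends to page titles, events, and site searches by joining `log_action` on the other `idaction_*` columns described in the schema documentation.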

I'll ask my boss whether we will someday be allowed to open source some of our models, but tbh we have customised them a bit to handle our custom dimensions, so I'm not sure they would be useful "as is" :/