matomo-org / matomo

Empowering People Ethically with the leading open source alternative to Google Analytics that gives you full control over your data. Matomo lets you easily collect data from websites & apps and visualise this data and extract insights. Privacy is built-in. Liberating Web Analytics. Star us on Github? +1. And we love Pull Requests!
https://matomo.org/
GNU General Public License v3.0
19.73k stars 2.63k forks source link

Provide Postgresql support for Piwik #500

Open anonymous-matomo-user opened 15 years ago

anonymous-matomo-user commented 15 years ago

I want to be able to use a postgres database (instead of mysql).

please see http://piwik.org/faq/how-to-install/#faq_55

Latest update: see the Piwik fork with Postgresql support. Great contribution by Sridhar from the community Keywords: wishlist

robocoder commented 15 years ago

See ticket #425.

mattab commented 15 years ago

Thank you for the suggestion! The piwik team decided to not support Postgresql as it would make the development process much harder for us, for a quite small number of users requesting Postgresql.

Piwik officially supports MySQL only for now.

anonymous-matomo-user commented 15 years ago

Let's go here : http://github.com/klando/pgpiwik/tree/master

If you are interesting, then :

git clone git://github.com/klando/pgpiwik.git

anonymous-matomo-user commented 15 years ago

too bad. I would have loved to see this happen. I would say that people that would use postgres would rival that of mysql.

anonymous-matomo-user commented 15 years ago

Do I interpret right that you expect to have more tickets opened to request postgres? :-) I usually don't even bother to check a mysql-only project, let alone open a bug about it in their project, so your assumption about the size of the postgresql userbase may be extremely skewed.

Apart from letting you know that many people use db other than mysql it is a bit more important to note that even php makes it possible to write db backend independent code, which could support nearly any database backends, including, but not limited to psql and sqlite. I do not believe it would visibly make development harder to use standard SQL and standard DB API, since at a first blick your code is pretty much mysql independent (only a few places may require tuning) and you seem to use backend independent code (at least you include it in the distribution). So it may only required a few changes here and there. Pity I do not like php at all, so I cannot offer to patch it up, and you'd rejct anyway from an outsider. ;-)

So, please consider to use database backend independent code and API. It would be nice and The Right Thing(r)(tm).

robocoder commented 15 years ago

Re-closing. This is a resource issue; not a techical one.

We need people to step up to:

anonymous-matomo-user commented 15 years ago

Replying to vipsoft:

Re-closing. This is a resource issue; not a techical one.

We need people to step up to:

  • implement the changes to core
  • provide ongoing support to end users
  • test releases for regressions
  • maintain the backend as development continues
  • I have implement change to core,
  • I provide ongoing support via mail and via irc freenode #piwik (well, it is irc, ask, wait, wait,wait, answer)
  • I didn't have test yet, but I saw that your are in the pocess of Zendify so, it will be easy.
  • I merge with the svn as soon as I am aware of svnchanges (despite the fact that the ML for svncommits is down, so I have to check from time to time)
mattab commented 15 years ago

A good ressource to read by wordpress team: http://codex.wordpress.org/Using_Alternative_Databases

anonymous-matomo-user commented 15 years ago

Well, since Piwik uses ZF for database handling, it wouldn't be that hard, "just" editing the SQL statements to be compatible with other DBMS. I see further problems in plugin development, there will be coder who make their plugins only MySQL compatible. I don't know how many people here would like to see their Piwik with postgres, but I think it can be done if we work together.

robocoder commented 15 years ago

Blockers:

anonymous-matomo-user commented 13 years ago

I'm keeping MySQL on my VPS just because Piwik requires it. I decided to start support for PostgreSQL in Piwik, the plugin approach seems the right first try. When it stabilizes in my production VPS, I'll report here.

anonymous-matomo-user commented 12 years ago

I strongly ask for Postgres support as MySQL has various bugs and critical problems (like security, speed, scaling).

Supporting any database is very easy. Everybody knows that plain SQL queries within sourcecode are a big risks, because some people know how to write good queries, others don't. Injection and speed issues are unavoidable.

There must be something like http://search.cpan.org/~abraxxa/DBIx-Class-0.08195/lib/DBIx/Class.pm for PHP. Switching shouldn't take that much time, if you still use plain queries in sourcecode, and will greatly improve security and portability.

PS: MySQL is already on it's way to become payware and I doubt that many users would pay for it just to use Piwik :-)

anonymous-matomo-user commented 12 years ago

I'm interested to use Piwik with Postgresql too !

Right now, the only reason why i'm still using Google Analytics is because Piwik don't support pgsql...

Is there any chance to use piwik with alternatives databases soon ?

anonymous-matomo-user commented 12 years ago

Replying to grin:

I usually don't even bother to check a mysql-only project, let alone open a bug about it in their project, so your assumption about the size of the postgresql userbase may be extremely skewed.

+1 for Postgres support (I registered just to reduce the skewing effect)

anonymous-matomo-user commented 11 years ago

Another +1 for PostgreSQL support. Piwik is a really nice piece of software, but I definitely won't install another DBMS so I can use it.

anonymous-matomo-user commented 11 years ago

Another +1 for a non mysql version of Piwik here! PostgreSQL would be great! Also other great software like Drupal is supporting postgresql nowadays.

mattab commented 11 years ago

like we said we are keen to do it if someone submits a patch. There is clearly a lot of interest out there but not yet someone who has the skills and time to invest in such a huge coding spree!

anonymous-matomo-user commented 11 years ago

Reopening, as it appears that the devs are now welcoming patches for this request.

Also +1'ing, because I refuse to deal with MySQL and Oracle anymore. I could switch to Maria DB, but Postgres seems to have all the momentum.

Would love this feature. Without it, I'm going to go looking for another solution.

reedstrm commented 11 years ago

+1 here as well. I've looked at Piwik several times, and mysql has been the deal breaker each time.

anonymous-matomo-user commented 11 years ago

+1 here as well. I'm using Piwik for a volunteer organization, and we are sensitive to the privacy issues posed by for instance Google analytics.

We're using Piwik with mysql locally right now, but all other resources are running against a postgres cluster, with all the bells and whistles, most importantly proper data backup and protection routines. It would be great to get rid of the lone mysql service running locally.

anonymous-matomo-user commented 10 years ago

+1. We have a small VPS that we use for hosting our site (Apache+Postgresql), and it's working quite nicely, but having to add MySQL just for Piwik uses up almost all memory, even if we configure MySQL to use as little as possible.

anonymous-matomo-user commented 10 years ago

+1

MySQL is a big joke. (It never makes me laugh though.)

Unfortunately MariaDB has to have all the MySQL strangeness in order to work as a drop-in replacement. This, however, makes it just as irrelevant as an alternative.

I want to get rid of that installation for the same reasons already mentioned by others:

Jon-IB commented 10 years ago

I'm interested in taking on the work of adding support to PostgreSQL.

The actual amount of SQL that needs to be changed doesn't seem too dramatic. I'm more concerned with the following:

1) There isn't much of a db abstraction layer, as far as I can tell. 2) Plugin developers would need to support both MySQL and PostgreSQL. Or alternatively, each plugin would need meta-data to indicate which database(s) it supports. 3) As noted by others, testing and CI would need to cover both databases.

Most importantly (for me), does anyone on the core team have the time to map out an approach for a database abstraction layer? Once that's done, I think we can handle the pgsql-specific changes.

Also, any thoughts on the other concerns above?

EDIT: Looks like there's some active discussion on this in the mailing list. I'll follow up there.

mattab commented 10 years ago

Thanks for showing your interest in this topic! While we are not actively engaged in this work (yet) many team members are interested in running other DBs with Piwik. the issue is, that it's super complex.

Most importantly (for me), does anyone on the core team have the time to map out an approach for a database abstraction layer? Once that's done, I think we can handle the pgsql-specific changes.

We could map out an approach, but honestly, unless we do the full work, do you think anyone would do it?

If anyone is really interested, yes we could work for a few days / weeks on the basics of DB abstraction layer. But converting the whole codebase to use such layer, is rather huge and complicated task.

If you know any software engineer/developer with 3-4 weeks available and a big brain willing to explode, and a big interest in this work, please let us know here!

Jon-IB commented 10 years ago

(Apologies, I think I misspoke -- it looks like Piwik has a db abstraction layer, in the form of the Db class and the Zend_Db_Adapter. What's missing is a data-access layer, since all of the SQL code is intermingled with the business logic.)

Matt, I can spare a developer for 3-4 weeks, but I doubt we'll have time to do much more than that right now. So I'm hesitant to dive in, given that we might realize after 2 or 3 weeks that we'd need to spend a lot more time on this. If that happened, I would need to find an alternative solution quickly.

One creative solution is to do this incrementally, by updating the code file-by-file to use a new data access layer. This could only work if some of the code could be updated to use a new data access layer, but the rest of the code was left as-is. I'm not sure how exactly to do this, but I'm guessing the two approaches could co-exist, since most or all queries seem to go through the Db class.

This would allow the core team to implement and test an approach that you're happy with, since you'd only have to implement a data-access tier for a few files. That would prove that it works, and your code would give us a template to follow. You wouldn't even need a PostgreSQL database to test those changes; just set up two different MySQL logins with access to the same schema.

Once you're done with the proof of concept, we would come in and move the SQL into a new data-access tier for as much of the code as possible, following your template. Our initial focus would be all of the SQL that is specific to MySQL, since the app would be usable for both PostgreSQL and MySQL if we get through all of that code and leave the rest of the files as-is. But we'd try to get through all of the SQL while we had focus.

I can offer at least 3 person-weeks of work, and hopefully a bit more, which might be enough to get through all of the code. Other developers could help with this as well, if any are available. And any remaining files could be handled by the community or by us as time permits.

I'd be more comfortable diving in if we followed this approach, because we'd be more likely to have a production-quality solution within 3-4 weeks that we could deploy.

The big downside is that the code might end up in a Frankenstein-like state for some period of time, with some code calling to a data-access tier, and other code with inline SQL. You might not consider that to be "production-quality".

Having said that, a data-access tier might help structure the code more cleanly. If so, then the incremental approach would let you migrate the code towards that structure without having to take on all of the work within a single release.

Jon-IB commented 10 years ago

Ah, I see a few classes under the Piwik\DataAccess namespace. Is this the start of a full data-access tier? Or is this a solution for a much smaller problem?

anonymous-matomo-user commented 10 years ago

Please see https://github.com/sri-soham/piwik . This is an (almost) working version of Piwik with PostgreSQL suppport.

Jon-IB commented 10 years ago

Thanks, @low. I saw a link to that repo in another ticket, so our developers have been looking closely at it for the past few days. I myself looked too quickly; I saw a change to a file that commingled a MySQL and PostgreSQL error code (core/DataAccess/ArchiveTableCreator.php maybe?), so I quickly assumed that there was no db abstraction layer in this repo. I assumed incorrectly that @sri-soham was working initially on a proof of concept, without changing the code structure.

One of my developers mentioned that this repo uses DAOs to encapsulate both data access and database abstraction, so I looked more closely. The DAO implementation looks pretty darn solid to me. The pgsql classes extending the MySQL classes to take advantage of common logic. It would be straightforward to add support for additional DBMSes, if that was desirable.

Matt et al, have you been following this work? What do you think of this implementation? If the remaining issues are resolved, is it reasonable to assume that this code or something like it will make it into the upstream repo in the foreseeable future?

mattab commented 10 years ago

Increasing priority to reflect the very high interest of the community in supporting other databases! Please keep commenting.

I am a bit swamped at the moment, but I'll do my best to review the code and give a feedback here in the next week or so. Thanks for your patience and trying to make this happen!

mattab commented 10 years ago

hey Jon! I have posted a message to Sridhar on the piwik-hackers mailing list: http://lists.piwik.org/pipermail/piwik-hackers/2014-February/001452.html

I updated the description of this ticket with: see the Piwik fork with Postgresql support. Great contribution by Sridhar from the community

@Sridhar if you read this, maybe you could update the port to sync with master? I am looking at this diff in particular: https://github.com/sri-soham/piwik/compare/piwik:master...master

If you confirm this is the right place to look, I will discuss the fork with the Piwik team and will come back to you here about our feedback!

Jon-IB commented 10 years ago

Matt, sounds great! Please keep us posted on the discussion with the rest of the team.

We'll reach out to Sridhar on the mailing list.

mattab commented 10 years ago

Have you tried the Postgresql fork and how does it work for you?

@Jon_A @reedstrm @bfg1981 @joril @bugout

anonymous-matomo-user commented 10 years ago

@matt: I have the 2014-01-25 revision in production, it's not perfect but it works. :) I should try updating to the latest revision when I find some time

Jon-IB commented 10 years ago

We're running the latest from sri-soham's master. It's in production on a few test sites.

We made a few minor bug-fixes, so we need to submit a pull request to sri-soham.

We're seeing occasional HTTP 500 error responses on some pages of the admin UI and on some reporting API calls, especially on requests for same-day data. But we think this might be because we're running the code on PHP v5.5. We're going to downgrade PHP and see if that resolves the issue.

Apart from the occasional HTTP 500 errors, everything seems to be working fine. I don't see any HTTP 500 errors on the tracking calls in our server logs.

pirate commented 9 years ago

+1 for postgresql support :)

m13rr0r commented 9 years ago

+1

bfg1981 commented 9 years ago

@mattab Sorry for the late reply, I didn't notice your reply until the +1 by m13rr03. No, I haven't tried the PostgreSQL fork.

Not really a priority right now, since MySQL is already in production on the servers that need Piwik. However it would be nice to get rid of the scattered MySQL installations, and rather rely on the centrally managed PostgreSQL farm. I've put it down as an issue in our backlog, If you're still interested in feedback, I can reasonably expect to get around to deploying a test installation it within 2 months.

Jon-IB commented 9 years ago

We're now running the pgsql fork on about 5000 sites, though all of these sites are fairly low-traffic.

We found a few bugs, though I've only managed to submit one to the fork so far. When things calm down I'll submit the rest.

The code is still running a bit slowly. For example, it takes roughly 8 seconds to run daily, weekly, monthly, and yearly archives for a single site with just a few daily page views. Part of this is because we're putting a decent amount of API load on our servers now. And part is because we partitioned a couple tables so we could delete old data much more easily. Alas, that causes some queries to scan all of the partitions, which slows things down. I could limit those queries to just one or two partitions, but I'm trying to avoid core modifications that aren't legitimate bug fixes.

So it's not as fast as I'd like, but I think that just requires a little attention, which I don't have right now. Otherwise, it is very solid. I feel much more comfortable about expanding this implementation than if it were on MySQL.

I'm still a big fan of the way Sridhar separated the business logic and data access logic. There's still a little work to do there for plugins, but that doesn't seem difficult.

craig commented 9 years ago

Awesome! I'd love to see postgres support! :)

Hadlaskard commented 9 years ago

Hi!

I tried this fork yesterday to get work with PostgreSQL, our favorite DB, but didn't get it work. Does anyone know how to solve the following error I've got after finishing installation process of piwik:

Apache-Log:

PHP Fatal error: Class 'Piwik\Db\DAO\Pgsql\Archive' not found in /srv/www/piwik/core/Db/Factory.php on line 98, referer: http://webgis/piwik/index.php?module=CoreHome&action=index&idSite=1&period=day&date=today

...and following is displayed at every analyzing frame of the piwik-gui:

lockNameForNextIdarchive($table); $Generic = Factory::getGeneric($this->db); if ($Generic->getDbLock($dbLockName) === false) { throw new \Exception('loadNextIdarchive: Cannot get lock on table '. $table); } $this->deleteByIdarchiveName($table, $idArchive, $name1, $name2); ...endless.

I tried some debugging... with no success, it's hard to find out. (SLES 11, PHP Version 5.3.17)

jlmadurga commented 9 years ago

Hi all,

I have installed the postgres fork successfully but I am getting problems loading css. I have just patched with https://github.com/sabl0r/piwik/commit/f7f5296b19316427c34551138b98b639e505546e but I am getting same issue. Any one now how to fix it? (I am deploying in heroku which has php 5.6 I think)

augustf commented 9 years ago

@Runaway82 I used the https://github.com/sri-soham/piwik fork and had the same issue you did. Enabling PHP short tags by default in my php.ini sorted it though.

While I have the fork working with an appropriately dated LDAP plugin version, I've been unable to successfully merge with the latest Piwik -- though that seems mostly to do with architecture changes than the Postgres changes.

@mattab While this Postgres fork doesn't represent the sort of robust abstraction being discussed earlier in this thread, it's a tremendous step forward. I would say that it is at least worth the effort to add it as a branch in the main repository and make a modest effort to keep it in step with master until further work such as @Jon-IB described can be considered.

antwan commented 9 years ago

+1 to confirm the interest for this feature.

lenovouser commented 9 years ago

:+1:

augustf commented 9 years ago

Reading through the tickets mentioned in this thread -- #425, #1046, #1368, #2593, #3418, and #5124 -- it seems like this ticket is really only about Postgres "support" to the extent that it's about having no bizarre MySQL-proprietary queries scattered around. Coming from a Rails background, even the memory of not having totally abstracted queries gives me nightmares.

The ticket really ought to be called "review code for database-layer abstraction," with the main objective being to improve Piwik's code quality, readability (having queries abstracted nicely so it's easier for new folks to contribute), reliability, and ease of maintenance (say when schema changes happen). As a side effect, Postgres (and any other relational DB support) will be either an obvious feature or something trivially added by interested parties.

mattab commented 9 years ago

@augustf Yes, you make a good point here. To have a Postgresql support in Piwik, we would need first to refactor heaps of code. Therefore it would be realistic to create a separate issue for this project "review code for database-layer abstraction". If you are comfortable, feel free to create the issue and even get the ball rolling :+1:

augustf commented 9 years ago

@mattab I can do that. I'm not at all familiar with the code base of Piwik, but I'm assuming that the way the current Postgres fork is working is not readily maintainable going forward?

Either way, though, I suspect you're right; the solution will be to refactor non-standard queries.

mattab commented 9 years ago

I'm not at all familiar with the code base of Piwik, but I'm assuming that the way the current Postgres fork is working is not readily maintainable going forward?

I think both solutions are possible, but clearly maintaining the fork is quite a difficult and tedious thing to do, as we often refactor things around and move files, etc. clearly it's not fun for the maintainer to maintain the fork, nor is it secure for users of the fork since they may miss the latest updates to Piwik including security fixes

so realistically i think the very best way to move forward would be to refactor the code and allow to have two options, one for mysql and one for postgresql. I would roughly estimate this work to take several week of full time development and unfortunately the core team cannot work on this yet. But maybe if the community gets it started then we could join the effort and help :+1:

augustf commented 9 years ago

Ahh-I haven't looked at the diff for the fork. I suppose a library that took in MySQL code and spit out nice generic queries was too much to hope for. As I mention in #7347, I think the first step is finding out the extent of the proprietary queries and going from there.

sri-soham commented 9 years ago

This https://github.com/sri-soham/piwik supports both PostgreSQL and MySQL. It trails the main piwik by an year though.