Open partoneoftwo opened 8 years ago
Hi @thomasfrivold, Thanks for the suggestion. This would be a fairly large change to require people to use PostgreSQL. MySQL is still very popular and is really simple for people to install and use as many are already familiar with it due to it's heavy usage in open source projects such as Wordpress and others. I also like the fact BookStack can currently be used with either MySQL or MariaDB. I would be happy to look into switching it to PostgreSQL but I feel there would really need to be a reason to switch things up.
Is there anything that you can think of that PostgreSQL could specifically bring to BookStack?
In Laravel documentation page there is a statement that it supports:
Would it be possible to choose between those in setup?
@tpetrauskas Yeah, Thanks to Laravel, supporting PostgreSQL would be really simple for most of the database work. We would need add code to provide separate support for setting up full-text searching and it would be an extra system to test on. The current search indexes & Logic is focused on MySQL and I believe PostgreSQL uses a different syntax for full-text queries.
Full text searches are really awesomely implemented in Postgres.
Check out these resources:
I'm also interested in Postgres support.
From the looks of it, either the fulltext search for mysql is not good enough, or it's not used right.
In postgres fulltext search strips words in a language specific way, so search uses the stem of a word. This ensures that searching for "s" will also get results that contain the german "ß". It also strips common filler words, like "a", "and", ...
I'm not sure if mysql can be configured to do that, but postgres does that more or less by default.
As for ease of deployment: Postgres comes with sane defaults, while mysql needs configuring to get there...
Is this still on the table as an upcoming feature?
Since the search relies on mysql this isn't that easy.
The best solution (in my opinion) would be, to switch the search to scout with a basic provider like tntsearch shipping with bookstack. This would decouple the search from the used database and open up for using any database server you want.
Additionally, with the correct configuration, it would be possible to use different search providers like elasticsearch or algolia and so on.
So I guess it is still on the list but rather waiting for a generous PR, since it is not crucial. (see https://github.com/BookStackApp/BookStack/issues/674#issuecomment-361419997 and the following comment by @ssddanbrown)
Thought I'd weigh in on the above, Especially since my last comment on this is fairly old now.
As of v0.16 I re-wrote the search system to not rely on MySQL full text indexes and now a custom indexing system is used which basically just splits the words and scores them in their own indexed table (search_terms
).
Now that we don't rely on any MySQL specific features it does make it easier to connect to other databases. There are still a few areas which use Raw DB queries simply because the complexity is hard to convert to fluent syntax. I had a really quick go at attempting SQLite support recently but came across some issues with the migrations.
Honestly, I don't think adding Postgres/Sqllite support would be too difficult, Maybe just a day of sitting down and working things out. I'm more worried about the added support complexity going forward.
Maybe if we add Postgres/Sqllite support but make it clear that using those may be more unstable/less supported.
Just to answer your question @evitalis, Yes, Still on the table but not urgent as not too highly request and not a massive value add upon what already exists.
I will swap this from 'Open to discussion' to 'Enhancement' now that we don't rely on MySQL full-text indexes.
I'd be willing to put in $$$ if there is a bounty on the addition of SQLite.
For whatever it is worth, I'm sure there are some valuable features in postgres like WAL-e and WAL-g cloud storage Write-Ahead backups.
I started shaving this yak when I wanted to back up the content of my bookstack deployment which uses the Helm chart, that includes MariaDB as a dependency. I had a bit of a scare because the default behavior of the database chart is to generate some passwords and store them in a secret, and there's a bug that causes the secret to be overwritten sometimes, rendering the database inaccessible because the content has been stored with a password in the PVC.
I was able to resolve that by resetting the password using an init container and custom my.cnf temporarily disabling authorization so I could set any password with kubectl exec. Anyway, that got me thinking about other things that could go wrong, and I went looking for backup config in the bitnami chart, didn't find any options that looked quite as nice as my WAL config through postgres.
Are any of you using Kubernetes to host bookstack, and if so do you use on-cluster or off-cluster databases? And in any case how do you keep them backed up?
I understand how to take a backup just one time with mysqldump, but it seems like it would be much better to have it on a fixed schedule ideally recurring automatically.
I don't think my cloud provider does automated MySQL yet, so I'm on my own without some kind of support for backups. (This is my argument in favor of postgres, since it's much more broadly supported across cloud providers, and quite easier to keep perfectly backed up IMHO)
@ssddanbrown Just curious on the status of this. It has been over a year and multiple versions have released since this got switched from open for discussion. I know at least in my case I have no desire to deploy or deal with MySQL but am interested in this application. From discussion in this thread others are also interested, and as noted in #1442 GitHub may be a barrier of entry for some people to give additional input on this.
Hello @evitalis,
This has not really moved to be honest. I think I started looking to support SQLite at some point but came across a bunch of extra edge-cases then I got distracted so it just kind of dropped off.
Being totally open, Over the last year I've become more tired of supporting what I see as horizontal growth of the project (Supporting extra options and variants of what exist) rather than upwards growth (Supporting new features and improvements that benefit existing users).
I understand that this kind of thing would open the project to a bunch of new users, and that's fantastic, especially if those users are eager to use BookStack, but at the end of the day it would add additional maintenance and support effort and this is not a paid-for project where we're trying to get new customers, We're just trying to build a good documentation platform for certain use-cases.
It's kind of ironic that with open source projects, it feels so viable to broaden support and options since the code is just sitting there, out in the open, ready to be changed. Even more so than with close source projects, whereas the closed source options often have more resources to apply.
I mentioned in my last comment that we could add support for extra DBs but "make it clear that using those may be more unstable/less supported". Looking back at that, I think "Would I be happy mentioning them but not fully testing on them? No, I'll still need to test on all DBs when changes are made" and "Will I be anxious at work and feel bad if I see people report migration errors for these DBs, after a release? Of course".
I know that money has been offered above, and that's very generous, but I'm still uncomfortable with having money become a variable in the project and, while that may help implementation, it's the ongoing support & costs (Effort, not monetary) I'm more concerned with.
Don't think I'm discounting this though. This is still something I'd like to see, I'm just not sure we could support it to the level I'd want to right now. In regards to relative interest, This issue is in about 35th place in terms of upvoted requests.
Hope you have a wonderful evening, Dan
Thank you for the feedback but if laravel already supports this what additional code may be needed to get it working? I agree that sometimes horizontal work doesn't feel as fulfilling, but as an open source dev myself I can tell you it still has benefit.
I won't be able to deploy this in my environments while there is no PSQL support but I will certainly keep an eye on this for if it ever gets implemented.
Recommend closing this issue and #367
Bounty has been declined and the added compatibility/db migration and regression testing wouldn't even be welcomed with a PR, per Dan's clarification above. I very much appreciate the sensible and tactful response.
I didn't hear anyone say a PR wouldn't be accepted, but it's not the end of the world if this issue closes without a fix. Thanks for your thoughts.
I'll leave this open, so that it can easily be found by others, and it would be something I'd like to support, just maybe much further in the future. A PR would probably only be accepted if provided by a longer-term contributor that is familiar with the DB being implemented as was committed to being able to provide some level of support going forward.
Thanks everyone for your understanding.
I don't think my cloud provider does automated MySQL yet
Same problem here. The reliance on MariaDB is really making it a pain to deploy.
I have read all the comments on this issue. I understand both sides' arguments, however I'm still for adding PostgreSQL support. It seems like everyone shared a bit of their experience, so I will try to share my practical experience, as well. Recently, I finally set up my own Bookstack for my personal documentation, etc. I am really satisfied with how the software works and how great it is, so I decided to use it as our new documentation centre at our enterprise, as well. Now the thing in my case is
Being totally open, Over the last year I've become more tired of supporting what I see as horizontal growth of the project (Supporting extra options and variants of what exist) rather than upwards growth (Supporting new features and improvements that benefit existing users).
I can respond from my personal perspective, that adding PostgreSQL as an option would definitely benefit me as an existing user. 😄
Now let's be honest: MySQL is known for its supposed "simplicity". However, that is not a strong argument for how beneficial software is, especially in a high performance related area i.e. databases. So I think it would be fine dropping a bit of "simplicity" (again, for others and me MySQL would actually even be more difficult) for a plain better solution (in terms of options available in the SQL world).
P.S.: I use PostgreSQL for all my private servers, if possible. I have never seen a reason to use anything below it, except for super super small amounts of data, but that is a thing for SQLite, not MySQL.
That said, the following is directed at @yebyen.
Are any of you using Kubernetes to host bookstack, and if so do you use on-cluster or off-cluster databases? And in any case how do you keep them backed up?
I will be setting it up in a Kubernetes environment. How did you set up your instance, after all? Regarding your question, I will use an on-cluster database.
@theAkito I have been using kubedb, which for as far as on-cluster database solutions go, makes administering an MySQL or MariaDB cluster or machine about as easy as administering a PostgreSQL cluster... that is to say, you write a couple of CRDs and it is handled for you. I like this better than the built-in Bitnami mariadb that is included in the bookstack stable chart.
I will say also that since I began to follow this issue, DigitalOcean has since added support for MySQL managed instances, I would probably still go with an on-cluster database because it simplifies cost and performance management to have everything in the same interface, but I would rather go for something like KubeDB than a one-off volume managed database. I have more options for Postgres hosting than MySQL, but I can see the value of keeping things the same from a maintainer POV. I would not be as likely to adapt to a change that eschews MySQL entirely for PostgreSQL. Making such a change seems likely to create mistrust from users who were not planning on changing or being forced to change from one database format to another.
The most ideal solution IMHO would be adopting an ORM layer, if one supporting both MySQL and PostgreSQL was available. I don't know enough about PHP as a Rails dev, for us this is generally no-brainer, so long as you stay within the bounds of the safety net that the ORM provides for you. Having no frame of reference for what that picture is like in PHP land, I can't offer guidance on how to resolve this issue permanently, unfortunately. But I can understand too the perspective that all complexity that an ORM with driver architecture adds is unwanted, especially around database features like transactions and maybe something like composite keys that might behave differently from one driver to the next. It seems like the more databases are supported by an ORM, the more potential complexity it adds to the picture, (but supporting multiple database formats without any ORM sounds like a non-starter, from my perspective.)
One way to emphasize any limitations to the support of various databases in an ORM layer would be to add a test suite, that provides coverage only for the database formats that are officially supported. That way even if a database ORM provides support for three or four different databases, a contributor can show that only one or two of them are covered by tests. This strategy has the added advantage that community contributions can be accepted (if your database format is not supported by the project as of yet but happens to work just fine, add your own tests for that format in a PR. Congrats, now you're the maintainer of those tests!)
I think it's hard to strike a balance between "PostgreSQL is the de-facto winner for enterprise databases" and "MySQL/MariaDB is the de-facto winner for hobby projects" or however you see the properties and favorable characteristics of these databases. I love following this project, and I'm glad that it has included support for Kubernetes hosting. If the people maintaining are not interested in using an ORM or switching databases or database access strategies, I won't begrudge it myself. Open source efforts being on a volunteer-only basis, as it was in any case!
Read through, understand its a graft of work to offer separate DB Support, I'm happy to fork/branch and put in some work as we have been on PostgreSQL for a while.
Mysql is a great tool, I grew up on it, it is ideal to serve BookStack as a single serving solution, however, MySQL fails when you start to work in really large recordsets, so lots of tools, solutions, and companies have opted for Postgresql or one of the other solutions.
BookStack is by far one of my favorite projects I've come across in a while, its really well-conceived and highly executed, and if it had some flexibility in the data engine it would open up integration into other people's solutions.
However, this might be against the vision you have for BookStack.
Love what you're doing with the project, support it regardless.
We would also like to see PostgresSQL support. We have been using BookStack for the internal documentation of our project for a long time. In the meantime it has become our favourite documentation tool.
I thought I'd chime in to share my experiences as well.
As a non-enterprise home user, I have been using Postgres for years now. 99.99% of modern open-source projects support it. On a tiny computer, such as the Raspberry Pi, it is very light on resources by default. This is not so with MySQL, which uses far more memory when installed, though this can be configured.
I was very much against installing Bookstack in the past, not because it's low-quality software, quite the opposite in fact, but because it relies on MySQL. Even today, it is the only installed piece of software that uses MySQL and does not offer a different (and arguably better) database alternative.
This is not to say that I have regrets, nothing implements the book/chapter/page hierarchy the way BookStack does after all, not to mention superb accessibility considerations and support, I just wish we had more alternatives when deploying. Use cases can greatly vary and having a few alternatives is especially important when a project is as unique as BookStack.
There are a number of arguments why Postgres, or even Sqlite would be a great addition. Due to more sensible defaults, they are a lot easier to deploy and manage, as well as less resource hungry where resource constraints exist. This is particularly true if one wishes to run multiple BookStack instances on one computer.
If spending too much time on supporting alternative features rather than on the project itself is a concern, I agree with @theAkito, and rather have Postgres support over MySQL as well.
@Flameborn Why I agree postgresql support would be awesome (and actually prefered) there are, sadly enough, quite some applications that only support mysql/mariadb :(
I'd be also interested in Postgresql support.
@fabiscafe I doubt the devs are going to work sooner with more spammy +1 comments. You can just give a thumbs up on the main PR post ;-)
I have explored this a bit recently, and I'll put my discoveries here in case anyone would be interested in implementing it.
I will list only the issues I encountered with a functional main page and user settings. There might be other problematic queries further down the road.
My journey ends here. Hope that it will shred some light into this issue.
Any chances to get Postgres as db backend? For example if we have everything in Postgres on one server, additional process for Mysql could be extra memory costs
Current status on porting to Postgres (will update in realtime):
(Disclaimer: I run the test because it is the only way to know if BookStack is working properly. I have no idea if the test cases cover all cases)
PHPUnit 9.5.24
............................................................... 63 / 894 ( 7%)
............................................................... 126 / 894 ( 14%)
............................................................... 189 / 894 ( 21%)
............................................................... 252 / 894 ( 28%)
............................................................... 315 / 894 ( 35%)
.....................FF........................................ 378 / 894 ( 42%)
.................................................F............. 441 / 894 ( 49%)
...............FF....FFFFFFF...FF.........F.................... 504 / 894 ( 56%)
............................................................... 567 / 894 ( 63%)
............................................................... 630 / 894 ( 70%)
......F.F...................................................... 693 / 894 ( 77%)
............................................................... 756 / 894 ( 84%)
............................................................... 819 / 894 ( 91%)
............................................................... 882 / 894 ( 98%)
............ 894 / 894 (100%)
FAILURES!
Tests: 894, Assertions: 4386, Failures: 17.
I think this is doable, and I can provide extended support if a changeset is finally agreed on. However, this may take a long time as it involves former bad practices (Non-strict SQL) and changes on migration files require extra care.
Quote of previous comment
To be fair, besides postgresql to does also pose general improvement to the project itself :)
Just to be clear @wfjsw, as per my previous comments, even if you get to a working implementation I would not look to merge it at this time. My issue has never been the initial implementation effort and I still don't feel to be at a comfortable point for widening database support.
Of course you are free to maintain a fork with your changes but I will not go out my way to support that fork in any way. Feel free to post a link to your changes/fork when done or ready to share, otherwise I don't really want to distract this thread, with development conversation of that fork, here as I don't want people to misread those updates as official progress for support in the core BookStack project.
I will share my findings during the Postgres adaption and why I think don't feel to be at a comfortable point for widening database support
is an invalid excuse at all.
TL;DR: This is not an improvement, nor a new feature. It is a major BUGFIX for this application currently on life support.
The answer is No. As we are already blessed by the Eloquent and the database-agnostic migration utility, most of the databases can be driven in the exact same way, given you are using the latest version of the database available.
But why do we suffer from these Errors and Failures at the moment?
Let's be clear. You are not officially supporting either MySQL or MariaDB. You are only supporting them without the STRICT_TRANS_TABLES
or STRICT_ALL_TABLES
set, causing them to work in a so-called "tolerant" (compatibility) mode, which makes it not complain about the queries that violate the SQL standard. If Strict-Mode is enabled, the application will fail on it, just like it does on PostgreSQL.
Specifically, your database configuration:
NOT NULL
constraint, which is set by default of the migration engineDEFAULT
values. Some fields may get an 0
even if you do not explicitly tell them to do so.ONLY_FULL_GROUP_BY
is also disabled, which explains some GROUP BY errors I met during the porting process.It allows vague uses of the database in some part of the program, especially the Test Cases, where it create columns that are totally invalid and miss essential information.
This is the only part that requires different handling between MySQL and PostgreSQL. Usually, a MySQL database is initialized with a ci
collation, which stands for case insensitive
. This is helpful in some cases, but we are bringing yet another implication to those already confusing SQL bad behaviors in the long run. Nonetheless, we can adapt a ci
collation on PostgreSQL as a workaround.
It is 2022 now. MySQL 5.6 is long gone. MyISAM is also gone. There is no point keep developing features on this ancient and dangerous foundation, as more features come with more data integrity failures. Once the non-strict mode is deprecated and removed on MySQL the only thing we are left with is broken and unmaintainable code.
There is still time to fix this right now.
@ssddanbrown I wouldn't take what @wfjsw replied with personally on your (excellent) development skills, it's valid constructive criticism and pretty well said. Besides those points IMO Postgres is the superior relational database engine compared to MySQL/MariaDB.
@wfjsw
If there are reasons, with tangible benefits, to support specifically strict mode, for existing mysql/mariadb users, that can be raised via a new feature request if desired for specific discussion. If there are issues that are affecting existing mysql/mariadb users, or known/actual official deprecations on their way, again those can be raised as new issues.
Thank you for saying you don't find my reasoning or concerns here to be valid. I must admit, I found that quite insulting at first, but thinking further I would have probably have the same outside perspective 7 years ago.
To say that extra database support would require no extra effort is completely incorrect in my view, based on my experiences. Yes, Eloquent would take care of much of the actual functional work, but the fact there are currently incompatibilities shows that differences are possible, at those will have to be tested and considered when making any notable changes related to the database. Upon that, support against any Laravel changes would need to be considered, and we'd have to be watching out for potentially significant changes in Postgres also.
The above is just the technical side of things. The much more time consuming element is the social side of support which has really been my bigger concern since my 2019 comment above. The amount of time I spend providing free support to users is significant, and likely outweighs time spent on the code-base. Widening options and support, especially for something like additional database systems, widens support further. It's an extra factor I'd have to consider in each support request or issue raised. It's an extra factor that may be affecting things. As an example, I recently spent hours helping a user fix their corrupted data due to them sharing database file data across different database container versions. I recently spent hours tracking down, and providing a fix, for an issue that specifically occurred with a Ubuntu-specific MySQL update that happened to only cause issues when running in an LXC container while using a ZFS filesystem. My point here, is that support goes a lot wider than technical functionality.
I felt I made my view on accepting a PR previously clear. My last message was primarily to re-iterate my stance since it looked like you were spending time on this, and I didn't want you to be wasting time if your expectation was to get your changes merged. Please keep in mind, pressuring me to widen support, is pressuring me to commit more of my time on this going forward. I have to draw a line somewhere in regards to what I think is sustainable for both me and the project. Recently, after leaving my job last year to focus on BookStack, I have been thinking that maybe this kind of support widening may be viable, but the reality is that I'm currently loosing up to £1k per month in doing this, so may have to change circumstances again, and I don't want to widen the project to a point it won't be sustainable if it goes back to being a weekend/night only side-project again.
@ssddanbrown
Thank you for your response!
According to The MySQL Documentation 5.1.10, the SQL mode is changed in MySQL 5.7.5 to include ONLY_FULL_GROUP_BY and STRICT_TRANS_TABLES by default. The reason that the non-strict mode is no longer recommended is that the assumption of data integrity is broken af for the following two reasons:
NULL
. The database would not complain if invalid data is inserted, and in turn, extra efforts have to be spent on application code to detect these errors. Otherwise, it will turn into catastrophic failure which is hard to debug. 0
come from? And what about this ""
? And this NULL
? They are all different possible values, and users will have no confidence that their input is stored in the database as it is expected.Even if the application is going to stick with MySQL forever, these changes are still worth attention as it is dangerous if in the end, nobody knows whether this field should accept NULL, 0
, or ""
. During the porting process, I have to literally guess from the code and the validation rule to determine the intended behavior of a given field, and the result is not always apparent.
I will not create a separate issue for this topic because based on to the status of this issue, I think discussions on this matter without proposed code are meaningless. Next time I'll bring my code with me so we can review it together.
The Eloquent provides database-agnostic ORM by abstracting SQL implementation details away with these helper functions. It is not a silver bullet to suddenly makes all SQL statements valid across databases, such as *Raw
expressions, aggregate functions, and "
s. However, most of them can be performed within the safe border of Eloquent, and be cross-database-compatible, while the others are still usable with reasonable awareness of the SQL standards.
I have to admit that migrating existing databases from non-strict mode to strict mode may involve tricky procedures that can easily create chaos if some parts are oversighted. But as I have said, sooner is always better than later.
As the problem we are facing here is rather complex, I would not expect an immediate (or an eventual) merge if a PR is finally prepared. Instead, I hope that my PR will serve as a reminder or a road sign should this issue be brought up again in the future. Until now I have not spent too much time yet. The 1.9% failure rate was accomplished in less than 3 hours without coffee.
P.S. On a side note, I think stupid actions deserve paid support.
@wfjsw Thanks for the further info, but you have disregarded my request to open a new issue to discuss this, where the above information would have been much better placed, which also goes against my earlier wish to not distract this thread. A new focused issue would provide opportunity to properly discuss the value and approach for moving to supporting strict mode. If a wide-scale PR was opened up it's likely I'll immediately close it off until those points are discussed, any changes I may want to proceed with are probably thing i'd like to do slowly and surgically.
Any further discussion regarding technical implementation or database strict mode in this specific issue will be removed as it's not really productive to the core request of this issue as far as I'm concerned.
@ssddanbrown I think all relevant info is already placed here, and strict SQL support automatically brings us Postgres and SQLite support thus they are essentially the same epic as well.
If a wide-scale PR was opened up it's likely I'll immediately close it off until those points are discussed, any changes I may want to proceed with are probably thing i'd like to do slowly and surgically.
I'll have to emphasis that it is not possible for a change like this to not be wide-spread. Affirmatively it is possible to gradually change the way the application calls the database, but when it comes to database schema and migrations I'm afraid there is no surgical way.
it's not really productive to the core request of this issue as far as I'm concerned.
I'll have to put a big disagree on this. The strict mode is the key blocker on this issue and the SQLite ones. However, I agree that there is no point in further discussing this as we already have plain facts and the rests are just personal preferences that do not warrant lengthy discussion.
I think at this point it has been clearly indicated by @ssddanbrown that this is not something relevant for the project. I personally think that this would be quite beneficial in the long run, especially via SqLite, but I can accept that it's out of scope.
Since many of us are interested in this, may I suggest @wfjsw to create a fork that offers broader database support, where we could possibly discuss further steps?
Okay, think I'm just gonna have to lock this conversation off, but I will summarize for people coming to this:
If you feel you have valuable input to the core request here, or if you're seeing this a while later and want to check on the status, feel free to message on Discord.
In regards to a fork:
I think you should add support for, or change the database layer to Postgresql for future proofing and ease of developing new exciting features.
http://www.craigkerstiens.com/2012/04/30/why-postgres/
https://blog.lateral.io/2015/05/full-text-search-in-milliseconds-with-postgresql/
https://www.quora.com/What-are-pros-and-cons-of-PostgreSQL-and-MySQL
https://www.digitalocean.com/community/tutorials/sqlite-vs-mysql-vs-postgresql-a-comparison-of-relational-database-management-systems
http://www.craigkerstiens.com/2012/04/30/why-postgres/
How about 2226 interesting posts about how great Postgres is: https://hn.algolia.com/?query=postgres&sort=byDate&prefix&page=1&dateRange=all&type=story
When I switched the data layer from Mysql to Postgresql on my own project, I could immediately begin coding new exciting features into the app. These were great because I didn't have to use libraries or hardcore anything, just simply made calls to the database layer. Stuff like awesome searching capability and many other great features.