mblayman / markwiki

:warning: UNMAINTAINED: A simple wiki using Markdown
BSD 2-Clause "Simplified" License

Search requires file system storage #17

Open mblayman opened 10 years ago

mblayman commented 10 years ago

Problem: Whoosh, the search tool used by MarkWiki, stores its index exclusively on the file system. On cloud platforms like Heroku, which do not provide a persistent file system, search will break when the index disappears.

Solution: There are multiple options:

  1. Make the search feature optional so environments like Heroku could disable it.
  2. Work with the Whoosh project to add a database-backed storage option so that an index could be stored in a relational database like Postgres (which Heroku does support).
  3. Convert MarkWiki's own storage format to a database format, then use Whoosh where it is functional and fall back to something like Postgres' full-text search where it is not.
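
As a rough illustration of option 1, here is a minimal sketch of an opt-out switch. It is not MarkWiki's actual code: the `MARKWIKI_SEARCH` environment variable, `NullSearchEngine`, and `make_search_engine` are hypothetical names chosen for this example, and the real Whoosh-backed engine is represented only by a factory callable.

```python
import os


def search_enabled(environ=None):
    """Return True unless the hypothetical MARKWIKI_SEARCH variable disables search."""
    environ = os.environ if environ is None else environ
    value = environ.get("MARKWIKI_SEARCH", "on")
    return value.strip().lower() not in ("0", "off", "false", "no")


class NullSearchEngine:
    """Hypothetical stand-in used when search is off; it never touches the file system."""

    def add_page(self, name, text):
        pass  # Nothing to index.

    def search(self, query):
        return []  # Always report no matches.


def make_search_engine(real_engine_factory, environ=None):
    """Build the real (file-system-backed) engine only when search is enabled."""
    if search_enabled(environ):
        return real_engine_factory()
    return NullSearchEngine()
```

With something like this, a Heroku deployment could set `MARKWIKI_SEARCH=off` and the rest of the wiki would keep calling `add_page` and `search` without ever creating an index on disk.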

cabalamat commented 10 years ago

I'm surprised Heroku doesn't allow saving data to a file. Key/value stores are very much in vogue, and isn't a directory just a very simple key/value store?

mblayman commented 10 years ago

Well, to be fair, they do allow writing to files. What they are very clear on is that the filesystem is temporary. Since apps are deployed to virtual machines and moved constantly, Heroku indicates that all apps should be prepared to be killed and restarted at a moment's notice. Upon restart, what was written to the filesystem is unlikely to still be there, so data must be persisted elsewhere.

I checked the Google group for Whoosh and found someone who was experiencing exactly the problem that I predicted. Things would seem to work at first, but within 24 hours, his searches would stop working. It's clear that the index must have disappeared right out from under him. He did not read the Heroku docs carefully enough. :)

I emailed the Whoosh author, and he is close to releasing Whoosh 3.0, which will allow for different storage models, so I'm planning to make this change later. I might make search optional if I get the other persistence pieces done. That way someone could conceivably deploy to Heroku, even with search disabled.