freeCodeCamp / chapter

A self-hosted event management tool for nonprofits
BSD 3-Clause "New" or "Revised" License

Elasticsearch as a hard requirement #32

Closed vorpalhex closed 4 years ago

vorpalhex commented 4 years ago

Currently ES is listed in the tech stack. ES is a great tool and it's absolutely best in class for its purpose of enabling powerful search, but as someone who has maintained a whole mess of ES clusters, I can say it also comes with a lot of cost and complexity.

ES requires multiple nodes, its authentication mechanisms are expensive and require enterprise licensing, and there are limited SaaS hosts available. In addition, it adds significant hosting complexity (shard rebalances, Kibana hosting, etc.) and makes development and tests very complex, even with Docker.

If one of the goals of chapter is to be easy and free-ish for small orgs to host, then ES shouldn't be a hard requirement. It can be an upgrade to enable more powerful search for dedicated hosts, but we shouldn't design against ES as a requirement. There are several other ways to enable meaningfully powerful search, including a solid full text search capability in Postgres.
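To make the Postgres option a bit more concrete, here is a minimal sketch of a query builder for its built-in full text search. The table and column names (`events`, `name`, `description`) and the function name are hypothetical, not taken from the chapter codebase; only `to_tsvector`, `plainto_tsquery`, `ts_rank` and the `@@` operator are actual Postgres features.

```typescript
// Hypothetical schema: an "events" table with "id", "name" and "description".
interface SqlQuery {
  text: string;
  values: (string | number)[];
}

// Builds a parameterized query that matches and ranks rows using
// Postgres full text search (to_tsvector / plainto_tsquery / ts_rank).
function buildEventSearchQuery(searchTerms: string, limit = 20): SqlQuery {
  const text = [
    "SELECT id, name,",
    "  ts_rank(to_tsvector('english', name || ' ' || coalesce(description, '')),",
    "          plainto_tsquery('english', $1)) AS rank",
    "FROM events",
    "WHERE to_tsvector('english', name || ' ' || coalesce(description, ''))",
    "      @@ plainto_tsquery('english', $1)",
    "ORDER BY rank DESC",
    "LIMIT $2",
  ].join("\n");
  // User input is passed as a bound parameter, never spliced into the SQL.
  return { text, values: [searchTerms.trim(), limit] };
}
```

The resulting `text` and `values` could be handed to any Postgres client that supports `$1`-style placeholders. On a larger table one would probably also want an index over the `to_tsvector` expression, but that is beyond this sketch.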

By leveraging alternatives, we can significantly lower the barrier to getting involved in development and make small-scale hosting of instances dramatically easier, without compromising MVP features.

chrismgonzalez commented 4 years ago

@vorpalhex thanks for your input! This is definitely feedback to consider when making final decisions on the stack.

jackbravo commented 4 years ago

Here are a couple of good articles on postgres and search:

portenez commented 4 years ago

I'm sure people know, but to be explicit you can buy ES as a service from AWS. This way you pay with money, not with work: https://aws.amazon.com/elasticsearch-service/

ScottBrenner commented 4 years ago

Here's a fun read on Amazon Elasticsearch: "AWS Elasticsearch: a fundamentally-flawed offering"

iansltx commented 4 years ago

-1 on ES as a hard requirement as well, particularly for MVP. While it's easy enough to grab the Docker images for ES, include it in Compose, and get a dev setup up and running, doing the same in an environment that's visible from the Internet is a bit more fraught/easier to get wrong. Or you pay AWS to do it, hope you get IAM roles/VPCs right, and trade some problems for others.

Versus having the fuzzy searching etc. fun that ES provides as an optional add-on, where the base infrastructure requirement for the app is "an app server that runs node" and "a database server that runs Postgres"...AKA fewer things to break :)
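One way to picture ES as an optional add-on is a small backend switch. This is only a sketch under assumptions: the `ELASTICSEARCH_URL` variable name and the function are hypothetical, and nothing in this thread says chapter is configured this way.

```typescript
type SearchBackendName = "postgres" | "elasticsearch";
type Env = Record<string, string | undefined>;

// Picks the search backend: Postgres is the default, and Elasticsearch is
// used only when the (hypothetical) ELASTICSEARCH_URL setting is present.
function chooseSearchBackend(env: Env): SearchBackendName {
  const url = env["ELASTICSEARCH_URL"];
  return url !== undefined && url.trim() !== "" ? "elasticsearch" : "postgres";
}
```

With something like this, a host that runs only an app server and Postgres needs no extra configuration, while a dedicated host can opt in by setting one value.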

H-Plus-Time commented 4 years ago

github.com/valeriansaliou/sonic might actually be a decent alternative to Elasticsearch, given its simplicity, the low bar for search (meetup.com pretty much only does tags and order-independent single word matching), and 'functions on a potato' resource requirements.
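To show how low that bar is, order-independent single word matching can be sketched in a few lines. This is an in-memory illustration with hypothetical function names, not how sonic or meetup.com work internally, and the tokenizer is a simplification that only handles ASCII letters and digits.

```typescript
// Lowercases the text and splits it into words on anything that is not
// an ASCII letter or digit (a deliberate simplification).
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);
}

// True when every query word appears somewhere in the document,
// regardless of word order. An empty query matches nothing.
function matchesAllWords(query: string, document: string): boolean {
  const queryWords = tokenize(query);
  const documentWords = tokenize(document);
  return (
    queryWords.length > 0 &&
    queryWords.every((word) => documentWords.indexOf(word) !== -1)
  );
}
```

For example, `matchesAllWords("meetup javascript", "JavaScript Meetup in Austin")` is true, while a query containing a word that is absent from the text is not matched.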

dmmulroy commented 4 years ago

I agree with @jackbravo and @vorpalhex. Postgres is already a decided part of the stack and it often doesn't get enough credit for what it can handle. We should try to utilize it until we outgrow it. Adding something like ES just increases upfront complexity.

kognise commented 4 years ago

I think one of the express pros of ES is that it can be used to search across all instances of chapter (see #33) - I'm not experienced enough with search tools to know how this plays in, but I think we should consider this when choosing a search tool.

dmmulroy commented 4 years ago

Even at the scale of tens of thousands of organizations I don't think ES buys that much for searching when compared to Postgres, esp. if the main concern is just searching organization name or location.

I'd also like to point out this comment about Postgres being overkill if it's a self-hosted solution: https://github.com/freeCodeCamp/chapter/issues/54#issuecomment-542784605

allella commented 4 years ago

If the intention of the project is to bundle up a simple, one-click container install then is it even possible to include ES? The OP said it's costly and complicated. If the reason we're building a distributed app is because Meetup isn't free then it seems like anything besides a hosting cost is counter to the project.

I'm not experienced with Postgres, but folks I know in town that use it won't stop talking about all the magic it can do, so presumably it has a solid search capability.

Cheukting commented 4 years ago

I also think that ES is good for searching across chapters worldwide but not necessary for local meetup groups. Most of the time Postgres is enough to support a web app.

francocorreasosa commented 4 years ago

I also think using https://github.com/valeriansaliou/sonic could be a great alternative since it is lightweight and easier and cheaper to host.

mc962 commented 4 years ago

I agree with others, and think it would probably be good to wait until there are enough things to search for using ES, before adding it.

I think ultimately having it is a good longer term goal, and I've been having similar ideas about having an ES instance that allows searching across all groups worldwide (like a Google for chapters almost). But (and keeping in mind I haven't checked how much data is in the app yet if any), I don't feel like ES really brings all that much if there are only like 20 records to search through.

jacobbogers commented 4 years ago

a more simple stack

If a relational database is already being used, it might be interesting to reduce the stack by one component (no Elastic). I am not familiar with full text search on MySQL or PostgreSQL, so I can't compare them with Elastic other than through comparisons from online docs.

free auth plugin for elastic

If needed, there is a free authentication plugin for Elastic (or you can write your own auth (or other) plugins in Java in any case): https://readonlyrest.com/free/

ease of use

We have been using Elasticsearch for a year. I think in this case the full ELK stack is not needed if you only want Elasticsearch itself (the search feature via HTTP REST), so no Kibana, no Logstash, etc. We have also written our own management scripts (bash/sed) for Elasticsearch. I like the fact that no driver is needed: you just talk to it via HTTP REST.
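As a rough illustration of the "no driver needed" point, a search is just a JSON body sent to the `_search` endpoint of an index. The sketch below only builds the request and does not send it. The function name and the example index and field names are hypothetical, while the `/{index}/_search` path and the `match` query are part of the Elasticsearch REST API.

```typescript
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Describes a plain HTTP request for an Elasticsearch "match" query.
// Any HTTP client can send it; no Elasticsearch-specific driver is involved.
function buildSearchRequest(
  baseUrl: string,
  index: string,
  field: string,
  text: string
): HttpRequest {
  // Drop trailing slashes so the path is joined with exactly one "/".
  const root = baseUrl.replace(/\/+$/, "");
  return {
    url: `${root}/${encodeURIComponent(index)}/_search`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query: { match: { [field]: text } } }),
  };
}
```

A call such as `buildSearchRequest("http://localhost:9200/", "events", "name", "javascript")` describes a POST to `http://localhost:9200/events/_search`, assuming an instance is listening on that address.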

allella commented 4 years ago

I think everyone is on-board with ES not being part of an MVP / hard requirement. I've posted on #47 and unless a good reason surfaces to keep this open then we'll close it.

QuincyLarson commented 4 years ago

Thanks for everything you all have shared here - especially @vorpalhex for broaching the subject and @jackbravo for the links.

You all have successfully swayed me to the "let's stick with Postgres for now" approach.

If it becomes clear that we need something more powerful later on, we can consider adding Elasticsearch. But for our MVP we shouldn't include it.

allella commented 4 years ago

We'll want to remove Elasticsearch from the README. I'll try to submit a PR that updates the docs with the most recent changes.