Would like a guide for How-To deploy Amundsen in production

jornh commented 5 years ago

Please add points on what you expect from such a guide in a comment below. I will then try to consolidate input and draft up an outline in this comment.

The guide can end up as ~~/docs/deployment.md~~ (or is /docs/owners_manual.md better?)

Initial outline:

- [ ] Basic install of services (in different environments)
  - [x] Docker-compose “vanilla”, but with Gunicorn (WIP #109) ~~data in volumes etc.~~
  - [ ] AWS ECS, original PR: https://github.com/lyft/amundsenfrontendlibrary/pull/216 (or EC2: https://github.com/lyft/amundsenfrontendlibrary/issues/186)
  - [x] Kubernetes helm chart install ~~(convert from Compose using https://kompose.io?)~~ (upcoming PR, see https://github.com/lyft/amundsen/issues/53#issuecomment-538575978 below)
- [ ] Setting up ingest (with or without Airflow, see https://github.com/lyft/amundsen/issues/53#issuecomment-617370073)
  Figure out which parts of this belong in Architecture.md and which in the Databuilder repo?
  - [ ] Compared to Quickstart ingest (https://github.com/lyft/amundsen/issues/75)
  - [ ] Then mention source by source: Extractor(s), Model, Metadata
    - Table Metadata:
    - Users
    - Table Usage: (How it works and why in https://github.com/lyft/amundsen/issues/381#issuecomment-613387814)
    - ...
- [ ] Configuration - custom build of frontend (to not have to maintain a fork we need to get https://github.com/lyft/amundsen/issues/408 transmogrified into proper documentation/tooling)
  - [x] Small tweaks to turn features on/off, adding a logo etc. (mostly done) https://github.com/lyft/amundsenfrontendlibrary/commit/c256115f7d64da121de4ea36ea9c55592c11f9d5 in PR https://github.com/lyft/amundsenfrontendlibrary/pull/255
  - [x] Config of email notification/feedback, done in PR https://github.com/lyft/amundsenfrontendlibrary/pull/291
  - [x] Data preview (integration with SuperSet) - https://github.com/lyft/amundsen/issues/27#issuecomment-517477074 has some draft contextual lead-in and reasoning and a link to an example setup. But ultimately what ticks the box for this is Tao's guide in https://github.com/lyft/amundsen/blob/master/docs/tutorials/data-preview-with-superset.md (or on the https://lyft.github.io/amundsen/ site, search for SuperSet!)
- [ ] Security
  - [ ] Auth - passwords etc.
  - [ ] Secure communication
  - [ ] Production-grade Docker as per Production-ready Docker images (via https://www.youtube.com/watch?v=cDzFm68aMao)
- [x] Backup - initial WiP in https://github.com/lyft/amundsen/issues/53#issuecomment-516159598 below ... current result in https://github.com/lyft/amundsen/issues/381#issuecomment-614534794 - and restore (on K8s) implemented in https://github.com/lyft/amundsen/pull/394
- [ ] Monitoring (statsd etc.?)
- [ ] Handling upgrades
- [ ] ....

joaolcorreia commented 5 years ago

AWS could be common for deployment, possibly using https://aws.amazon.com/ecs/?

jornh commented 5 years ago

Neo4j backup and restore

Install the Neo4j APOC plugin (in a folder next to your example/docker/neo4j/conf/)

    mkdir example/docker/neo4j/plugins
    pushd example/docker/neo4j/plugins
    wget https://github.com/neo4j-contrib/neo4j-apoc-procedures/releases/download/3.3.0.4/apoc-3.3.0.4-all.jar
    popd
    mkdir example/backup

Add volumes for plugins + backup in docker-amundsen.yml:

          volumes:
              - ./example/docker/neo4j/conf:/conf
              - ./example/docker/neo4j/plugins:/plugins
              - ./example/backup:/backup
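
The bind mounts above assume that the host folders already exist in the checkout. A minimal pre-flight sketch, assuming the folder layout shown in this comment; the function name `ensure_volume_dirs` and the `.gitkeep` placeholder file are illustrative choices, not something the repo ships:

```python
from pathlib import Path

# Host folders that the volumes snippet above bind-mounts into the Neo4j container.
VOLUME_DIRS = [
    "example/docker/neo4j/conf",
    "example/docker/neo4j/plugins",
    "example/backup",
]


def ensure_volume_dirs(repo_root, keep_file=".gitkeep"):
    """Create each mounted folder if it is missing and add an empty placeholder file.

    Returns the relative paths of the folders that had to be created.
    """
    created = []
    for rel in VOLUME_DIRS:
        folder = Path(repo_root) / rel
        if not folder.is_dir():
            folder.mkdir(parents=True)
            created.append(rel)
        # The placeholder lets git track an otherwise empty folder (hypothetical name).
        (folder / keep_file).touch()
    return created
```

Calling `ensure_volume_dirs(".")` from the repository root before starting the containers would cover the "non-existing plugin/backup" case; a second call returns an empty list because nothing is left to create.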

Start the containers:

    docker-compose -f docker-amundsen.yml up

Ingest data via Databuilder.

In the Amundsen web frontend, change descriptions. Maybe add owners…

In the Neo4j web console, export the schema and the data:

    CALL apoc.export.cypher.schema('/backup/amundsen_schema.cypher')
    CALL apoc.export.graphml.all('/backup/amundsen_data.graphml', {useTypes: true, readLabels: true})

Delete the Neo4j graph (still in the Neo4j web console):

    MATCH (n)
    DETACH DELETE n

Restore the backup (yep, you guessed it, still in the Neo4j console):

    CALL apoc.import.graphml('/backup/amundsen_data.graphml', {useTypes: true, readLabels: true})

ToDo:

- Figure out where the CLI/cron job should live: as part of metadata, as shell/cron (wrap in Airflow), as Databuilder, or as an Airflow Operator
- Test that adding the volumes works and does not break for non-existing plugin/backup folders in the repo (or add a KeepFolder file)
- Check under what circumstances restore of the schema is needed
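
For the cron-job item, one option is a small script that only builds the export and import statements with a dated file name, and leaves open how they get sent to Neo4j (web console, a shell wrapper, an Airflow task). A minimal sketch: the function names and the date-stamped file-name pattern are assumptions, while the statements themselves mirror the ones used in this comment.

```python
from datetime import date

# APOC options copied from the statements used earlier in this comment.
GRAPHML_OPTIONS = "{useTypes: true, readLabels: true}"


def _data_file(day, backup_dir):
    """Path of the dated GraphML dump inside the container (naming is an assumption)."""
    return f"{backup_dir}/amundsen_data_{day.strftime('%Y-%m-%d')}.graphml"


def backup_statements(day, backup_dir="/backup"):
    """Return the schema export and data export statements for one dated backup."""
    schema_file = f"{backup_dir}/amundsen_schema_{day.strftime('%Y-%m-%d')}.cypher"
    return [
        f"CALL apoc.export.cypher.schema('{schema_file}')",
        f"CALL apoc.export.graphml.all('{_data_file(day, backup_dir)}', {GRAPHML_OPTIONS})",
    ]


def restore_statement(day, backup_dir="/backup"):
    """Return the import statement that loads the data dump written on `day`."""
    return f"CALL apoc.import.graphml('{_data_file(day, backup_dir)}', {GRAPHML_OPTIONS})"


if __name__ == "__main__":
    for statement in backup_statements(date.today()):
        print(statement)
```

Keeping statement generation separate from execution means the same helper could be reused whichever of the options above ends up hosting the job.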

Related: #196 and a Slack thread with some script snippets etc.

jornh commented 5 years ago

@ttannis we lost access to the useful content of former FE issue https://github.com/lyft/amundsenfrontendlibrary/issues/186 referenced in the snippet shown below.

Can that content be salvaged somehow? E.g. will transferring https://github.com/lyft/amundsenfrontendlibrary/issues/186 in a closed state to here do it?


> Basic install of services (in different environments)
> - Docker-compose “vanilla”, but with Gunicorn, data in volumes etc.
> - AWS (ECS PR): lyft/amundsenfrontendlibrary#216 (or EC2): lyft/amundsenfrontendlibrary#186
> - Kubernetes (convert from Compose using https://kompose.io?)

ttannis commented 5 years ago

Transferred that closed issue over: https://github.com/lyft/amundsen/issues/77

jornh commented 5 years ago

Thanks for the quick turnaround on this @ttannis - seems to work nicely!

Also, please extend my thanks to the other Lyft team members for the recent, even more systematic focus on grooming PRs etc. I think that will really encourage more people to contribute going forward!

javamonkey79 commented 5 years ago

@jornh I have amundsen on aws eks + k8s + helm now; I will put up a PR next week with docs; I'm not sure if it will fully fulfill this story, or, if I should put up another one. wdyt?

jornh commented 5 years ago

Great @javamonkey79! I think it should definitely tick the Kubernetes box above (I edited a bit above).

Just push what you think is suitable to cover Kubernetes on its own and we'll figure the rest out later. Once there are some good pieces of content, it's easy to shuffle them around afterwards if needed.

Right now I'm thinking the list above should end up as just a jump list or "annotated ToC" for what the sys-admin would like/need to know. Haven't really figured out how much or little prose will be needed to glue it together... Thoughts are welcome! 😜

javamonkey79 commented 5 years ago

Ok @jornh I've got the PR up here; I've opted to not include the aws setup at this point, as it is tied to our org a bit. I might add it later, if there is enough interest. cc @markgrover @feng-tao

fBedecarrats commented 4 years ago

Great suggestions here! I'd like to emphasize the need for more explicit documentation on how to set up Airflow to handle ingestion into ES after edits in Neo4j. From an outsider's perspective, it remains quite a mystery, although Airflow (or something filling this function) is clearly a fourth microservice that is indispensable for the other three to work.

jornh commented 4 years ago

@fBedecarrats Airflow has its own documentation. So we’ll probably just reference that.

But the gist of it is:

1. Set up Airflow. Depending on how “serious business” this is for you, it ranges from:
   - just pip install it on a box where it can run, probably in a Python virtual environment of its own for good measure.
   - install it in a container-based setup, including a separate Postgres/other database. A popular/easy Docker image until now has been “puckel” (just google “Airflow puckel” and you'll see it), but just recently the Apache Airflow project itself has started making its own official image. Not even sure if it’s still “beta”.
     - there is, by the way, also the option to get Airflow as a SaaS solution from Astronomer or Google
2. pip install amundsendatabuilder and other required dependencies (database drivers) on top of your Airflow.
3. Add your DAGs (the databuilder PyPI package doesn’t include the examples folder from the repo).
4. When you upgrade, make sure to keep databuilder and your Amundsen services in sync regarding compatible versions.
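
In Airflow, the steps above usually end up as a DAG whose tasks run in a fixed order: first load metadata into Neo4j, then refresh the Elasticsearch index from what is in Neo4j. The plain-Python sketch below only illustrates that ordering and the stop-on-failure behaviour, without importing Airflow or Databuilder. `run_in_order` and the two step functions are hypothetical placeholders, not Databuilder or Airflow APIs.

```python
def run_in_order(steps):
    """Run (name, callable) steps one after another and stop at the first failure.

    Returns the names of the steps that completed successfully.
    """
    done = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            # A later step must not run on top of a failed earlier one.
            print(f"step {name!r} failed: {exc}")
            break
        done.append(name)
    return done


# Hypothetical placeholders: in a real setup each would launch a Databuilder job.
def load_metadata_into_neo4j():
    print("extract from the source system, publish to Neo4j")


def refresh_elasticsearch_index():
    print("extract from Neo4j, publish to Elasticsearch")


PIPELINE = [
    ("neo4j", load_metadata_into_neo4j),
    ("elasticsearch", refresh_elasticsearch_index),
]
```

`run_in_order(PIPELINE)` runs both steps in sequence. In an actual DAG, each step would typically be its own task, with the Elasticsearch task depending on the Neo4j one.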

teqonix commented 4 years ago

@jornh Thanks a ton for putting those instructions together - I'm currently investigating how to implement Amundsen and backup / restore was high on the list.

Is there a good way to have ElasticSearch re-index data that was restored into Neo4j? I'm getting search errors after a Neo4j database restore even though I see the expected data post-restore in the Neo4j console.

I found that I can re-run the amundsendatabuilder job on the same data source and my restored data appears on the FE again, but that seems like a hack job.

jornh commented 4 years ago

It’s merely a wishlist 🙂 (with links to the “state of the union”, but luckily I can tick boxes bit by bit). Glad to hear the list is useful to someone. So, thanks for your comment.

To answer your question: Elasticsearch and Amundsensearch don’t have a will of their own on what data to serve. So what you call a hack, re-ingesting/reindexing through Databuilder, is actually the way to update ES data. I think for a, hopefully rare, restore scenario that’s okay. Hope that clarifies...

Do you have ideas for a different way?

stewartbryson commented 4 years ago

> @jornh I have amundsen on aws eks + k8s + helm now; I will put up a PR next week with docs; I'm not sure if it will fully fulfill this story, or, if I should put up another one. wdyt?

I'd be interested in the helm chart.

jornh commented 4 years ago

@stewartbryson see https://www.amundsen.io/amundsen/k8s_install/ + the Amundsen Slack also has a #kube-helm channel for discussion

teqonix commented 4 years ago

Thanks for the clarification, @jornh - if that's the best way to go about a restore scenario, then that works for us. :)

I honestly don't have any other ideas; I barely have the skillset to implement Amundsen, much less understand the inner workings 😅. Again, really appreciate the help and your documentation!

kathleenrice commented 4 years ago

Hi, we have been trying to stand up Amundsen on Kubernetes but can't get the pod for Neo4j to deploy... Did anyone else have this problem?

dorianj commented 4 years ago

I'm going to pick this up. I think this will be a nontrivial project, mostly in the form of soliciting feedback from the community. Part of the appeal of Amundsen is its flexibility: there's no one right way to install it. However, for a guide to be broadly useful, I believe it needs to have concrete steps. As a result, we'll need to make some opinionated decisions in order for the guide to be useful.

Here's how I'm planning on structuring this project:

1. Create a skeleton of docs following @jornh's already-excellent outline. I will fill in some of the "easier" details, and will leave anything nontrivial with as specific a TODO as I can. I will solicit community feedback on this doc in a PR. I'd like to land it in a feature branch.
2. Based on feedback in (1), I will modify the structure if needed. Additionally, I will fill in nearly all of the TODOs, including ready-to-run commands. I will open another PR and invite another round of community feedback (now that there is more substance to disagree with 😄 )
3. Once I address the feedback from (2), I will ask for one final round of feedback. In particular, I'd like to get at least one community member to run through all of the instructions command-by-command to ensure that it actually does what it says on the tin. At this point, I would like to merge it to mainline and promote the guide on the main readme. It will not replace installation.md (that guide is appropriate for someone who is just trying to get the thing working without source control or customizations), but will instead supplement it.

If anyone has thoughts about this process, happy to hear.

There's some question as to which docs should be in the top repo vs service repos. My only strong feeling is that there be a single top-level doc that one can follow and find everything they need. Procedurally, it's much easier to make changes to the docs if they're all in one repo, rather than scattered between them. And given that the individual components aren't super useful when used independently, I default to just putting it into the larger repo. Open to feedback.

jornh commented 4 years ago

@dorianj that sounds like an awesome plan! I'll refrain from giving more feedback until you have passed step 1. 😉

dorianj commented 3 years ago

hey -- we've packaged some of the learnings from this thread and other places into a recommended pathway https://medium.com/stemma/amundsen-deployment-best-practices-740a1800518e -- would love anyone who's worked through this stuff to try it out and give feedback, we'd like to eventually get this upstreamed into main repo once it's better battle tested

fBedecarrats commented 3 years ago

Hi, we finally decided to start working with Apache Atlas. I guess we'll consider later adopting Amundsen as an alternative front-end.

corridordigital commented 3 years ago

Does anyone use Ansible roles for deploying and managing Amundsen? I could share mine if that is of any interest (on-premise compose installation).

riteshmk commented 3 years ago

@dorianj, could you guide me on installing Amundsen without Docker? Docker being paid for commercial use, or requiring an enterprise license, would take away the benefits of open-source usage for enterprises.

Any suggestions? I'd appreciate support here.

korjavin commented 2 years ago

A year has passed. It is still not clear how to set up auth.
