griffithlab / civic-server

Backend Server for CIViC Project
MIT License

Dockerize civic application #701

Open forus opened 2 years ago

forus commented 2 years ago

Hi there, I could not find a Dockerfile in this project, nor a repository in the GitHub organisation that builds Docker images.

Is it already implemented somewhere? If not, would contributors be interested in accepting such a change?

pieterlukasse commented 2 years ago

Hi @acoffman, any pointers on this one? We can start on it soon, if there is nothing else out there. Please let us know! CC @malachig

acoffman commented 2 years ago

Hi guys, thanks for reaching out and apologies for the delayed response. You are correct that there is no Dockerfile for this project. We would be happy to accept such a change; however, you've caught us in a bit of a transitional moment, as we are in the final sprint toward launching V2 of CIViC, which is a new codebase (https://github.com/griffithlab/civic-v2).

While we intend to keep the v1 API alive for some time for backwards compatibility, eventually, the new version will entirely replace this repo. Given that, it may be more productive to Dockerize the new codebase.

It's a fairly standard Rails deployment: nginx proxies requests to Puma over a Unix socket, Sidekiq handles the task queue with Redis backing the queue itself, and Postgres is the primary data store. Systemd unit files and the nginx config are in the new repo, as is a GitHub Actions workflow that builds the frontend and deploys the whole application, which should be a good starting point.

If this is something you're interested in tackling, I can also share notes on the dependencies you'll likely need. Also, you can check out the new version at https://staging.civicdb.org.

susannasiebert commented 2 years ago

To add to what @acoffman said, we would also be interested to hear your use case for a CIViC Docker container. While you will be able to populate your initial database with data from production CIViC, you wouldn't easily be able to pull updates from it or feed the data from your instance back into production CIViC. If you are planning on just running a fully standalone instance, then that wouldn't be a problem, though.

pieterlukasse commented 2 years ago

@acoffman, @susannasiebert thanks for sharing all these details. Good to hear that this project is doing so well and that a new version is coming out!

We are still interested in v1, given that the application we want to integrate with expects the v1 version of the API. Having said that, my question to @acoffman is: how different is the v2 API? I might have to start planning a migration of our code at some point. Can you please also let us know how long v1 will be supported with new data releases? There is a chance that our use case will require us to update our local instance of the CIViC DB every ~3 months or so. If v1 is sunsetting soon, it would be nice to know so we can take that into account.

@susannasiebert given our use case (having a local copy of the Civic DB), can you please help us find a recent dump of the DB that we can use to populate our instance? I found this one, but it is >1 year old: https://github.com/griffithlab/civic-server/blob/staging/db/data.sql.gz ... and if I understand the README correctly, it is just a "sanitized version" for dev purposes. Where can we find a recent dump? Thanks!

acoffman commented 2 years ago

The V2 API is substantially different, as it uses GraphQL rather than the previous REST-like approach. However, we are more than happy to provide guidance on porting your existing queries to the new API if you'd like; it should be relatively straightforward!

We have not selected a precise sunset date yet, but the V1 API will remain up for months to give our existing users time to transition. When we do decide on a date, we will announce it on the site, our Twitter account, etc. That being said, while you can expect the API to continue to function, the data itself will begin to get stale, as V1 is in read-only mode.

The "sanitized" SQL dump contains all the data from CIViC in full. The only sanitization going on there is the redaction of certain user-specific things such as OAuth UIDs, email addresses, etc. You're correct, though, that the copy in the repo is dated at this point. We can generate a new one for you.

pieterlukasse commented 2 years ago

@acoffman thanks for the extra details and the offer to help! I'll make sure to reach out if we get to the V2 migration part of our work!

A new version of the SQL dump would be great for the time being. Please let us know, and we'll test it as soon as we can!

acoffman commented 2 years ago

I have pushed an updated SQL dump, which can be found in db/data.sql.gz.

pieterlukasse commented 2 years ago

Awesome! Thanks for your help @acoffman! We will test it and report back here.

acoffman commented 2 years ago

@pieterlukasse Apologies, I realized I inadvertently pushed a data dump with the V2 schema - that will not work with this app. I have corrected it; please be sure you have commit 3a9bd1e963d4759cdaef72b36d0b72a29c9fdd20 for a compatible file. Sorry if you guys have already started loading it in.

pieterlukasse commented 2 years ago

@acoffman no worries! Thanks for the heads up 👍

pieterlukasse commented 2 years ago

@acoffman the data dump seems to work fine. Thanks again. We'll follow up with a PR soon!

ruslan-forostianov commented 2 years ago

@acoffman We're getting close to a working version here: https://github.com/ruslan-forostianov/civic-server/tree/dockerize

While testing, though, we are seeing different behaviour from the public CIViC server. The link below returns 414 Request-URI Too Large on our version, but not on the public one.

Does this issue sound familiar to you? BTW, we've based our branch on the staging branch. I'm assuming you have the master branch deployed to production?

https://civicdb.org/api/genes/9931,266727,284161,55275,26146,56603,10326,113178,84812,55619,9611,3115,79925,6233,54894,1767,57223,1770,7781,5675,9262,5169,2354,1285,10483,85451,8943,26036,4185,57674,433,202559,28965,9424,1781,2064,10605,79083,113277,23626,29841,91752,80856,9765,79627,3777,1553,29958,6829,9256,126068,8853,57513,3488,148066,56097,25780,80169,10767,64763,23236,100271715,10951,8698,124842,51265,162540,4794,83394,129285,10137,84816,151056,64780,23671,3991,56147,85413,54520,57224,246243,10332,80341,83439,7153,2639,55184,3885,64772,254228,339229,8632,27332,1300,79908,23105,431705,27151,5430,4626,5884,126308,5176,1769,6830,9703,84951,442184,11262,10107,2295,352999,4867,56146,94097,79157,1385,114898,11122,162427,85021,5707,222584,2729,79891,284403,114880,4036,1101,2879,80315,25885,64924,4995,60681,9400,7456,96459,54862,10290,57404,57695,63946,56099,26133,162514,153769,79755,55781,1493,643155,177,92270,8165,6626,134957,11250,285175,340061,22847,1657,51645,7840,64411,23,9208,9590,29940,124989,5626,2867,6217,29785,4214,3562,6492,30817,10749,339327,84532,54867,10040,6510,23332,6293,667,63926,23140,124997,8202,25929,57835,30850,6869,56145,10973,23600,65095,51692,5496,56886,5734,57679,50861,85464,3769,3554,6517,253558,4318,165,23228,58508,26018,11091,2055,5157,221981,23189,2658,2918,26986,6709,9211,4603,169026,441459,169611,221955,285335,79581,54606,114,1499,79872,55669,25897,2675,644150,43,10497,1621,93035,346653,26173,4143,6156,25890,1358,11101,55328,3955,57704,5253,55799,90459,22898,165904,8829,10641,288,10276,2049,9760,56254,81796,57705,3201,401565,5649,25851,26047,9583,6441,222255,55209,8621,3309,55035,56987,64599,1955,8893,84376,2912,7474,2845,83592,55753,1515,22978,8021,23424,10142,2199,727,3559,10174,10730,51196,11280,221037,643236,55904,5708,256130,23660,6674,6595,79009,64816,219623,84629,402573,11253,9223,10643,8727,80723,51373,23218,131034,285313,4634,5742,80256,138311,26872,84892,22984,2167,407738,23225,51101,286148,79690,25981,3786,8767,6091,5998,83860,8895,389677,7348,80005,1558,4957,79776,286234,654463,23223,25924,774,7879,138474,7871,8723,375748,80243,27147,399814,669,138255,51013,23043,221078,51435,155185,5069,27019,3551,84665,7516,203245,36,340745,55287,7038,7410,157285,55069,9881,91074,55806,166336,3757,84132,1795,157570,23303,9644,346389,56163,144132,221120,6602,54436,2200,3671?identifier_type=entrez_id
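For anyone hitting the same 414 before the server side is sorted out, one possible client-side workaround is to split the ID list across several shorter requests. The sketch below is only an illustration: the path format is copied from the request above, while the helper names and the 2083-byte limit (believed to be WEBrick's request-line cap) are assumptions to verify against your own server.

```ruby
# Assumption: the server rejects request lines longer than roughly 2083 bytes
# (believed to be WEBrick's cap); adjust to the limit of your own setup.
ASSUMED_MAX_URI_LENGTH = 2083

# Path and query format copied from the request in this thread.
def genes_path(ids)
  "/api/genes/#{ids.join(',')}?identifier_type=entrez_id"
end

# Hypothetical helper: greedily packs IDs into batches so that the path
# built from each batch stays within max_length.
def batch_ids(ids, max_length: ASSUMED_MAX_URI_LENGTH)
  ids.each_with_object([]) do |id, batches|
    if batches.any? && genes_path(batches.last + [id]).length <= max_length
      batches.last << id   # still fits in the current batch
    else
      batches << [id]      # start a new batch
    end
  end
end

# Made-up IDs purely for illustration.
ids = (1..600).map { |n| 100_000 + n }
batches = batch_ids(ids)
puts "#{ids.length} IDs -> #{batches.length} requests"
```

Each batch would then be requested separately and the responses merged by the caller; how the merged result should look depends on what the v1 endpoint returns for multiple IDs.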

acoffman commented 2 years ago

Based on your Dockerfile, it looks like you're serving the Rails app directly with WEBrick (the default when using rails server, as it is part of the Ruby stdlib). WEBrick is not really intended to be a production-quality web server out of the box, which is likely why you're seeing that issue.

V1 of CIViC is served with the open source version of Phusion Passenger (https://www.phusionpassenger.com/), while V2 is served with nginx proxying to Puma (https://puma.io/), which is a more widely used approach than Passenger nowadays.

While it's generally best practice to put the app server behind a reverse proxy of some sort to handle things like SSL termination, cache headers, and static asset serving, you could potentially expose Puma directly in your Docker container instead of WEBrick and see if that's sufficient for your workload.

Adding puma to your Gemfile and running bundle install should be sufficient for Rails to pick it up, and you should see a message indicating the app is being served by Puma rather than WEBrick when you start it with rails server.

You will also likely want to make sure that you're running rails in production mode rather than development mode, otherwise it will do things like reload the code on each request. Handy for development, not so much for real use.
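A minimal sketch of the Gemfile change described above. The thread does not name a Puma version, so none is pinned here; that is an assumption to revisit against the app's Ruby and Rails versions.

```ruby
# Gemfile (fragment, added to the project's existing Gemfile)
# Assumption: no version constraint is given in the thread; pin one that is
# compatible with the app's Ruby and Rails versions if needed.
gem "puma"
```

After bundle install, starting the app with rails server -e production (or with RAILS_ENV=production set in the container) should run it in production mode, and the startup output should mention Puma rather than WEBrick.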

ruslan-forostianov commented 2 years ago

@acoffman Thanks! Please review PR https://github.com/griffithlab/civic-server/pull/704