CovertLab / wcEcoli

Whole Cell Model of E. coli

MongoDB setup instructions outdated #792

Closed ggsun closed 4 years ago

ggsun commented 4 years ago

It looks like MongoDB updated their website's interface and the instructions given in wholecell/fireworks/README.md to retrieve hostnames and port numbers are no longer valid. We were able to get around this by creating a new database from an existing account (which still supported the old interface) but we should update the instructions for any new team members in the future.

1fish2 commented 4 years ago

Is this doc now sufficiently up to date on mlab changes? It does mention the pymongo client change that requires adding:

mongoclient_kwargs:
  retryWrites: false

to work with an older mlab MongoDB server. (Is there a way to update those servers? Recreate them?)

ggsun commented 4 years ago

@1fish2 Sorry for not getting to this comment earlier! The issue here has more to do with new members being forced to use a web UI that is completely different from what the documentation was based upon. I was able to follow steps 1-10 just fine despite the UI changes (although these will need updates since the names and colors of many buttons have changed) but step 11 was problematic because the new interface does not seem to display this information (hostname and port) anywhere.

1fish2 commented 4 years ago

Ah, mlab.com won't create any new accounts. It points to MongoDB Atlas instead.

It was not so easy to get it to work. @ggsun , please try this out. If it works, I'll edit the .md file and requirements.txt.

Example my_launchpad.yaml file for Atlas:

authsource: admin
host: mongodb+srv://wc_ecoli:«REDACTED»@cluster1.9zutr.azure.mongodb.net/ecoli1
logdir: null
mongoclient_kwargs: {}
name: null
password: null
port: null
ssl: false
ssl_ca_certs: null
ssl_certfile: null
ssl_keyfile: null
ssl_pem_passphrase: null
strm_lvl: INFO
uri_mode: true
user_indices: []
username: null
wf_user_indices: []

@prismofeverything do you know of an easier way to use Atlas? There's no obvious way to make a single-server DB rather than a 3-server cluster, which seems to mean we need the dnspython pip and the rest of this complexity.

prismofeverything commented 4 years ago

Hey Jerry,

That does look pretty rough... part of the problem of depending on a free service, I guess. We could deploy mongo and run it ourselves, just not in the sherlock environment. Perhaps secure the one we have running (or preferably a new one) in our google cloud project so that our processes on sherlock could access it? That would probably be the most stable, though last time I looked into it, it was fraught with its own issues. Getting some kind of $10 a month server somewhere to host it is another option that avoids all the problems with exposing our gcloud servers to the network. These seem like some good options: https://www.techradar.com/news/cheap-vps-hosting-deals

On Wed, Aug 19, 2020 at 12:12 AM Jerry Morrison notifications@github.com wrote:

Ah, mlab.com won't create any new accounts. It points to MongoDB Atlas https://cloud.mongodb.com/ instead.

It was not so easy to get it to work. @ggsun https://github.com/ggsun , please try this out. If it works, I'll edit the .md file and requirements.txt.

  • Create an Atlas account at https://cloud.mongodb.com/ with a strong password (this account will be accessible to the entire Internet)
  • Create a free cluster
    • Pick a nearby data center with a free tier, e.g. Azure in California or AWS in Oregon
    • Cluster tier: M0 Sandbox (free)
    • Pick a cluster name like Cluster1
    • Click "Create Cluster" and wait for it to provision
  • Under "Database Access", do Add New Database User, e.g. wc_ecoli with a strong password
    • Edit the user to add "Atlas Admin" user privileges
  • Under "Network Access", Allow Access from Anywhere, 0.0.0.0/0 (or figure out IP addresses for Sherlock and your computers)
  • Under "Clusters" (or select the Project to see the list of Clusters) click CONNECT

    • Choose a connection method to view how-to:

      • Shell: It shows how to install the mongo shell and the CLI command to connect it to this cluster, e.g. mongo "mongodb+srv://cluster1.9zutr.azure.mongodb.net/" --username wc_ecoli
      • Connect your application:
        • Pick the "driver" "Python 3.6 or later" (that's the pymongo version, not the Python version, and we haven't updated to pymongo 3.11.0)
        • Copy the connection string, e.g. mongodb+srv://wc_ecoli:@cluster1.9zutr.azure.mongodb.net/?retryWrites=true&w=majority
      • To use this cluster with Fireworks:

        pip install dnspython  # needed to access a mongo cluster
        python  # you'll need the URL-quoted password for the DB user "wc_ecoli"

        import urllib.parse
        urllib.parse.quote('YOUR_WCECOLI_USER_PASSWORD')
        exit()

    • Create a my_launchpad.yaml file using an lpad command or by cribbing from mine, further below:

        lpad init -u
        Enter host parameter: mongodb+srv://wc_ecoli:URL_QUOTED_WCECOLI_USER_PASSWORD@clusterf.9zutr.mongodb.net/ecoli1
        Enter ssl_ca_file parameter:
        Enter authsource parameter: admin

  • If you want to run another workflow at the same time [Does anyone do that?], add a DB for it to your mongo cluster:
    • cp my_launchpad.yaml ecoli2_launchpad.yaml
    • Change the DB name at the end of the host parameter, e.g.: mongodb+srv://wc_ecoli:URL_QUOTED_WCECOLI_USER_PASSWORD@clusterf.9zutr.mongodb.net/ecoli2
    • lpad -l ecoli2_launchpad.yaml reset
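The URL-quoting and host-parameter steps above can be sketched in a few lines of Python. This is only an illustration: `atlas_uri` is a hypothetical helper name, and the host and password in the usage example are made-up placeholders. It passes `safe=""` to `urllib.parse.quote` because the default leaves `/` unescaped, which would presumably break a URI if the password happened to contain a slash.

```python
from urllib.parse import quote


def atlas_uri(user: str, password: str, host: str, db_name: str) -> str:
    """Assemble a mongodb+srv URI with percent-encoded credentials.

    Hypothetical helper for illustration; not part of wcEcoli or FireWorks.
    """
    # safe="" also escapes "/", which quote() leaves alone by default.
    return "mongodb+srv://{}:{}@{}/{}".format(
        quote(user, safe=""), quote(password, safe=""), host, db_name)


if __name__ == "__main__":
    # Placeholder values; substitute your own DB user, password, and cluster host.
    print(atlas_uri("wc_ecoli", "p@ss/w:rd", "cluster1.example.mongodb.net", "ecoli1"))
```

The resulting string is what would be pasted at the "Enter host parameter" prompt of lpad init -u.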

Example my_launchpad.yaml file for Atlas:

authsource: admin
host: mongodb+srv://wc_ecoli:«REDACTED»@cluster1.9zutr.azure.mongodb.net/ecoli1
logdir: null
mongoclient_kwargs: {}
name: null
password: null
port: null
ssl: false
ssl_ca_certs: null
ssl_certfile: null
ssl_keyfile: null
ssl_pem_passphrase: null
strm_lvl: INFO
uri_mode: true
user_indices: []
username: null
wf_user_indices: []

@prismofeverything https://github.com/prismofeverything do you know of an easier way to use Atlas? There's no obvious way to make a single-server DB rather than a 3-server cluster, which seems to mean we need the dnspython pip and the rest of this complexity.


1fish2 commented 4 years ago

The issue with Atlas is it builds a cluster of mongo servers even for a free sandbox. I couldn't find a way to make it build a single server.

To connect with a cluster I had to install a pip, URL-quote the password, construct the URL, and use lpad init -u to construct my_launchpad.yaml.
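Since a mongodb+srv:// URI depends on the dnspython package (imported as `dns`), a pre-flight check could catch the missing dependency before a workflow launches. The sketch below is an assumption about how such a check might look; `dnspython_missing` is a hypothetical name, and the `find_spec` parameter exists only so the check can be exercised without touching the real environment.

```python
import importlib.util


def dnspython_missing(uri, find_spec=importlib.util.find_spec):
    """Return True if uri is an SRV-style URI and the dns package is not importable.

    Hypothetical helper for illustration; not part of wcEcoli or FireWorks.
    """
    # Only mongodb+srv:// URIs need DNS SRV lookups; plain mongodb:// URIs do not.
    return uri.startswith("mongodb+srv://") and find_spec("dns") is None


if __name__ == "__main__":
    # Placeholder host; substitute the host value from your my_launchpad.yaml.
    if dnspython_missing("mongodb+srv://cluster1.example.mongodb.net/ecoli1"):
        print("Run: pip install dnspython")
```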

prismofeverything commented 4 years ago

Right, understood. It looks like this is how Atlas works now. Relying on a free service (what was mlab) is always going to have this problem of changing out from under you. I think the most reliable setup would be to install mongo yourself somewhere, either running locally or, in the case of sherlock where that's not possible, run it in a virtual server. It takes the initial mongo setup (which frankly looks easier than these steps), but then each subsequent workflow is just a new db in the same instance. It also makes it easier to explain how to set it up for others, as you can say "point at an existing mongo db instance" rather than walking them through setting up a free service/moving target. In fact, since fireworks is the one that requires mongodb, you can just point them at their docs here: https://materialsproject.github.io/fireworks/installation.html That way, everyone is responsible for their own mongodb access, which is what fireworks itself suggests.

1fish2 commented 4 years ago

Good points. This brings clarity. FireWorks should drop the obsolete mLab part and explain how to connect to Atlas, so I posted that on their forum.

Proposal: A doc link to that forum thread could fix this Issue if it works for @ggsun.

We could further update wholecell/fireworks/initialize.py to assemble an Atlas connection URI and a password into a working my_launchpad.yaml file, including URL-encoding.
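A rough sketch of what that could look like, assuming the same keys as the example my_launchpad.yaml posted earlier in this thread. It is not the actual initialize.py code: `launchpad_yaml` is a hypothetical function, and it renders each value with `json.dumps` (JSON scalars and empty containers are also valid YAML) to avoid assuming a YAML library is installed.

```python
import json
from urllib.parse import quote


def launchpad_yaml(user, password, host, db_name):
    """Render my_launchpad.yaml text for a mongodb+srv URI (hypothetical helper)."""
    # Percent-encode the credentials, including "/" (hence safe="").
    uri = "mongodb+srv://{}:{}@{}/{}".format(
        quote(user, safe=""), quote(password, safe=""), host, db_name)
    # Same keys as the example file in this thread; credentials live in the URI.
    settings = {
        "authsource": "admin",
        "host": uri,
        "logdir": None,
        "mongoclient_kwargs": {},
        "name": None,
        "password": None,
        "port": None,
        "ssl": False,
        "ssl_ca_certs": None,
        "ssl_certfile": None,
        "ssl_keyfile": None,
        "ssl_pem_passphrase": None,
        "strm_lvl": "INFO",
        "uri_mode": True,
        "user_indices": [],
        "username": None,
        "wf_user_indices": [],
    }
    # One "key: value" line per setting, with each value written as JSON.
    return "".join("{}: {}\n".format(k, json.dumps(v)) for k, v in settings.items())


if __name__ == "__main__":
    # Placeholder values; the real password should come from a prompt, not source code.
    print(launchpad_yaml("wc_ecoli", "a b&c", "cluster1.example.mongodb.net", "ecoli1"))
```

Strings come out double-quoted, which differs cosmetically from the hand-written example but should parse to the same values.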

We could add login auth to our GCE mongo server, although I tried once and the instructions didn't work. Setting up a mongo server elsewhere would require that work and more, including ongoing sysadmin and billing maintenance.

ggsun commented 4 years ago

@1fish2 This works great! Thank you so much for getting this to work. I agree that a link to the thread would be sufficient.