Closed by montaguegabe 2 years ago
@montaguegabe wrt logging, you should be able to change the log configuration in conf/log/webserver.conf (which overrides https://github.com/e-mission/e-mission-server/blob/master/conf/log/webserver.conf.sample)
Hope that helps.
wrt resource usage, I run multiple containers on one AWS instance. The instances have different user counts. A distribution of the resource utilization (using docker stats) is below.
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
8b2eab5b23b3 vail-stack_vail-web-server_1 0.74% 221.9MiB / 31.01GiB 0.70% 2.52GB / 2.3GB 10.6GB / 138MB 23
acaf01b32416 pc-stack_pc-web-server_1 0.84% 394.1MiB / 31.01GiB 1.24% 20.8GB / 18.5GB 44.4GB / 205MB 23
8914e02da2f4 cc-stack_cc-web-server_1 0.90% 696.2MiB / 31.01GiB 2.19% 18.4GB / 20.2GB 39.1GB / 205MB 23
c606785f6eb1 4c-stack_4c-web-server_1 0.97% 262.7MiB / 31.01GiB 0.83% 5.8GB / 3.65GB 17.1GB / 205MB 23
4d8020c60ce4 sc-stack_sc-web-server_1 0.79% 179.3MiB / 31.01GiB 0.56% 3.24GB / 3.26GB 16.1GB / 205MB 25
3661242acc75 fc-stack_fc-web-server_1 0.80% 193.4MiB / 31.01GiB 0.61% 7.52GB / 8.77GB 27.9GB / 205MB 23
674c919c4e5c stage-stack_stage-web-server_1 0.96% 400.4MiB / 31.01GiB 1.26% 9.16GB / 10.3GB 33GB / 173MB 23
d1a7d67c2822 prepilot-stack_prepilot-web-server_1 0.84% 532.2MiB / 31.01GiB 1.68% 7.94GB / 2.7GB 20.8GB / 94MB 23
Thanks Shankari! I also need the web app to bind not to port 80 or 443, but to whatever the runtime environment variable $PORT is set to. Is that possible from within a config file right now?
If sed is being used to set some of the environment variables, it may not be working for some reason; I get this:
sed: -e expression #1, char 46: unknown option to `s'
{
"paths" : {
"static_path" : "webapp/www/",
"python_path" : "main",
"log_base_dir" : ".",
"log_file" : "debug.log"
},
"__comment" : "Fill this in for the production server. port will almost certainly be 80 or 443. For iOS, using 172.17.116.118 allows you to test without an internet connection. For AWS and android, make sure that the host 0.0.0.0, localhost does not seem to work",
"server" : {
"host" : "0.0.0.0",
"port" : "8080",
"__comment": "1 hour = 60 min = 60 * 60 sec",
"timeout" : "3600",
"auth": "skip",
"__comment": "Options are no_auth, user_only, never",
"aggregate_call_auth": "no_auth"
}
}
Live reload disabled,
Traceback (most recent call last):
File "emission/net/api/cfc_webapp.py", line 35, in <module>
import emission.net.api.visualize as visualize
File "/usr/src/app/e-mission-server/emission/net/api/visualize.py", line 13, in <module>
import emission.analysis.plotting.geojson.geojson_feature_converter as gfc
File "/usr/src/app/e-mission-server/emission/analysis/plotting/geojson/geojson_feature_converter.py", line 20, in <module>
import emission.storage.decorations.trip_queries as esdt
File "/usr/src/app/e-mission-server/emission/storage/decorations/trip_queries.py", line 15, in <module>
import emission.core.get_database as edb
File "/usr/src/app/e-mission-server/emission/core/get_database.py", line 19, in <module>
config_data = json.load(config_file)
File "/usr/src/app/miniconda-4.8.3/envs/emission/lib/python3.7/json/__init__.py", line 296, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/usr/src/app/miniconda-4.8.3/envs/emission/lib/python3.7/json/__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "/usr/src/app/miniconda-4.8.3/envs/emission/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/src/app/miniconda-4.8.3/envs/emission/lib/python3.7/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Process exited with status 1
State changed from starting to crashed
I updated my webserver.conf with different values than those displayed above, but they do not appear to be in there.
It may have something to do with the mongodb connection string, which is printed on the line directly before the line about sed:
mongodb+srv://usern4me_of_lettersnumbers_underscore:passw0rdoflettersnumbers@cluster0.z56eq.mongodb.net/myFirstDatabase?retryWrites=true&w=majority
sed: -e expression #1, char 46: unknown option to `s'
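For what it's worth, this particular sed message usually means the replacement text contains sed's delimiter character. A minimal reproduction with a placeholder URL (this is a sketch, not the project's actual start script):

```shell
# The connection URL contains '/' characters, which sed reads as the
# delimiters of its s/old/new/ expression, producing "unknown option to `s'".
URL='mongodb://user:pass@cluster0.example.mongodb.net/myFirstDatabase?ssl=true'
# Using a delimiter that never appears in the URL avoids the clash:
echo '"url": "localhost"' | sed "s|localhost|$URL|"
# Note: '&' in a sed replacement is also special (it expands to the matched
# text), so URLs with ?a=x&b=y options need escaping too - one more reason
# to rewrite the config with jq instead.
```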
If I do not provide DB_HOST, and instead push to Docker with a db.conf file, then the error disappears, but it says the db.conf file is "overwritten" and instead looks for a MongoDB instance at some hard-coded IP.
@montaguegabe we do in fact use sed to overwrite the DB host. I remember something about quoting the string properly for the sed to work...
Maybe it is #595?
^ it was that issue. Also, the mongodb+srv protocol is not supported, so I downgraded the connection string format. And am now onto an error that says pymongo.errors.InvalidURI: MongoDB URI options are key=value pairs.
@montaguegabe aha! That's why issues are a good idea. #595 was reported by our internal cloud services team while setting up with documentDB, and it looks like we fixed it by switching to jq. Is that what you did as well?
#set database URL using environment variable
echo ${DB_HOST}
if [ -z ${DB_HOST} ] ; then
local_host=`hostname -i`
jq --arg db_host "$local_host" '.timeseries.url = $db_host' conf/storage/db.conf.sample > conf/storage/db.conf
else
jq --arg db_host "$DB_HOST" '.timeseries.url = $db_host' conf/storage/db.conf.sample > conf/storage/db.conf
fi
cat conf/storage/db.conf
wrt:
And am now onto an error that says pymongo.errors.InvalidURI: MongoDB URI options are key=value pairs.
That error is from: https://github.com/mongodb/mongo-python-driver/blob/a0fe7c03af08adde0c893071e1664b43570b9841/pymongo/uri_parser.py#L337
which is the part that parses the options part (e.g. retryWrites=true&w=majority).
Can you share what the options part looks like after the rewrite?
I just changed the DB username to not have an underscore rather than switching to jq.
Can you share what the options part looks like after the rewrite?
So I saw that too, but I'm still not sure why the URL fails to parse. MongoDB.com gives me two different formats. There is one for Python with MongoDB driver 3.4 and later:
mongodb://<username>:<password>@cluster0-shard-00-00.z56eq.mongodb.net:27017,cluster0-shard-00-01.z56eq.mongodb.net:27017,cluster0-shard-00-02.z56eq.mongodb.net:27017/myFirstDatabase?ssl=true&replicaSet=atlas-a7u6ma-shard-0&authSource=admin&retryWrites=true&w=majority
And there is the invalid one that uses +srv for driver 3.6 or later:
mongodb+srv://<username>:<password>@cluster0.z56eq.mongodb.net/myFirstDatabase?retryWrites=true&w=majority
The latter one tells me that +srv is not supported without an additional DNS-related library installed. The first one is what gives me the parsing error described.
Do you print out the modified URL after the sed has finished running? Because the URL that you listed above (before being put into the DB conf using sed) parses correctly.
$ python3
Python 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 15:59:12)
[Clang 11.0.1 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymongo
>>> pymongo.uri_parser.parse_uri("mongodb://<username>:<password>@cluster0-shard-00-00.z56eq.mongodb.net:27017,cluster0-shard-00-01.z56eq.mongodb.net:27017,cluster0-shard-00-02.z56eq.mongodb.net:27017/myFirstDatabase?ssl=true&replicaSet=atlas-a7u6ma-shard-0&authSource=admin&retryWrites=true&w=majority")
{'nodelist': [('cluster0-shard-00-00.z56eq.mongodb.net', 27017), ('cluster0-shard-00-01.z56eq.mongodb.net', 27017), ('cluster0-shard-00-02.z56eq.mongodb.net', 27017)], 'username': '<username>', 'password': '<password>', 'database': 'myFirstDatabase', 'collection': None, 'options': {'ssl': True, 'replicaSet': 'atlas-a7u6ma-shard-0', 'authSource': 'admin', 'retryWrites': True, 'w': 'majority'}, 'fqdn': None}
>>>
Something in the sed command is still breaking with your URL, I think.
I think I was using the Docker CLI ENV=VALUE format, which splits on equals - if that is the issue then it's a silly mistake on my end. But waiting to see..
@shankari I think the above wasn't the issue (wishful thinking) - it is still broken. Any tips on how I can most easily change and debug the sed command? It is baked into FROM emission/e-mission-server.dev.server-only:2.9.1, is it not? How do I rebuild that container so I can change that line?
You can just create a new version of start_script.sh (https://github.com/e-mission/e-mission-docker/blob/master/start_script.sh) and then change the Dockerfile for the webapp directory to copy the modified version.
e.g. something like
COPY start_script.sh /usr/src/app/start_script.sh
RUN chmod u+x /usr/src/app/start_script.sh
CMD ["/bin/bash", "/usr/src/app/start_script.sh"]
Thank you! I ended up rebuilding the Dockerfile that it draws from (fewer steps to go wrong) and removing the sed commands entirely for now. That causes it to work. However it still crashes due to No such file or directory: 'conf/net/ext_service/habitica.json'
That error is a warning, not a crash. https://github.com/e-mission/e-mission-docs/search?q=habitica.json&type=issues
Will run again and see if that was really the error..
Heroku, I think, may treat warnings as errors, but I just put a habitica.json in there to get rid of it. Now the error is that Heroku only waits so long for a port to be bound, and all the conda installing makes it time out.
Heroku allows longer build times with this link: https://tools.heroku.support/limits/boot_timeout
I have yet to see if this works..
To anyone else looking at this issue thread, Heroku can actually do all the conda installs and set the server up in time if you set the slider to the maximum value of 180 seconds.
However, I must have disabled the sed for the port and host, because it is not binding correctly (the logs say localhost).
I haven't set the environment variable WEB_SERVER_HOST - maybe that is the problem..
I deleted the sed commands surrounding WEB_SERVER_HOST as well. And am on to:
File "emission/net/api/cfc_webapp.py", line 70, in <module>
aggregate_call_auth = config_data["server"]["aggregate_call_auth"]
KeyError: 'aggregate_call_auth'
I added that key from the conf file. Now I am onto the main problem I knew I would face: Heroku apps must bind to a port based on the $PORT environment variable as per https://devcenter.heroku.com/articles/runtime-principles#web-servers
I have created https://github.com/montaguegabe/e-mission-server that overrides the config value and just uses the environment variable. Will update the docker repo to use this repository instead for cloning..
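The override can be sketched like this (the function name and fallback behavior here are hypothetical illustrations, not the fork's exact code): prefer Heroku's $PORT environment variable, and fall back to the value in webserver.conf when it is not set.

```python
import json
import os

def resolve_port(config_path="conf/net/api/webserver.conf"):
    """Port for the web server: $PORT if set (as on Heroku), else the config value."""
    env_port = os.environ.get("PORT")
    if env_port is not None:
        return int(env_port)
    # No $PORT in the environment: fall back to the JSON config file
    with open(config_path) as config_file:
        config_data = json.load(config_file)
    return int(config_data["server"]["port"])
```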
I am now getting a message that says "Listening on http://mm-e-mission.herokuapp.com/:6691/" <- the colon looks like it ought not to be there. Subsequently it fails to create the socket with:
2022-03-29T18:21:47.361284+00:00 app[web.1]: File "/usr/src/app/miniconda-4.8.3/envs/emission/lib/python3.7/site-packages/cheroot/server.py", line 1772, in prepare
2022-03-29T18:21:47.361284+00:00 app[web.1]: raise socket.error(msg)
2022-03-29T18:21:47.361284+00:00 app[web.1]: OSError: No socket could be created -- (('mm-e-mission.herokuapp.com/', 6691): [Errno -2] Name or service not known)
Deleting the trailing slash from the hostname worked, but it appears that it still can't bind. I found this example, https://github.com/mispy-archive/cherrypy-heroku-example/blob/master/app.py, which appears to be related to Cheroot; they use 0.0.0.0 as the host to bind to. Will try that out.
It works! Thanks for the help Shankari!
To summarize my findings: I commented out the sed commands. Before going on to the analysis server, I am going to try to relax the memory requirement.
Thanks @montaguegabe. Looking at this, some changes to the startup script that would make this easier would be:
Is there anything else that the docker container could do better?
Both of those first two sound good! The third one is actually not an issue, as they are currently separate parameters - the problem was me ending the hostname in a slash there. The config files provide more structure, so some people may like them better, but certainly environment variables simplify things from the Heroku perspective.
The thing that would be super helpful is to not have to wait 3-5 minutes for the Docker container to build while debugging this. This would also help with uptime in production settings (if a node fails you have to wait 3-5 minutes until a new node is ready to replace it). I'm not an expert in Docker, so I don't know exactly how to implement this, but right now the conda installation happens when the containers are run, but it probably should happen when the containers are built.
@montaguegabe wrt "not have to wait 3-5 minutes for the Docker container to build while debugging this.", there is such an image (https://hub.docker.com/repository/docker/emission/e-mission-server). Note that the image we are now using is "dev" (FROM emission/e-mission-server.dev.server-only).
It's just that the e-mission-server image is not kept up-to-date, since I don't have a CI pipeline set up for it. Yet. And it is not high enough priority to focus on compared to the other fit-and-finish changes.
So if I understand you correctly, “dev” is the one that installs conda dependencies upon running, and the plain/vanilla Dockerfile installs them upon building?
I'm basically looking for something where I can run the docker image and the only code that is triggered is code that runs the server (not installation code)
@montaguegabe yes, as you can see from the primary dockerfile https://github.com/e-mission/e-mission-docker/blob/master/Dockerfile, the e-mission-server image installs the dependencies at build instead of at runtime.
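Schematically, the difference is just which phase pays for the conda install. This is a simplified sketch (the base image and paths are illustrative, not the actual Dockerfile contents):

```dockerfile
# Build-time install: the conda environment becomes an image layer,
# so `docker run` skips straight to starting the server.
FROM emission/e-mission-server.dev.server-only:2.9.1
RUN /bin/bash -c "cd e-mission-server && \
    conda env update --name emission --file setup/environment36.yml"
# At runtime, only the server start remains:
CMD ["/bin/bash", "/usr/src/app/start_script.sh"]
```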
Hi Shankari, unfortunately when I go to that directory with the Dockerfile and build the image, it not only installs the conda dependencies but also tries to run the server in start_script.sh. I commented out the line that runs the server and moved it to the start_script.sh of the webapp in multi-tier-cronjob, but then I get errors about conda not being found on the PATH. Any recommended solution for this situation? Thanks very much!
@montaguegabe your original requirement was:
I'm basically looking for something where I can run the docker image and the only code that is triggered is code that runs the server (not installation code)
That is exactly what the image does - it bundles all the dependencies into the image and when you run it, it only runs the server. Not sure what the problem is with
it not only tries to install the conda dependencies but also tries to run the server in start_script.sh
@shankari I just tried on a fresh git clone in case I had messed something up. For me at least, I go to the root of the e-mission-docker repo and run docker build -t gabeistesting -f Dockerfile . (-f Dockerfile is just to make sure that the vanilla Dockerfile is used). The command then fails with the following:
> [15/15] RUN ["/bin/bash", "/start_script.sh"]:
#20 0.638 cat: conf/net/api/webserver.conf: No such file or directory
#20 0.640 Live reload disabled,
#20 33.21 storage not configured, falling back to sample, default configuration
#20 33.21 Connecting to database URL localhost
#20 33.21 analysis.debug.conf.json not configured, falling back to sample, default configuration
#20 33.71 Traceback (most recent call last):
#20 33.71 File "emission/net/api/cfc_webapp.py", line 35, in <module>
#20 33.71 import emission.net.api.visualize as visualize
#20 33.71 File "/usr/src/app/e-mission-server/emission/net/api/visualize.py", line 28, in <module>
#20 33.71 import emission.storage.timeseries.aggregate_timeseries as estag
#20 33.71 File "/usr/src/app/e-mission-server/emission/storage/timeseries/aggregate_timeseries.py", line 15, in <module>
#20 33.71 import emission.storage.timeseries.builtin_timeseries as bits
#20 33.71 File "/usr/src/app/e-mission-server/emission/storage/timeseries/builtin_timeseries.py", line 19, in <module>
#20 33.71 esta.EntryType.DATA_TYPE: edb.get_timeseries_db(),
#20 33.71 File "/usr/src/app/e-mission-server/emission/core/get_database.py", line 161, in get_timeseries_db
#20 33.71 TimeSeries.create_index([("user_id", pymongo.HASHED)])
#20 33.71 File "/root/miniconda-4.8.3/envs/emission/lib/python3.7/site-packages/pymongo/collection.py", line 2059, in create_index
#20 33.71 return self.__create_indexes([index], session, **cmd_options)[0]
#20 33.71 File "/root/miniconda-4.8.3/envs/emission/lib/python3.7/site-packages/pymongo/collection.py", line 1919, in __create_indexes
#20 33.71 with self._socket_for_writes(session) as sock_info:
#20 33.71 File "/root/miniconda-4.8.3/envs/emission/lib/python3.7/site-packages/pymongo/collection.py", line 198, in _socket_for_writes
#20 33.71 return self.__database.client._socket_for_writes(session)
#20 33.71 File "/root/miniconda-4.8.3/envs/emission/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1293, in _socket_for_writes
#20 33.71 server = self._select_server(writable_server_selector, session)
#20 33.71 File "/root/miniconda-4.8.3/envs/emission/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1278, in _select_server
#20 33.71 server = topology.select_server(server_selector)
#20 33.71 File "/root/miniconda-4.8.3/envs/emission/lib/python3.7/site-packages/pymongo/topology.py", line 243, in select_server
#20 33.71 address))
#20 33.71 File "/root/miniconda-4.8.3/envs/emission/lib/python3.7/site-packages/pymongo/topology.py", line 200, in select_servers
#20 33.71 selector, server_timeout, address)
#20 33.71 File "/root/miniconda-4.8.3/envs/emission/lib/python3.7/site-packages/pymongo/topology.py", line 217, in _select_servers_loop
#20 33.71 (self._error_message(selector), timeout, self.description))
#20 33.71 pymongo.errors.ServerSelectionTimeoutError: localhost:27017: [Errno 111] Connection refused, Timeout: 30s, Topology Description: <TopologyDescription id: 6244a5016c2a6bdb415bb86f, topology_type: Single, servers: [<ServerDescription ('localhost', 27017) server_type: Unknown, rtt: None, error=AutoReconnect('localhost:27017: [Errno 111] Connection refused')>]>
------
executor failed running [/bin/bash /start_script.sh]: exit code: 1
To me, this looks like the build step is trying to run the server, not just set up the image. This is important for us because, on Heroku, it is the difference between $500/month for the expensive remote instances that can handle the installation steps and the $0/month free tier (although of course we will scale up from this).
Actually after updating my Docker it works fine.. sorry about that - am investigating. Thanks again for your help!
Just an update:
I think I didn't really do a "fresh clone" like I said before. With that solved, I can now build the vanilla Dockerfile, then reference it from the Dockerfile in the multitier-cronjob example and build that one too. When I go to actually run the final result, it runs the command conda env update --name emission --file setup/environment36.yml, and I see "Requirement already up-to-date", meaning that the image already has the dependencies installed (like you said). However, it seems that just calling conda env update --name emission --file setup/environment36.yml is enough to consume over 1 GB of memory. For our case, I will have to investigate a way to not have that command called when the image is run.
@montaguegabe Phew! I thought I was in for an intense bout of troubleshooting.
The dockerfiles in the multitier-cronjob use a custom start_script.sh which first clones the server. If you remove that (since the server is already installed), you will not redo the setup steps:
https://github.com/e-mission/e-mission-docker/blob/master/examples/em-server-multi-tier-cronjob/webapp/start_script.sh#L2
Hi Shankari, as far as I can tell, the conda issue is still there. I get an error when running the image:
setup/activate_conda.sh: line 6: /usr/src/app/miniconda-4.8.3/etc/profile.d/conda.sh: No such file or directory
This is because the Dockerfile runs build CMDs as root by default, so I can see it installs miniconda to /root/miniconda-4.8.3/. However, when activate_conda.sh is called as part of running the image, $HOME is apparently set to /usr/src/app, so it looks for miniconda there and doesn't find it.
May be related to this from Heroku's documentation: "We strongly recommend testing images locally as a non-root user, as containers are not run with root privileges on Heroku."
Yes that is probably it. There are some examples of building the docker container as a non-root user (https://stackoverflow.com/questions/67261873/building-docker-image-as-non-root-user).
Basically, you either need to make the install location match where the runtime looks, or vice versa: either install miniconda under a fixed location such as /usr/src/app, or ensure that $HOME at runtime resolves to /root/ where it was installed. I don't think you need to change users otherwise, since we only use $HOME to configure the miniconda location.
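For example, the first option could look roughly like this (a hypothetical sketch, not the project's actual Dockerfile): pin the miniconda prefix instead of deriving it from $HOME.

```dockerfile
# Install miniconda to a fixed prefix so the path stays valid when Heroku
# runs the container as a non-root user with a different $HOME.
ENV MINICONDA_PREFIX=/usr/src/app/miniconda-4.8.3
RUN bash /tmp/miniconda.sh -b -p $MINICONDA_PREFIX
# activate_conda.sh can then source $MINICONDA_PREFIX/etc/profile.d/conda.sh
# instead of $HOME/miniconda-4.8.3/etc/profile.d/conda.sh
```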
@montaguegabe did you get this to work?
Hi Shankari! Yes it worked for the webapp! I just had to make sure the conf was being copied to where python was expecting it to be. I just have to set up analysis now..
I'm assuming that to be able to fill out push.json I will need to deploy the frontend first.
Well, you don't technically have to deploy the frontend, but you do need to create a firebase account. And you can't test it until you build the frontend, so you might as well get that to work first :)
@montaguegabe any updates on this? want to make sure that there isn't anything pending that I don't know about.
Hello,
I will be documenting my process of attempting to deploy e-mission on Heroku here.
What I have learned so far:
Heroku recommends Docker over their buildpack system for deployment of any Anaconda projects
I'll be using the Heroku container registry
First clone the docker repo:
git clone https://github.com/e-mission/e-mission-docker.git
Then cd to the place with the web-app Dockerfile: examples/em-server-multi-tier-cronjob/webapp/.
Then follow the instructions at https://devcenter.heroku.com/articles/container-registry-and-runtime#getting-started to deploy the Dockerfile. The first time you run the commands, they will fail because by default the dynos do not have enough memory: installing all the conda packages requires a little over 1 GB of memory, so you will need to run a dedicated Performance-M dyno with 2.5 GB of memory (at least for this step).
Another thing preventing the instructions from succeeding will be configuration of the MongoDB database. I will use a MongoDB database from mongodb.com for now, which makes it easy to provision a DB and get connection parameters to it. These connection parameters should be joined into a connection URL and set in the Heroku dashboard (App -> Settings -> Reveal Config Vars).
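For reference, the "getting started" steps from that Heroku page boil down to a few CLI commands, run from the directory containing the Dockerfile (the app name below is a placeholder):

```shell
heroku container:login                       # authenticate Docker against Heroku's registry
heroku container:push web -a my-app-name     # build the local Dockerfile, push it as the web process
heroku container:release web -a my-app-name  # release the pushed image so the dynos run it
```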
Still to figure out: logging. Heroku dynos only offer ephemeral storage (e.g. /tmp/), so I believe the default log location may trigger an exception. However, to see the logs, Heroku requires them to be written to stdout (not to any file); then they become visible to heroku logs -a my-app-name --tail and other logging add-ons.