usanchez closed 5 years ago
Hi, sounds like you are running into end-deploy stage administration issues. There are several issues you have to resolve:
1) Now that you have deployed the cluster you need to deploy the code. Use awsebcli (the AWS Elastic Beanstalk Command Line Tool) to deploy code versions to the cluster. This requires some knowledge of AWS, but setup is fairly easy, and the command-line tools are necessary for managing Elastic Beanstalk clusters in general.
2) You will then need an ssl cert set up on the Elastic Load Balancer. Beiwe absolutely requires an SSL certificate before it lets you do anything. (The Apps will not even connect to non-SSL endpoints.) SSL certs require a url and extra configuration that is not automated by the cluster deployment script.
3) The rewrite rule: the Beiwe server will redirect any request to https. Https uses port 443. Port 443 is not opened automatically on the load balancer's security groups. This is an oversight and a bug. Resolve 1 and 2 above (complex) and then resolve this (should be straightforward in the AWS EC2 web console) and you should be up.
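A minimal sketch of the port 443 check from point 3, assuming the ingress rules have already been fetched and are shaped like the `IpPermissions` entries that boto3's `describe_security_groups` returns. The function name and the sample rule sets are hypothetical, and the sketch ignores which source ranges each rule allows.

```python
def allows_port(ip_permissions, port=443):
    """Return True if any ingress rule covers the given TCP port."""
    for rule in ip_permissions:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports.
            return True
        if protocol != "tcp":
            continue
        # Rules without port bounds fall through as "not covered".
        if rule.get("FromPort", -1) <= port <= rule.get("ToPort", -1):
            return True
    return False


# Hypothetical rule sets, for illustration only.
http_only = [{"IpProtocol": "tcp", "FromPort": 80, "ToPort": 80}]
with_https = http_only + [{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443}]

print(allows_port(http_only))   # port 443 not opened
print(allows_port(with_https))  # port 443 opened
```

If the first case matches what the load balancer's security group looks like, adding an inbound HTTPS rule in the EC2 console is the fix described above.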
Thanks Eli,
I followed the steps in the Wiki, but I don't know if I missed anything:
I followed this tutorial to install awsebcli, so I could use the eb deploy command. I followed all the commands in the Set up the AWS Elastic Beanstalk Command-Line Interface (EB CLI) section of the wiki.
Following the Configuring SSL section, I got an ACM certificate and deployed the app (no change was required in 01.config, because the code was already uncommented).
I don't know, but I think that I have already done that (correct me if I'm wrong). Load Balancers > Actions > Edit listeners > Change HTTP to HTTPS and attach the ACM certificate. Is it enough?
Should I SSH any instance to check if Apache is running or any other thing?
Thank you very much!!
If you run the eb ssh command it will ssh you into your Elastic Beanstalk server. Check /var/log/httpd and see whether your server is hit when you go to the url.
If your url is not configured to point at your elastic load balancer you need to configure it to do so using an AWS Route53 alias.
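One way to check whether the url already points at the load balancer is to compare what the two names resolve to. The sketch below is only an illustration: the function names are hypothetical, the resolver is injectable so the logic can be exercised without network access, and it tests for overlap rather than equality because a load balancer's DNS name can resolve to several addresses.

```python
import socket


def resolve_ips(hostname, resolver=socket.gethostbyname_ex):
    """Return the set of IPv4 addresses a hostname resolves to."""
    _name, _aliases, addresses = resolver(hostname)
    return set(addresses)


def points_at_same_hosts(domain, elb_dns_name, resolver=socket.gethostbyname_ex):
    """True if the two names share at least one resolved address."""
    domain_ips = resolve_ips(domain, resolver)
    elb_ips = resolve_ips(elb_dns_name, resolver)
    return bool(domain_ips & elb_ips)
```

Called with your own domain and the DNS name shown for the load balancer in the EC2 console, a False result suggests the Route53 alias is missing or not yet propagated.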
Hi Eli,
I have something that is not correctly configured, because I cannot access the web yet. Ignoring the DNS part now, if I go to the IP address of the Load Balancer of the environment (sth like awseb-AWSEBLoa-ABCDEF123456-1234567890.us-east-1.elb.amazonaws.com), should I be able to get to the web?
The DNS seems to be configured correctly, both the Load Balancer DNS name and the selected domain name point to the same IP.
So, I don't know what I am missing. Is there any service that I should check is running? I get to the server but no content is shown.
Edit: I've compared the services running in a single server cluster compared to the scalable cluster, and I've seen that none of the EC2 instances of the scalable cluster are running any apache2 services. Should I start from here?
Thank you very much!
I've seen that the wsgi.conf (/etc/httpd/conf.d/wsgi.conf) file contains the following:
LoadModule wsgi_module modules/mod_wsgi.so
WSGIPythonHome /opt/python/run/baselinenv
WSGISocketPrefix run/wsgi
WSGIRestrictEmbedded On
<VirtualHost *:80>
Alias /static/ /opt/python/current/app/frontend/static/
<Directory /opt/python/current/app/frontend/static/>
Order allow,deny
Allow from all
</Directory>
Should the <VirtualHost *:80> be <VirtualHost *:443> for the HTTPS?
I've checked that I could obtain the webpage from inside the environment EC2 by doing curl 127.0.0.1, so the problem must be with the access from the outside to the app.
In addition, I've seen that, in the Elastic Beanstalk console, if I go to Configuration and check the Network block, it's displayed that "This environment is not part of a VPC". Could this be a problem too?
EDIT: I checked the Wiki for any errors related to the VPC that I could have done. I know that I followed the procedure in the step Configure your application correctly, so there should be something else.
So I checked the source code, and in rds.py there is a TODO that says:
# attach the security group that will allow access
VpcSecurityGroupIds=[db_sec_grp_id],
#TODO: is this even relevant?
# providing the subnet is critical, not providing this value causes the db to be non-vpc
# DBSubnetGroupName='string',
I know that this is related to the RDS database, but my guess is that something similar happened to my environment. There must be a place or a step where the VPC was not assigned to the environment. Any clue?
EDIT2: I've been reading the AWS docs and I found this doc: Configuring Amazon Virtual Private Cloud (Amazon VPC) with Elastic Beanstalk. Here they mention a file called vpc.config located in the .ebextensions folder. Is this file missing in the repo? Or should I create it?
1) I have no idea what that "This environment is not part of a VPC" error is. Almost every API call that the setup script makes to AWS requires declaring the VPC that it affects, and will fail with an error if it did not.
2) The whole non-vpc db TODO thing: This was a confusion during development, it is a non-issue. Non-VPC is a distinction about where the server is deployed logically within EC2. The TODO needs to be removed and the comment needs to be updated. Limited documentation is on https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_VPC.WorkingWithRDSInstanceinaVPC.html look for "non-vpc".
3) "Should the <VirtualHost *:80> be <VirtualHost *:443> for the HTTPS?" Nope, the load balancer reroutes connections on port 443 to port 80.
And you do not need a vpc config file.
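The 443-to-80 rerouting in point 3 can be sketched as a check over the load balancer's listeners. This assumes a classic load balancer and listener data shaped like the `ListenerDescriptions` entries from boto3's `elb` client `describe_load_balancers` call; the function name, the sample data, and the certificate placeholder are hypothetical.

```python
def https_terminates_at_balancer(listener_descriptions):
    """True if some listener accepts HTTPS on 443 and forwards HTTP to port 80."""
    for description in listener_descriptions:
        listener = description.get("Listener", {})
        if (
            listener.get("Protocol") == "HTTPS"
            and listener.get("LoadBalancerPort") == 443
            and listener.get("InstanceProtocol") == "HTTP"
            and listener.get("InstancePort") == 80
        ):
            return True
    return False


# Hypothetical listener set, for illustration only.
listeners = [
    {"Listener": {"Protocol": "HTTP", "LoadBalancerPort": 80,
                  "InstanceProtocol": "HTTP", "InstancePort": 80}},
    {"Listener": {"Protocol": "HTTPS", "LoadBalancerPort": 443,
                  "InstanceProtocol": "HTTP", "InstancePort": 80,
                  "SSLCertificateId": "hypothetical-certificate-arn"}},
]

print(https_terminates_at_balancer(listeners))
```

With a listener like the second one in place, Apache on the instance only ever sees plain HTTP on port 80, which is why the VirtualHost stays at *:80.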
Hi Eli.
Thank you very much for writing.
Sorry if I didn't explain it well. This is the block I am referring to: . The environment is running, but has no VPC assigned. That's what I meant, and I thought that it may have been the issue in my case.
I didn't see it as an issue, but I thought that the Elastic Beanstalk being non-vpc could have been an issue. Thank you for the additional information.
Okay, thank you.
I'll repeat the DNS and SSL steps once again to ensure I did it correctly.
Thank you very much.
I redid the steps, but had to change some settings to make it work. Can access the web now.
I’ve been trying to set up a scalable cluster following the steps in the deployment instructions, and I have encountered a problem that I think might be related to the one in this thread…
I have the cluster up and running, but I’m unable to access the frontend web portal.
As far as I can tell, the DNS settings are configured correctly in Route 53, and when I ping the subdomain I set up for this system, it’s associated with the same IP address as when I ping the DNS name associated with the load balancer (although both pings result in a series of timeouts). Also, when I check /var/log/httpd/access_log on the load balancer, I’m able to see my attempts to access the system from different web browsers. Running curl 127.0.0.1 from the load balancer returns the html for the Beiwe frontend landing page.
I’ve also set up an SSL certificate for our subdomain using ACM and the HTTPS redirect code in .ebextensions/01.config was enabled by default.
However, I’m still not actually able to access the Beiwe frontend from the internet.
I’ve been trying to figure out why this might be, and I’m wondering if it might be because the environment for Django seems not to be configured properly. If I look at /opt/python/bundle/5/app/config/settings.py, most of the parameters (including DOMAIN_NAME) are set based on environment variables, yet those environment variables do not seem to be defined on the load balancer (i.e. echo $DOMAIN_NAME returns nothing). Relatedly, according to some of the documentation I’ve come across, it sounds like it might be necessary to define an ALLOWED_HOSTS setting (e.g., https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/create-deploy-python-django.html), which does not seem to be included.
I’m wondering if these missing Django settings might be the reason I’m not able to access the frontend. If so, does anyone have an idea why the necessary environment variables wouldn’t have gotten set on the load balancer even though the deployment appears to have gone smoothly otherwise? The header of /opt/python/bundle/5/app/config/settings.py says To customize any of these values, append a line to config/remote_db_env.py, but the scalable cluster deployment instructions do not indicate that any configuration of that file is necessary (although the single-server instructions do).
Is it necessary to change config/remote_db_env.py for a scalable cluster deployment? If not, does anyone have any ideas about what the problem might be here, or what to try next to troubleshoot?
Thanks!
I have read through your issue here, not quite sure what it is you are hitting.
I should clarify something first: this server uses Flask for the server runtime, it only uses Django's ORM. Django settings are not going to do anything.
The remote_db_env file is for the manager/worker servers, the Elastic Beanstalk server uses environment variables. When imported/executed it just sets the appropriate environment variables.
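A small sketch for checking which settings are actually visible to a process, assuming the settings are read from environment variables as described. The function name is hypothetical, and of the variable names below only DOMAIN_NAME comes from this thread. An interactive SSH shell does not necessarily carry the variables Elastic Beanstalk hands to the application, so an empty `echo $DOMAIN_NAME` there may not mean the server itself is missing them.

```python
import os


def missing_settings(required, environ=None):
    """Return the names in `required` that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]


# Hypothetical environment, for illustration only.
example_env = {"DOMAIN_NAME": "studies.example.org", "EMPTY_SETTING": ""}
print(missing_settings(["DOMAIN_NAME", "EMPTY_SETTING", "UNSET_SETTING"], example_env))
```

Run inside the same environment the server code runs in, an empty list means every required name is defined.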
The load balancer is not under Beiwe's control, it is an AWS router and it is configured through their web console.
My initial thought was that the security group is misconfigured (due to this project's general data privacy needs we require that SSL be properly configured, so the default setup blocks non-port 443 http calls) buuut you are saying you are seeing it hit your server, with no page loads.
I don't have a better idea at this time; I will try to take a second whack at this later. Please let me know if it is a security group issue.
Thanks for your quick response!
I've checked the security group settings, and that doesn't appear to be the issue. The AWSEBLoadBalancerSecurityGroup has HTTPS (port 443) enabled for both inbound and outbound connections. There's another security group (AWSEBSecurityGroup) for the EB environment that has inbound HTTPS enabled, but only from the AWSEBLoadBalancerSecurityGroup. On the load balancer itself, both HTTP and HTTPS listeners are enabled.
Does that all sound correct?
Thanks again for your help with this!
Are you hitting the exact error (the network box in the console says it is not part of a VPC) or just a similar lack of the website working?
Also, please check /var/log/httpd/error_log for output; python stack traces or comments about necessary credentials are diagnostically useful. If there is a common failure mode at the end of this that we could detect, I would like to know how.
The network box on the AWS console indicates that the load balancer is part of the AWSEBLoadBalancerSecurityGroup with HTTPS enabled. However, when I go to the URL I've configured to point to the load balancer's DNS Name, the access attempt shows up in /var/log/httpd/access_log, but I get an Error 408 in the browser.
Here's the tail end of /var/log/httpd/error_log (most of these messages are repeated various times throughout the log, but I don't see any others that stand out)...
[Thu Aug 08 01:17:23.530558 2019] [mpm_prefork:notice] [pid 32193] AH00169: caught SIGTERM, shutting down
[Thu Aug 08 01:17:24.609310 2019] [suexec:notice] [pid 25971] AH01232: suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Thu Aug 08 01:17:24.650473 2019] [so:warn] [pid 25971] AH01574: module wsgi_module is already loaded, skipping
[Thu Aug 08 01:17:24.655427 2019] [http2:warn] [pid 25971] AH10034: The mpm module (prefork.c) is not supported by mod_http2. The mpm determines how things are processed in your server. HTTP/2 has more demands in this regard and the currently selected mpm will just not do. This is an advisory warning. Your server will continue to work, but the HTTP/2 protocol will be inactive.
[Thu Aug 08 01:17:24.655439 2019] [http2:warn] [pid 25971] AH02951: mod_ssl does not seem to be enabled
[Thu Aug 08 01:17:24.655925 2019] [lbmethod_heartbeat:notice] [pid 25971] AH02282: No slotmem from mod_heartmonitor
[Thu Aug 08 01:17:24.655974 2019] [:warn] [pid 25971] mod_wsgi: Compiled for Python/2.7.13.
[Thu Aug 08 01:17:24.655978 2019] [:warn] [pid 25971] mod_wsgi: Runtime using Python/2.7.16.
[Thu Aug 08 01:17:24.675578 2019] [mpm_prefork:notice] [pid 25971] AH00163: Apache/2.4.39 (Amazon) mod_wsgi/3.5 Python/2.7.16 configured -- resuming normal operations
[Thu Aug 08 01:17:24.675599 2019] [core:notice] [pid 25971] AH00094: Command line: '/usr/sbin/httpd -D FOREGROUND'
[Sat Aug 10 15:05:22.343130 2019] [pagespeed:warn] [pid 14836] [mod_pagespeed 1.13.35.2-0 @14836] Failed to read cache clean timestamp /var/cache/mod_pagespeed/!clean!time!. Doing an extra cache clean to be safe.
The one that looks potentially relevant to me is mod_ssl does not seem to be enabled, but I'm not sure what that means exactly.
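To pick the warnings out of a long error_log, a short parser can help. This is a sketch that assumes every line follows the layout of the lines quoted above (timestamp, module:level, pid, message); the function names are hypothetical.

```python
import re

# Matches lines such as:
# [Thu Aug 08 01:17:24.655439 2019] [http2:warn] [pid 25971] AH02951: ...
LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\] \[(?P<module>[^:\]]*):(?P<level>[^\]]+)\] "
    r"\[pid (?P<pid>\d+)\] (?P<message>.*)$"
)


def parse_line(line):
    """Split one error_log line into its fields, or return None."""
    match = LINE.match(line.strip())
    return match.groupdict() if match else None


def at_or_above(lines, levels=("warn", "error", "crit", "alert", "emerg")):
    """Keep only parsed entries whose level is in `levels`."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry and entry["level"] in levels]


sample = [
    "[Thu Aug 08 01:17:23.530558 2019] [mpm_prefork:notice] [pid 32193] AH00169: caught SIGTERM, shutting down",
    "[Thu Aug 08 01:17:24.655439 2019] [http2:warn] [pid 25971] AH02951: mod_ssl does not seem to be enabled",
    "[Thu Aug 08 01:17:24.655974 2019] [:warn] [pid 25971] mod_wsgi: Compiled for Python/2.7.13.",
]
for entry in at_or_above(sample):
    print(entry["module"] or "(core)", entry["message"])
```

Grouping the output by module makes it easier to see which warnings repeat on every restart and which ones only appear around a failed request.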
Hm, no, that is the usual junk.
mod_ssl is the apache ssl plugin thing, the warning is irrelevant because the load balancer is the terminal end of the ssl connection. I doubt that is causing the issue.
I don't know why the compiled vs runtime python version lines show up, it has always been there and presumably it means the server code is in fact running....
408 means timeout.... that's weird....
Can you check if the database / database shell is accessible? Do this:
cd /opt/python/current/
source ./env
cd app
python manage.py shell_plus
(Do the source operation exactly as written, with the ./)
That will load a python shell with the Django ORM loaded.
Do a simple database query like this:
ChunkRegistry.objects.count()
It should complete quickly and return 0.
I just tried the steps you suggested. It all worked and didn't throw any errors. ChunkRegistry.objects.count() did indeed return 0, with no delay.
Well, I guess that is good, buuuut I don't know what is going on. 408 is a timeout, the request is hitting your server, but there is no evidence that it hits the python code.
Maybe check that the WSGIPath in software configuration in the deploy is "wsgi.py"?
Try commenting out the rewrite rule? You will need to deploy and then terminate the server and wait for EB to spin up the new one (the file put in place by the config file persists across individual vms).
I just went to the url you listed (http://awseb-awsebloa-abcdef123456-1234567890.us-east-1.elb.amazonaws.com/) and I'm getting DNS errors. DNS_PROBE_FINISHED_NXDOMAIN
It should be failing with an ssl cert invalid. Whatever else is going on you've also got dns issues.
Judging by its content (abc...123...), it appears that the URL posted by the other person who had an issue like this was meant for illustrative purposes only ;)
According to eb config, the WSGIPath is set to wsgi.py on my deployment.
I tried commenting out the rewrite rule, redeploying, and terminating the server, as you suggested. I'm still getting the same timeout error though.
Just to be clear, you meant to terminate the main EC2 instance, right? Or did you mean to terminate the load balancer and then retry?
(You did what I asked correctly)
I'm sorry but I'm a bit stumped and don't have further ideas at this point. I'll try and come back to this.
Ok, thanks! I'm glad to know I'm not the only one who's stumped. Thanks for your help in troubleshooting this. If you have any more ideas of things to try, please let me know.
Currently I am having the exact same issue and can't seem to get access to the web interface. Not sure what I am doing differently except perhaps not running on us-east-1 (ap-southeast-2).
Update: actually I solved it by explicitly installing mod24_ssl in the .ebextensions/01.config file:
packages:
  yum:
    gcc: []
    postgresql96-devel: []
    mod24_ssl: []
Okay, thank you for posting, that is... very odd. I will have to add that to the config, would happily accept a pull request against master on this.
I have been getting some random 'unable to locate wsgi.py' and had to direct to the full path but I am not sure if this is another issue.
What is Beanstalk doing! (つ•̀_•́)つ
There is some file path shuffling when you do an EB deploy operation, but if that's somehow failing ... I don't even know... kill that particular server instance because that behavior is truly garbage.
Hi,
I completed the deployment and set up the DNS and SSL, but when I entered the new URL in the browser's address bar, I couldn't reach the destination. I checked the environment logs inside Elastic Beanstalk and I found this:
(repeated several times) and at the end of the log:
These lines were already uncommented in .ebextensions/01.config:
Additionally, in /var/log/eb-activity.log:
Any clue why I cannot access the webpage?