zulip / docker-zulip

Container configurations, images, and examples for Zulip.
https://zulip.com/
Apache License 2.0

No django package after 1.4.2-2 #75

Closed xeor closed 7 years ago

xeor commented 7 years ago

I am upgrading Zulip from a version before 1.4.2-2, and suddenly Django imports started failing. Backtracking, I found that the failure appeared between v1.4.2-2 and v1.4.2-3. Digging deeper, I found this: https://github.com/zulip/zulip/blob/54acbc41ed60dd0d5efee60b134d447420921a46/scripts/lib/install#L43-L45, which will never execute after https://github.com/galexrt/docker-zulip/commit/9ab4d924bfd1cd798763b694ad680c3e3eb79f11.

I'm not able to find anything useful about this change. But to me, it seems like everyone else is getting this to work post 1.4.2-3 as well. What am I doing wrong? Even starting with the example docker-compose.yml gives Django import errors.

docker run -it --rm quay.io/galexrt/zulip:1.4.2-2 bash

root@7227199e12e9:/# find . -name django
./srv/zulip-venv-cache/f83cfe968f17a63ae607dc950346927cbb6dc820/zulip-venv/lib/python2.7/site-packages/django

root@7227199e12e9:/# ls -a /root/
.  ..  .bashrc  .cache  .distlib  .profile  zulip

root@7227199e12e9:/# ls /var/log/zulip/
django.log                          events-feedback_messages.log                   events-user-activity-interval.log
events-confirmation-emails.log      events-message_sender.log                      events-user-activity.log
events-deliver_enqueued_emails.log  events-missedmessage_mobile_notifications.log  events-user-presence.log
events-digest_emails.log            events-missedmessage_reminders.log             install.log
events-email_mirror.log             events-signups.log                             queue_error
events-error_reports.log            events-slow_queries.log                        tornado.log

docker run -it --rm quay.io/galexrt/zulip:1.4.2-3 bash

root@7227199e12e9:/# find . -name django
root@995ce1c6eeea:/#

root@995ce1c6eeea:/# ls -a /root/
.  ..  .bashrc  .profile  zulip

root@995ce1c6eeea:/# ls -l /var/log/zulip/
total 40
-rw-r--r--. 1 zulip zulip 37179 Dec 30 13:40 install.log
drwxr-x---. 2 zulip zulip     6 Dec 30 13:40 queue_error
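
The `find` output above can be cross-checked from Python itself. The following is only a minimal sketch, not part of the image: the helper name `is_importable` is hypothetical, and it has to be run with the interpreter that Zulip actually uses (in the 1.4.2-2 listing above, Django lives inside a virtualenv, so the system Python may report it missing even in a working image).

```python
import importlib


def is_importable(module_name):
    """Return True if importing the module succeeds in this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # "django" is the package reported missing in this issue.
    for name in ("django",):
        print("%s: %s" % (name, "found" if is_importable(name) else "MISSING"))
```

The code avoids Python 3-only syntax on purpose, since the paths in the listings point at python2.7.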

galexrt commented 7 years ago

@xeor Thanks for your report! I pushed a change to the dev branch. The image is still building; once it has been built, I'll check whether Django is installed again. I can't provide a backport of the fix for 1.4.x, but there should be no problem going straight to the latest Zulip version, 1.5.1, as Django should apply all database updates from "older" Zulip versions too.

galexrt commented 7 years ago

@xeor Django is now installed again. Could you please retest with the dev image (quay.io/galexrt/zulip:dev; Docker Hub still needs some time to build the image)? (Don't forget to make a backup before updating.)

xeor commented 7 years ago

Thanks for the quick response!

/etc/zulip/uwsgi.ini is missing. Might some of my options be making it skip the generation of that file? I'm also getting error: <class 'xml.parsers.expat.ExpatError'>, not well-formed (invalid token): line 5, column 17: file: /usr/lib/python2.7/xmlrpclib.py line: 558 from supervisorctl tail zulip-postsetup-create_user. That might also be my config.

I'll test with a fresh setup (maybe tomorrow) if this doesn't mean anything. About to leave work soon :)

galexrt commented 7 years ago

Also thanks for the quick response! :+1: @xeor Regarding the missing uwsgi.ini: that happens when you mount the /etc/zulip directory as a volume. In the latest dev image the file exists:

# docker run -it --rm quay.io/galexrt/zulip:dev bash
root@c762c157de22:/# cd /etc/zulip/
root@c762c157de22:/etc/zulip# ls -ahl
total 36K
drwxr-xr-x 2 zulip zulip 4.0K Feb 20 17:51 .
drwxr-xr-x 1 root  root  4.0K Feb 20 18:25 ..
-rw-rw-r-- 1 zulip zulip  17K Feb  7 19:24 settings.py
-rw-r--r-- 1 root  root   252 Feb 20 17:51 uwsgi.ini
-rw-r--r-- 1 root  root   115 Feb 20 17:50 zulip.conf
root@c762c157de22:/etc/zulip#

xeor commented 7 years ago

I only mount /data:rw and a settings.py that I replace /etc/zulip/settings.py with in a post-setup.d script. I'll check more tomorrow :thinking:

galexrt commented 7 years ago

@xeor I pushed multiple fixes to dev. The image is now fully working again for me. Please test so I can create a new release from it. Thanks!

xeor commented 7 years ago

Testing now using:

After around 5 minutes of running, some services are still struggling:

root@4f2803a310e3:/# supervisorctl status | grep -v RUNNING
zulip-django                                                    FATAL     Exited too quickly (process log may have details)
zulip-postsetup-create_user                                     EXITED    Feb 21 09:29 AM
zulip-workers:zulip-deliver-enqueued-emails                     STARTING
root@4f2803a310e3:/# supervisorctl tail zulip-django
realpath() of /etc/zulip/uwsgi.ini failed: No such file or directory [core/utils.c line 3616]
realpath() of /etc/zulip/uwsgi.ini failed: No such file or directory [core/utils.c line 3616]

root@4f2803a310e3:/# supervisorctl tail zulip-postsetup-create_user
error: <class 'xml.parsers.expat.ExpatError'>, not well-formed (invalid token): line 7, column 0: file: /usr/lib/python2.7/xmlrpclib.py line: 558

root@4f2803a310e3:/# supervisorctl tail zulip-workers:zulip-deliver-enqueued-emails
7-02-20-20-00-49/zerver/management/commands/deliver_email.py", line 94, in handle
    if not send_email_job(job):
....
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
socket.error: [Errno 111] Connection refused

These are the same errors I got before your previous comments.
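
The `supervisorctl status | grep -v RUNNING` filter used above can also be sketched in Python, for example to collect the failing process names in a script. This is a minimal sketch under assumptions: the function name `not_running` is hypothetical, the first three sample lines are copied from the output above, and the last sample line is an invented RUNNING entry added only so the filter has something to drop. Unlike `grep -v`, it compares only the state column.

```python
def not_running(status_output):
    """Return (process_name, state) pairs for processes whose state is not RUNNING."""
    problems = []
    for line in status_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, state = fields[0], fields[1]
        if state != "RUNNING":
            problems.append((name, state))
    return problems


SAMPLE = """\
zulip-django                                                    FATAL     Exited too quickly (process log may have details)
zulip-postsetup-create_user                                     EXITED    Feb 21 09:29 AM
zulip-workers:zulip-deliver-enqueued-emails                     STARTING
example-running-process                                         RUNNING   pid 42, uptime 0:05:00
"""

if __name__ == "__main__":
    for name, state in not_running(SAMPLE):
        print("%s is %s" % (name, state))
```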

root@4f2803a310e3:/# find / -name uwsgi.ini
root@4f2803a310e3:/#

Not sure why I don't have uwsgi.ini. It is defined in https://github.com/zulip/zulip/blob/42cbce9ad678c2786dc7bbf68661c611137ea7e5/puppet/zulip/manifests/app_frontend_base.pp#L73-L81, and I do have other files defined by the same manifest, like /usr/lib/nagios/plugins/zulip_app_frontend and /etc/nginx/zulip-include/app.d/.

It should be there:

root@4f2803a310e3:~# grep uwsgi.ini /var/log/zulip/install.log
Notice: /Stage[main]/Zulip::App_frontend_base/File[/etc/zulip/uwsgi.ini]/ensure: defined content as '{md5}096627db48f8c286eb68cdb9a09aaab8'

Hmm... Come to think of it, I do mount /data, and /etc/zulip is a symlink to /data/settings/etc-zulip. If the file is added during the image build process, it will be hidden when I mount /data (which you kind of said already :man_facepalming:).

I'll start it with /data mounted as /data_new so I can copy the files over. I'll also leave my little debugging history here as a comment so maybe others can learn from it :)

galexrt commented 7 years ago

@xeor Thanks for your extensive debugging of the problem! Yes, that seems to be the problem. You can try running the image with just bash and copying uwsgi.ini out of it and into /data/settings/etc-zulip; you should be fine then. I think the other startup issues come from the missing uwsgi.ini.
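
The copy step could be sketched roughly as follows. This is a hedged illustration, not a script shipped with the image: the function name `restore_missing_files` is hypothetical, it assumes Python 3 is available wherever it runs, and it assumes the volume is temporarily mounted at a different path (such as the /data_new idea mentioned earlier) so that the image's own /data/settings/etc-zulip stays visible. It copies only files that are absent from the target, so a customized settings.py is never overwritten.

```python
import shutil
from pathlib import Path


def restore_missing_files(defaults_dir, target_dir):
    """Copy files present in defaults_dir but absent from target_dir.

    Existing files in target_dir are never overwritten.
    Returns the names of the files that were copied.
    """
    defaults = Path(defaults_dir)
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(defaults.iterdir()):
        dest = target / src.name
        if src.is_file() and not dest.exists():
            shutil.copy2(str(src), str(dest))
            copied.append(src.name)
    return copied


# Example call (both paths are assumptions about how the volume is mounted):
# restore_missing_files("/data/settings/etc-zulip", "/data_new/settings/etc-zulip")
```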

I'll add a notice about uwsgi.ini to README.md. Again, thanks for the extensive information and debugging! I'll close this issue once I've added the notice to README.md. If errors with the startup of the services persist, please reopen the issue.

xeor commented 7 years ago

Anytime :) I can just as easily use the comment field here for taking notes on things like this. Do you know when the 1.5.1 image will work?

Thanks for updating the README.md

galexrt commented 7 years ago

@xeor I will create a new release named 1.5.1-1 for this huge fix. I'll keep you updated once I have created the version tag and the image has been built.

galexrt commented 7 years ago

Tag created https://github.com/galexrt/docker-zulip/releases/tag/1.5.1-1. Now just waiting for the Quay.io build to finish. https://quay.io/repository/galexrt/zulip?tab=builds

xeor commented 7 years ago

Just to get it out there;

I had another problem going from 1.4 to 1.5: I got a 500 error when trying to log in. I am using AD authentication and a custom realm. It looks like the migration from 1.4 to 1.5 added another realm alias. Deleting this extra realm alias fixed the 500 error during login.

I got an error email from Django telling me:

...
  File "./zerver/models.py", line 319, in get_realm_by_email_domain
    alias = RealmAlias.objects.select_related('realm').get(domain = email_to_domain(email))
  File "/home/zulip/deployments/2017-02-21-11-09-06/zulip-venv/lib/python2.7/site-packages/django/db/models/query.py", line 389, in get
    (self.model._meta.object_name, num)
MultipleObjectsReturned: get() returned more than one RealmAlias -- it returned 2!
...
# Login, su to 'zulip'
zulip@71c8c6909222:~/deployments/current$ ./manage.py shell
In [1]: from zerver.models import RealmAlias

In [2]: RealmAlias.objects.all()
Out[2]: <QuerySet [<RealmAlias: RealmAlias object>, <RealmAlias: RealmAlias object>, <RealmAlias: RealmAlias object>]>

In [3]: RealmAlias.objects.all()[0]
Out[3]: <RealmAlias: RealmAlias object>

In [4]: RealmAlias.objects.all()[0].realm
Out[4]: <Realm: zulip.com 1>

In [5]: RealmAlias.objects.all()[1].realm
Out[5]: <Realm: our-domain.no 2>

In [6]: RealmAlias.objects.all()[2].realm
Out[6]: <Realm: our-domain.no@acme.com 3>

In [7]: RealmAlias.objects.all()[1].domain
Out[7]: u'our-domain.no'

In [8]: RealmAlias.objects.all()[2].domain
Out[8]: u'our-domain.no'

In [9]: RealmAlias.objects.all()[2].delete()
Out[9]: (1, {u'zerver.RealmAlias': 1})

# problem solved
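
Before deleting anything, it may help to list which domains are claimed by more than one alias, since that is what makes `get()` raise `MultipleObjectsReturned`. The following is a minimal, self-contained sketch in plain Python rather than the Django ORM: the helper name `duplicate_domains` is hypothetical, the pairs mirror the shell session above, and the domain of the first alias is an assumption because the session never prints it.

```python
from collections import defaultdict


def duplicate_domains(aliases):
    """Group (domain, realm) pairs by domain; keep domains that appear more than once."""
    by_domain = defaultdict(list)
    for domain, realm in aliases:
        by_domain[domain].append(realm)
    return {d: realms for d, realms in by_domain.items() if len(realms) > 1}


ALIASES = [
    ("zulip.com", "zulip.com"),  # domain assumed; not shown in the session
    ("our-domain.no", "our-domain.no"),
    ("our-domain.no", "our-domain.no@acme.com"),
]

if __name__ == "__main__":
    for domain, realms in duplicate_domains(ALIASES).items():
        print("%s is aliased by %d realms: %s" % (domain, len(realms), ", ".join(realms)))
```

In a real deployment the pairs would have to be read from the RealmAlias table; which of the duplicates is safe to delete still needs a manual check, as in the session above.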

galexrt commented 7 years ago

@xeor That's weird. This could be related to the createZulipUser.sh script the container executes. I would recommend adding the env var ZULIP_USER_CREATION_ENABLED with the value false to your docker-compose.yml to disable the creation of a realm and of the user from the given details. But this could also just be related to the changes made to the Zulip database schema.

xeor commented 7 years ago

Thanks. I did that as well now, in case it would happen again if I recreated the containers :)