Closed Windyo closed 4 years ago
Would you mind sending me a link to a public conversation at <my github name>@nextcloud.com?
It works fine here.
Going in a meeting but I can do that in exactly 2 hours if that works for you
Anon
Ok, just sent the invite now.
As with previous cases, the moment that Talk is up, the entire instance slows down to a crawl. Loading any function is slow as hell. This does not affect other containers, and netdata reports no load:
I receive a "502 Bad Gateway". Can you check your apache2/nginx logs? There should be something somewhere.
Yeah that's the web container crashing... I'm checking logs but they're empty. Wondering if I'm looking in the wrong place
When exactly does it crash, when you make the conversation public, when you join the chat as a guest or when you start the call?
As soon as a user is in the call the web container starts getting sluggish which ends up with 502 errors.
We do long-polling requests (2 in parallel: one for chat messages and one for call-related WebRTC signaling messages). Maybe that is the problem?
I'm not intelligent enough to understand that answer unfortunately.
I'm still investigating the nginx logs but they seem to be completely empty in /var/log/nginx/error.log, in both the web and app containers. Which sounds rather weird.
Basically, every user has 2 constant connections open to your server. Maybe something in your configuration limits the number of possible open connections and therefore causes this problem.
I don't recall doing any specific setup, apart from setting php-fpm to static and limiting it to 2 instances.
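To make the arithmetic behind those two comments concrete, here is a minimal sketch. It assumes each open long-polling request holds one PHP-FPM worker for as long as it stays open (a PHP-FPM child serves one request at a time). The function name and parameters are illustrative only and are not part of Nextcloud or PHP-FPM.

```python
def free_workers(max_children: int, users_in_call: int, polls_per_user: int = 2) -> int:
    """Workers left for ordinary page loads while long polls are held open.

    Assumption: every long-polling request occupies exactly one worker.
    """
    held = users_in_call * polls_per_user
    return max(max_children - held, 0)


# With php-fpm limited to 2 workers, a single user in a call
# already leaves nothing for any other request.
for users in (0, 1, 2):
    print(users, free_workers(max_children=2, users_in_call=users))
```

Under that assumption, a limit of 2 workers is exhausted by the first participant, which would match the instance becoming sluggish as soon as one user is in the call.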
re: nginx: lrwxrwxrwx 1 root root 11 Jun 4 22:30 /var/log/nginx/access.log -> /dev/stdout
That would explain why I'm not getting any logs; but I didn't set that up at all - is that standard in the docker-nextcloud conf?
If someone's reading this later for some reason: yes, it's normal. This allows you to watch errors directly using docker, or portainer if you use that, without going to the file.
I just checked my nginx conf and the basic setup looks OK to me; I double-checked it against the official Docker one:
error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    #tcp_nopush on;

    keepalive_timeout 65;

    set_real_ip_from 10.0.0.0/8;
    set_real_ip_from 172.16.0.0/12;
    set_real_ip_from 192.168.0.0/16;
    real_ip_header X-Real-IP;

    #gzip on;

    upstream php-handler {
        server app:9000;
    }

    server {
        listen 80;

        # Add headers to serve security related headers
        # Before enabling Strict-Transport-Security headers please read into this
        # topic first.
        # add_header Strict-Transport-Security "max-age=15768000;
        # includeSubDomains; preload;";
        #
        # WARNING: Only add the preload option once you read about
        # the consequences in https://hstspreload.org/. This option
        # will add the domain to a hardcoded list that is shipped
        # in all major browsers and getting removed from this list
        # could take several months.
        add_header X-Content-Type-Options nosniff;
        add_header X-XSS-Protection "1; mode=block";
        add_header X-Robots-Tag none;
        add_header X-Download-Options noopen;
        add_header X-Permitted-Cross-Domain-Policies none;
        add_header Referrer-Policy no-referrer;

        root /var/www/html;

        location = /robots.txt {
            allow all;
            log_not_found off;
            access_log off;
        }

        # The following 2 rules are only needed for the user_webfinger app.
        # Uncomment it if you're planning to use this app.
        rewrite ^/.well-known/host-meta /public.php?service=host-meta last;
        rewrite ^/.well-known/host-meta.json /public.php?service=host-meta-json last;
        rewrite ^/.well-known/webfinger /public.php?service=webfinger last;

        location = /.well-known/carddav {
            return 301 $scheme://$host/remote.php/dav;
        }
        location = /.well-known/caldav {
            return 301 $scheme://$host/remote.php/dav;
        }

        # set max upload size
        client_max_body_size 10G;
        fastcgi_buffers 64 4K;

        # Enable gzip but do not remove ETag headers
        gzip on;
        gzip_vary on;
        gzip_comp_level 4;
        gzip_min_length 256;
        gzip_proxied expired no-cache no-store private no_last_modified no_etag auth;
        gzip_types application/atom+xml application/javascript application/json application/ld+json application/manifest+json application/rss+xml application/vnd.geo+json application/vnd.ms-fontobject application/x-font-ttf application/x-web-app-manifest+json application/xhtml+xml application/xml font/opentype image/bmp image/svg+xml image/x-icon text/cache-manifest text/css text/plain text/vcard text/vnd.rim.location.xloc text/vtt text/x-component text/x-cross-domain-policy;

        # Uncomment if your server is build with the ngx_pagespeed module
        # This module is currently not supported.
        #pagespeed off;

        location / {
            rewrite ^ /index.php$request_uri;
        }

        location ~ ^/(?:build|tests|config|lib|3rdparty|templates|data)/ {
            deny all;
        }
        location ~ ^/(?:\.|autotest|occ|issue|indie|db_|console) {
            deny all;
        }

        location ~ ^/(?:index|remote|public|cron|core/ajax/update|status|ocs/v[12]|updater/.+|ocs-provider/.+)\.php(?:$|/) {
            fastcgi_split_path_info ^(.+\.php)(/.*)$;
            include fastcgi_params;
            fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
            fastcgi_param PATH_INFO $fastcgi_path_info;
            # fastcgi_param HTTPS on;
            #Avoid sending the security headers twice
            fastcgi_param modHeadersAvailable true;
            fastcgi_param front_controller_active true;
            fastcgi_pass php-handler;
            fastcgi_intercept_errors on;
            fastcgi_request_buffering off;
        }

        location ~ ^/(?:updater|ocs-provider)(?:$|/) {
            try_files $uri/ =404;
            index index.php;
        }

        # Adding the cache control header for js and css files
        # Make sure it is BELOW the PHP block
        location ~ \.(?:css|js|woff2?|svg|gif)$ {
            try_files $uri /index.php$request_uri;
            add_header Cache-Control "public, max-age=15778463";
            # Add headers to serve security related headers (It is intended to
            # have those duplicated to the ones above)
            # Before enabling Strict-Transport-Security headers please read into
            # this topic first.
            # add_header Strict-Transport-Security "max-age=15768000;
            # includeSubDomains; preload;";
            #
            # WARNING: Only add the preload option once you read about
            # the consequences in https://hstspreload.org/. This option
            # will add the domain to a hardcoded list that is shipped
            # in all major browsers and getting removed from this list
            # could take several months.
            add_header X-Content-Type-Options nosniff;
            add_header X-XSS-Protection "1; mode=block";
            add_header X-Robots-Tag none;
            add_header X-Download-Options noopen;
            add_header X-Permitted-Cross-Domain-Policies none;
            add_header Referrer-Policy no-referrer;

            # Optional: Don't log access to assets
            access_log off;
        }

        location ~ \.(?:png|html|ttf|ico|jpg|jpeg)$ {
            try_files $uri /index.php$request_uri;
            # Optional: Don't log access to other assets
            access_log off;
        }
    }
}
I also checked the main proxy log to see if there was anything there, but according to it all requests get forwarded normally with no issue.
OK, I now have logs for the proxy, the web container, and the app container in front of me. ALL requests end up with a 200 status code. I'm at a loss as to where to look from here.
So, running the test again with all logs active, I have some insight. It seems the client is closing the connection before everything crashes, but I still don't see why.
.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /ocs/v2.php/apps/spreed/api/v1/room/ainx9ydy HTTP/1.1" 404 79 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "DELETE /ocs/v2.php/apps/spreed/api/v1/room/ainx9ydy/participants/active HTTP/1.1" 200 138 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /ocs/v2.php/apps/spreed/api/v1/room/ainx9ydy HTTP/1.1" 499 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.142 Safari/537.36" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /apps/apporder/getOrder HTTP/1.1" 200 182 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /avatar/Ombi/64 HTTP/1.1" 201 951 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0" "10.0.4.20"
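A quick way to get an overview of a log like this is to tally the status codes. The sketch below assumes the `main` log_format from the nginx config posted earlier in this thread; the regular expression and the function name are illustrative, and the user agents in the sample lines are shortened. Status 499 is nginx's own (non-standard) code for a client that closed the connection before the response was sent.

```python
import re
from collections import Counter

# Matches the start of a line in the "main" log_format:
# $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent ...
LINE_RE = re.compile(
    r'^(?P<addr>\S+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+)'
)


def count_statuses(lines):
    """Return a Counter of HTTP status codes; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[int(match.group("status"))] += 1
    return counts


# Sample lines taken from the log above, with the user agent shortened.
sample = [
    '10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "DELETE /ocs/v2.php/apps/spreed/api/v1/room/ainx9ydy/participants/active HTTP/1.1" 200 138 "-" "Mozilla/5.0" "10.0.4.20"',
    '10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /ocs/v2.php/apps/spreed/api/v1/room/ainx9ydy HTTP/1.1" 499 0 "-" "Mozilla/5.0" "10.0.4.20"',
    '10.0.4.20 - - [17/Jul/2019:22:43:31 +0000] "GET /login HTTP/1.1" 302 0 "-" "Mozilla/5.0" "10.0.4.20"',
]

counts = count_statuses(sample)
for status, n in sorted(counts.items()):
    print(status, n)
```

Filtering on 499 and 502 this way shows which requests the client gave up on, which is where to start looking when everything else reports 200.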
OK, I've even tried putting just the nextcloud container online with no other load balancer or anything running, and it still fails. I'm 99% sure this has to do with docker, and I think what @nickvergessen wrote is logical: as it happens as soon as a call is started, it could be the long polls.
As is I don't know how to investigate any further and I am kinda stuck. I'll keep everything as-is for further testing and debug if someone can help.
Hi everyone. I got the same error. When I use PHP 7.0, the problem disappears.
Which version were you using before?
@nickvergessen I used PHP 7.3. Our Nextcloud instances are 15.0.10 and 15.0.5 and the problem is the same on both. When we use PHP 7.0 fpm, the problem disappears.
OK, I'm back. I'll try to do this next week. Just changing the PHP version in the container was annoying, so I'll clone https://github.com/nextcloud/docker/tree/060cf0883ff12241081778714507e2823d84e629/16.0/fpm, change the reference to PHP 7.0, then build the image from that. If anyone has concerns about this way of testing the issue, let me know. I'll probably do this tomorrow or so.
@Windyo can you share your full compose file after that? ;)
Is there an issue with using PHP 7.3 with Nextcloud Talk? I'm on PHP 7.3 and I sometimes can't get reliable joins from people in a video call. I'm considering dropping back to 7.2 as a test.
@Windyo - PHP 5.6 and 7.0 are both end of life and no longer supported. I think you need 7.1 at a minimum, 7.2 probably better. (7.1 goes end of life in December 2019)
Did it work for you too, @tdm4?
@nickvergessen yes, downgrading to PHP 7.2 fixed my issues. I think there are some problems with PHP 7.3.
@tdm4 for us, performance was clearly bad with 7.1, 7.2 and 7.3, but we use fpm. Do you use fpm too?
@tanguy-opendsi Yes, I use fpm too. It doesn't matter whether you use php-fpm or some kind of apache prefork; it's PHP itself here. I just use php-fpm with ondemand and max 10 children.
@tdm4 thanks for your reply, but when I switch to fpm I get the same trouble. Do you use apache2 with fpm?
@tanguy-opendsi I don't use the apache webserver.
@tdm4, OK, best regards.
Quick update: I'm still trying to get the 7.2 container running with no errors. Building the image from https://github.com/nextcloud/docker/tree/060cf0883ff12241081778714507e2823d84e629/16.0/fpm with the reference changed to 7.2 works, but then I get a slew of server errors complaining about wrong permissions for some reason.
Still haven't figured that out, so ATM I can't test if this fixes Talk.
Hi, I am getting similar issues with public chats being very buggy and slow. I use Mail-in-a-Box ( https://mailinabox.email/ ), and modified it to install Nextcloud Talk (spreed).
# php -v
PHP 7.2.24-0ubuntu0.18.04.2 (cli) (built: Jan 13 2020 18:39:59) ( NTS )
Copyright (c) 1997-2018 The PHP Group
Zend Engine v3.2.0, Copyright (c) 1998-2018 Zend Technologies
with Zend OPcache v7.2.24-0ubuntu0.18.04.2, Copyright (c) 1999-2018, by Zend Technologies
# sudo -u www-data php /usr/local/lib/owncloud/occ -V
Nextcloud 15.0.8
# nginx -v
nginx version: nginx/1.14.0 (Ubuntu)
You can see the basic nginx configuration on their Github: https://github.com/mail-in-a-box/mailinabox/tree/master/conf but I think you are mostly looking at https://github.com/mail-in-a-box/mailinabox/blob/master/conf/nginx-primaryonly.conf
Well Nextcloud 15 is 3 major versions behind, have you considered doing an update?
It looks like Mail-in-a-Box hard-coded Nextcloud version 17.0.2 with a hash of 8095fb46e9e0c536163708aee3d17fab8b498ad6. I would like to propose a change; are there any concerns I should address with the project?
I ran this command, and got that it is up-to-date.
# sudo -u www-data php /usr/local/lib/owncloud/occ upgrade
Nextcloud is already latest version
# sudo -u www-data php /usr/local/lib/owncloud/occ -V
Nextcloud 15.0.8
Well, that only updates the database based on the current files. Try:
sudo -u www-data php /usr/local/lib/owncloud/updater/updater.phar
Hey, er, I still have the original issue though.
I guess so, but we can't fix a misconfigured php/webserver setup in our software. Very sorry about this, but at the same time https://github.com/nextcloud/spreed/issues/2211 is still open and I guess that is what you are facing as well.
On one side I'm 100% with you that you can't fix a misconfigured server; on the other side it's a docker instance, so config shouldn't really be an issue, especially as I've tested with a brand new config...
I'll check the HTTP2 thing out, but I don't think that's it as in this case the entire container crashes and restarts.
Hi,
I was having a similar issue and I solved it.
Starting a call makes php extremely slow. Once the conversation is closed and php restarted, everything is back to normal.
After looking at php logs I saw that max_children was reached. After raising it (from 5 to 20, 10 was not enough) and restarting php7.3-fpm, everything is working again.
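For anyone checking whether they hit the same limit: PHP-FPM logs a warning of the form `server reached pm.max_children setting (N), consider raising it`. Here is a minimal sketch for scanning a log for it. The function name is illustrative, and the sample lines (timestamps, pool name) are made up for the example rather than taken from a real log in this thread.

```python
import re

# Pattern for the PHP-FPM warning emitted when every worker is busy.
MAX_CHILDREN_RE = re.compile(
    r"\[pool (?P<pool>[^\]]+)\] server reached pm\.max_children setting "
    r"\((?P<limit>\d+)\)"
)


def find_max_children_hits(lines):
    """Return (pool, limit) for every line reporting that max_children was reached."""
    hits = []
    for line in lines:
        match = MAX_CHILDREN_RE.search(line)
        if match:
            hits.append((match.group("pool"), int(match.group("limit"))))
    return hits


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "[01-Jan-2020 10:00:00] NOTICE: ready to handle connections",
    "[01-Jan-2020 10:05:12] WARNING: [pool www] server reached "
    "pm.max_children setting (5), consider raising it",
]

hits = find_max_children_hits(sample_log)
print(hits)
```

If this turns up hits while a call is running, raising `pm.max_children` in the pool configuration and restarting php-fpm is the change described in the comment above.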
Steps to reproduce
Expected behaviour
Calls happen
Actual behaviour
Entire instance becomes ultra-slow on any operation. Restarting the App container gets the performance back to normal. Netdata does not report any high CPU usage, iowait, or network issue. The instance just borks and ends up throwing a 502 error.
Browser
All
Microphone available: yes
Camera available: yes
Operating system: Windows
Browser name: All
Spreed app
Spreed app version: 6.0.2
Custom TURN server configured: no
Custom STUN server configured: no
Server configuration
Operating system: Ubuntu
Web server: Apache-fpm
Database: MySQL
PHP version: 7.3
Nextcloud Version: 16.0.3
Working in a docker-container setup
List of activated apps:
server config:
Server log (data/nextcloud.log)