faryon93 / hlswatch

keep track of hls viewer stats
GNU General Public License v3.0

fatal error: concurrent map iteration and map write #3

Closed nicomagliaro closed 6 years ago

nicomagliaro commented 6 years ago

Hi, I think I found a bug, but I couldn't figure out how to fix it. My setup: I dockerized the nginx from this repo, but changed the hlswatch port to 3010 because it was binding the same port as Grafana, which I've dockerized too. I also dockerized an InfluxDB instance.

hlswatch.conf

[common]
listen = ":3010"
hls_path = "/tmp/hls"
viewer_timeout = 15

[influx]
address = "http://127.0.0.1:8086"
database = "hlswatch"
user = "hlswatch"
password = "hlswatch"

nginx.conf

worker_processes auto;

events {
    worker_connections 1024;
}
rtmp_auto_push on;
rtmp_auto_push_reconnect 1s;
rtmp_socket_dir /tmp/;
rtmp {
    server {
        listen 1935;
        chunk_size 512;
        application stream {
            live on;
            hls on;
            hls_fragment_naming system;
            hls_fragment 5s;
            hls_path /tmp/hls;
            hls_nested on;
            hls_cleanup on;
        }
    }
}
http {
    sendfile on;
    tcp_nopush on;
    directio 512;
    default_type application/octet-stream;
    upstream live1 {
      server 127.0.0.1:8080;
      keepalive 32;
    }
    server {
        listen 80;
        location ~ ^/(stream)/.+(m3u8|ts)$ {
            proxy_pass http://live1;
            # Disable cache
            add_header 'Cache-Control' 'no-cache';
            # CORS setup
            add_header X-Cache-Status $upstream_cache_status;
            add_header Cache-Control max-age=300;
            add_header 'Access-Control-Allow-Methods' 'GET, POST, OPTIONS';
            add_header 'Access-Control-Allow-Origin' "$http_origin" always;
            add_header 'Access-Control-Expose-Headers' 'Content-Length';
            types {
                application/dash+xml mpd;
                application/vnd.apple.mpegurl m3u8;
                video/mp2t ts;
            }
            proxy_hide_header 'Access-Control-Allow-Origin';
            proxy_hide_header 'Access-Control-Allow-Credentials';
            proxy_hide_header 'Access-Control-Allow-Methods';
            proxy_hide_header 'Access-Control-Allow-Headers';
            proxy_hide_header 'Access-Control-Max-Age';
            proxy_hide_header 'Cache-Control';
        }
    }
    server {
        listen 8080;
        location / {
            proxy_pass http://127.0.0.1:3010;
        }
    }
}

After migrating my environment from TEST to PROD, the app started to fail with "concurrent map iteration and map write" after a couple of minutes:

Mar 12 12:39:07 cnd2 journal: [httpd] 127.0.0.1 - hlswatch [12/Mar/2018:15:39:07 +0000] "POST /write?consistency=&db=hlswatch&precision=s&rp= HTTP/1.1" 204 0 "-" "InfluxDBClient" 8052a896-260b-11e8-81ff-000000000000 73463
Mar 12 12:39:08 cnd2 journal: [httpd] 127.0.0.1 - hlswatch [12/Mar/2018:15:39:08 +0000] "POST /write?consistency=&db=hlswatch&precision=s&rp= HTTP/1.1" 204 0 "-" "InfluxDBClient" 80f6b22a-260b-11e8-8200-000000000000 82002
Mar 12 12:39:09 cnd2 journal: fatal error: concurrent map iteration and map write
Mar 12 12:39:09 cnd2 journal: 
Mar 12 12:39:09 cnd2 journal: goroutine 6 [running]:
Mar 12 12:39:09 cnd2 journal: runtime.throw(0x716b5a, 0x26)
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/.gvm/gos/go1.8/src/runtime/panic.go:596 +0x95 fp=0xc42046bb20 sp=0xc42046bb00
Mar 12 12:39:09 cnd2 journal: runtime.mapiternext(0xc42046bc00)
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/.gvm/gos/go1.8/src/runtime/hashmap.go:737 +0x7ee fp=0xc42046bbd0 sp=0xc42046bb20
Mar 12 12:39:09 cnd2 journal: github.com/faryon93/hlswatch/state.(*Stream).GetCurrentViewers(0xc4200d5160, 0x37e11d600, 0xc42046bf68)
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/go/hlswatch/.bowler/src/github.com/faryon93/hlswatch/state/stream.go:55 +0x137 fp=0xc42046bc70 sp=0xc42046bbd0
Mar 12 12:39:09 cnd2 journal: main.InfluxMetrics(0xc4200d44e0)
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/go/hlswatch/.bowler/src/github.com/faryon93/hlswatch/influx.go:100 +0x2f5 fp=0xc42046bfd8 sp=0xc42046bc70
Mar 12 12:39:09 cnd2 journal: runtime.goexit()
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/.gvm/gos/go1.8/src/runtime/asm_amd64.s:2197 +0x1 fp=0xc42046bfe0 sp=0xc42046bfd8
Mar 12 12:39:09 cnd2 journal: created by main.main
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/go/hlswatch/.bowler/src/github.com/faryon93/hlswatch/main.go:114 +0x4c6
Mar 12 12:39:09 cnd2 journal: 
Mar 12 12:39:09 cnd2 journal: goroutine 1 [select, 7 minutes]:
Mar 12 12:39:09 cnd2 journal: main.wait(0xc4200f7f48, 0x3, 0x3)
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/go/hlswatch/.bowler/src/github.com/faryon93/hlswatch/main.go:138 +0x140
Mar 12 12:39:09 cnd2 journal: main.main()
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/go/hlswatch/.bowler/src/github.com/faryon93/hlswatch/main.go:118 +0x590
Mar 12 12:39:09 cnd2 journal: 
Mar 12 12:39:09 cnd2 journal: goroutine 4 [syscall, 7 minutes]:
Mar 12 12:39:09 cnd2 journal: os/signal.signal_recv(0x0)
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/.gvm/gos/go1.8/src/runtime/sigqueue.go:116 +0x104
Mar 12 12:39:09 cnd2 journal: os/signal.loop()
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/.gvm/gos/go1.8/src/os/signal/signal_unix.go:22 +0x22
Mar 12 12:39:09 cnd2 journal: created by os/signal.init.1
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/.gvm/gos/go1.8/src/os/signal/signal_unix.go:28 +0x41
Mar 12 12:39:09 cnd2 journal: 
Mar 12 12:39:09 cnd2 journal: goroutine 5 [IO wait]:
Mar 12 12:39:09 cnd2 journal: net.runtime_pollWait(0x7fa0dae4ff00, 0x72, 0x0)
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/.gvm/gos/go1.8/src/runtime/netpoll.go:164 +0x59
Mar 12 12:39:09 cnd2 journal: net.(*pollDesc).wait(0xc4201ac068, 0x72, 0x0, 0xc420384160)
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/.gvm/gos/go1.8/src/net/fd_poll_runtime.go:75 +0x38
Mar 12 12:39:09 cnd2 journal: net.(*pollDesc).waitRead(0xc4201ac068, 0xffffffffffffffff, 0x0)
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/.gvm/gos/go1.8/src/net/fd_poll_runtime.go:80 +0x34
Mar 12 12:39:09 cnd2 journal: net.(*netFD).accept(0xc4201ac000, 0x0, 0x842540, 0xc420384160)
Mar 12 12:39:09 cnd2 journal: #011/home/maxi/.gvm/gos/go1.8/src/net/fd_unix.go:430 +0x1e5
Mar 12 12:39:09 cnd2 journal: net.(*TCPListener).accept(0xc4201c6000, 0xc420214300, 0x6af1a0, 0xffffffffffffffff)

Is there any kind of threshold I can set to tune the concurrency?

Br,

faryon93 commented 6 years ago

@nicomagliaro sorry for my late response! I must have missed the GitHub notification :( I will have a look tonight.

faryon93 commented 6 years ago

I just pushed a new version which fixes this bug. I had messed up the synchronization. You can use faryon93/hlswatch as your base image for faster testing.