savonet / liquidsoap

Liquidsoap is a statically typed scripting general-purpose language with dedicated operators and backend for all thing media, streaming, file generation, automation, HTTP backend and more.
http://liquidsoap.info
GNU General Public License v2.0
1.4k stars 128 forks source link

Crash/hang when input.harbor is connected and SIGTERM received #2630

Open brenc opened 2 years ago

brenc commented 2 years ago

Describe the bug In development, I have ls starting up under nodemon which will send a SIGTERM when it detects a config change in order to restart it:

CMD ["nodemon", "-e", "liq,cert,key", "--signal", "SIGTERM", "--exec", "/usr/bin/liquidsoap", "--", "/app/test.liq"]

Normally this will cause ls to restart without issue. However, I've noticed that when input.harbor is connected (SSL or not) ls will crash with something like this:

[runtime:1] Uncaught error error(kind="http",message="CurlException(Aborted_by_callback, 42, CURLE_ABORTED_BY_CALLBACK)\nBacktrace:\nRaised by primitive operation at Curl.handle#perform in file \"curl.ml\", line 1295, characters 21-33\nCalled from Liqcurl.http_request in file \"tools/liqcurl.ml\", line 185, characters 4-22\nRe-raised at Liqcurl.http_request in file \"tools/liqcurl.ml\", line 216, characters 4-13\nCalled from Builtins_http.add_http_request.(fun) in file \"builtins/builtins_http.ml\", line 161, characters 12-196\n")!
[runtime:1] Raised by primitive operation at Curl.handle#perform in file "curl.ml", line 1295, characters 21-33
[runtime:1] Called from Liqcurl.http_request in file "tools/liqcurl.ml", line 185, characters 4-22
[runtime:1] Re-raised at Liqcurl.http_request in file "tools/liqcurl.ml", line 216, characters 4-13
[runtime:1] Called from Builtins_http.add_http_request.(fun) in file "builtins/builtins_http.ml", line 161, characters 12-196
[runtime:1] Re-raised at Runtime_error.error in file "tools/runtime_error.ml", line 42, characters 17-51
[runtime:1] Called from Builtins_http.add_http_request.(fun) in file "builtins/builtins_http.ml", line 184, characters 14-54
[runtime:1] Called from Evaluation.apply.f in file "lang/evaluation.ml", line 315, characters 8-12
[runtime:1] Re-raised at Evaluation.apply.f in file "lang/evaluation.ml", line 318, characters 10-130
[runtime:1] Called from Evaluation.eval in file "lang/evaluation.ml", line 239, characters 16-26
[runtime:1] Called from Evaluation.apply.f in file "lang/evaluation.ml", line 315, characters 8-12
[runtime:1] Re-raised at Evaluation.apply.f in file "lang/evaluation.ml", line 318, characters 10-130
[runtime:1] Called from Tutils.finalize in file "tools/tutils.ml", line 105, characters 12-16
[runtime:1] Re-raised at Tutils.finalize in file "tools/tutils.ml", line 111, characters 4-38
[runtime:1] Called from Evaluation.apply.f in file "lang/evaluation.ml", line 315, characters 8-12
[runtime:1] Re-raised at Evaluation.apply.f in file "lang/evaluation.ml", line 318, characters 10-130
[runtime:1] Called from Evaluation.apply.f in file "lang/evaluation.ml", line 315, characters 8-12
[runtime:1] Re-raised at Evaluation.apply.f in file "lang/evaluation.ml", line 318, characters 10-130
[runtime:1] Called from Evaluation.eval.ans.(fun) in file "lang/evaluation.ml", line 292, characters 46-56
[runtime:1] Called from Stdlib__List.map in file "list.ml", line 92, characters 20-23
[runtime:1] Called from Evaluation.eval.ans in file "lang/evaluation.ml", line 292, characters 18-60
[runtime:1] Called from Evaluation.apply.f in file "lang/evaluation.ml", line 315, characters 8-12
[runtime:1] Re-raised at Evaluation.apply.f in file "lang/evaluation.ml", line 318, characters 10-130
[runtime:1] Called from Evaluation.eval.ans.(fun) in file "lang/evaluation.ml", line 292, characters 46-56
[runtime:1] Called from Stdlib__List.map in file "list.ml", line 92, characters 20-23
[runtime:1] Called from Evaluation.eval.ans in file "lang/evaluation.ml", line 292, characters 18-60
[runtime:1] Called from Evaluation.eval.ans.(fun) in file "lang/evaluation.ml", line 292, characters 46-56
[runtime:1] Called from Stdlib__List.map in file "list.ml", line 92, characters 20-23
[runtime:1] Called from Evaluation.eval.ans in file "lang/evaluation.ml", line 292, characters 18-60
[runtime:1] Called from Evaluation.eval in file "lang/evaluation.ml", line 287, characters 15-27
[runtime:1] Called from Evaluation.apply.f in file "lang/evaluation.ml", line 315, characters 8-12
[runtime:1] Re-raised at Evaluation.apply.f in file "lang/evaluation.ml", line 318, characters 10-130
[runtime:1] Called from Tutils.finalize in file "tools/tutils.ml", line 105, characters 12-16
[runtime:1] Re-raised at Tutils.finalize in file "tools/tutils.ml", line 111, characters 4-38
[runtime:1] Called from Builtins_thread.(fun).f in file "builtins/builtins_thread.ml", line 77, characters 26-43

The old process never fully shuts down. The new process tries to start, but crashes with:

Error while trying to bind server/telnet socket: Address already in use in bind()

The client (butt in this case) stays connected and streaming. Eventually the logs are flooded with:

[input.harbor.ssl_0:3] Buffer overrun: Dropping 0.02s. Consider increasing the max buffer size!

To Reproduce Stripped down script:

settings.request.metadata_decoders.set([])

# Squelch useless ffmpeg errors.
settings.ffmpeg.log.verbosity.set('quiet')

settings.init.allow_root.set(false)

settings.log.file.set(false)
settings.log.level.set(3)
settings.log.stdout.set(true)

settings.request.leak_warning.set(200)

settings.harbor.ssl.private_key.set('/app/star_newgroundsradio_com.key')
settings.harbor.ssl.certificate.set('/app/star_newgroundsradio_com.cert')

settings.prometheus.server.set(true)
settings.prometheus.server.port.set(8002)

settings.server.telnet.set(true)
settings.server.telnet.port.set(8001)

%include "secrets.liq"

let ngr = ()
let ngr.inputs = ()

##
## INPUTS
##
let ngr.inputs.harbor_ssl = input.harbor.ssl(
  "radio",
  buffer=6.0,
  port=8000,
  timeout=10.0,
  user=icecast_mount_radio_usr,
  password=icecast_mount_radio_psw,
)

let ngr.inputs.harbor_ssl = normalize(
  debug=1.0,
  down=0.5,
  lookahead=2.0,
  # lufs=true,
  # target=-24.0,
  window=4.0,
  ngr.inputs.harbor_ssl
)

let ngr.inputs.harbor_ssl = blank.strip(max_blank=10.0, ngr.inputs.harbor_ssl)

##
## TRANSITIONS
##
let ngr.transitions = ()

def ngr.transitions.to_auto_dj(_, b)
  add(
    normalize=false,
    [
      once(single('/ngradio/jingles/piper_welcome_1.mp3')),

      fade.in(duration=6.0, type='sin', b),
    ]
  )
end

def ngr.transitions.to_live(a, b)
  add(
    normalize=false,
    [
      fade.out(duration=3.0, type='sin', a),

      fade.in(duration=3.0, type='sin', b),
    ]
  )
end

##
## PLAYLISTS
##
jingles = playlist(
  '/ngradio/jingles/jingles_electronic.m3u',
  mime_type='application/x-mpegURL',
  mode='randomize',
  reload=1,
  reload_mode='rounds',
)

chipstepnewhigh = playlist(
  '/ngradio/playlists/chipstep_newhigh.m3u',
  mime_type='application/x-mpegURL',
  mode='randomize',
  reload=86400,
)

chipstepnewlow = playlist(
  '/ngradio/playlists/chipstep_newlow.m3u',
  mime_type='application/x-mpegURL',
  mode='randomize',
  reload=86400,
)

chipstepoldhigh = playlist(
  '/ngradio/playlists/chipstep_oldhigh.m3u',
  mime_type='application/x-mpegURL',
  mode='randomize',
  reload=86400,
)

chipstepoldlow = playlist(
  '/ngradio/playlists/chipstep_oldlow.m3u',
  mime_type='application/x-mpegURL',
  mode='randomize',
  reload=86400,
)

chipstep = random(
  [
    chipstepnewlow,
    chipstepoldlow,
    chipstepoldhigh,
    chipstepnewhigh,
  ],
  weights = [
    1,
    1,
    10,
    20,
  ],
)

##
## Station Mixes
##
electronic = random(
  [
    chipstep,
  ],
  weights=[
    1,  # chipstep
  ],
)

##
## autodj
##
let ngr.autodj = random(
  [
    electronic,
  ],
  weights = [
    10, # electronic
  ],
)

let ngr.autodj = amplify(
  # Base multiplicative factor.
  1.0,
  ngr.autodj,
  override='replay_gain',
)

let ngr.autodj = blank.skip(ngr.autodj)

let ngr.autodj = fallback(
  [
    request.queue(id='request'),

    delay(900.0, jingles),

    ngr.autodj,
  ]
)

let ngr.autodj_last_metadata= ref([])
ngr.autodj.on_track(fun (m) -> ngr.autodj_last_metadata := m)

##
## RADIO
##
let ngr.radio = fallback(
  track_sensitive = false,
  transition_length = 10.0,
  transitions = [
    ngr.transitions.to_live,
    ngr.transitions.to_auto_dj,
  ],
  [
    ngr.inputs.harbor_ssl,
    ngr.autodj,
  ],
)

let ngr.radio = rms(ngr.radio)

let ngr.rms = ngr.radio.rms

let ngr.radio = lufs(ngr.radio)

let ngr.lufs = ngr.radio.lufs

let ngr.radio = bpm(ngr.radio)

let ngr.bpm = ngr.radio.bpm

let ngr.radio = mksafe(ngr.radio)

let ngr.radio = insert_metadata(ngr.radio)

##
## OUTPUTS
##
let ngr.outputs = ()

let ngr.outputs.mp3 = ffmpeg.encode.audio(
  %ffmpeg(
    %audio(
      b='128k',
      codec='libmp3lame',
      samplerate=44100,
    )
  ),
  ngr.radio
)

output.icecast(
  %ffmpeg(format='mp3', %audio.copy),
  ngr.outputs.mp3,
  host='192.168.128.14',
  mount='/radio.mp3',
  password=icecast_mount_radio_psw,
  protocol='http',
  port=8000,
  user=icecast_mount_radio_usr,
)

output.icecast(
  %ffmpeg(format='mp3', %audio.copy),
  ngr.outputs.mp3,
  host='192.168.128.15',
  mount='/radio.mp3',
  password=icecast_mount_radio_psw,
  protocol='http',
  port=8000,
  user=icecast_mount_radio_usr,
)

output.icecast(
  %ffmpeg(format='mp3', %audio.copy),
  ngr.outputs.mp3,
  host='192.168.128.16',
  mount='/radio.mp3',
  password=icecast_mount_radio_psw,
  protocol='http',
  port=8000,
  user=icecast_mount_radio_usr,
)

This also happens in production when ls crashes and input.harbor is connected. ls never fully shuts down and continuously floods the logs with Buffer overruns.

Expected behavior ls should shut down, disconnect all inputs, and restart cleanly.

Version details

brenc commented 2 years ago

By the way, I was experimenting with HLS and noticed that this does NOT happen when I comment out the icecast outputs. If I switch my output to this:

let ngr.outputs.aac_lofi = %ffmpeg(
  format="mp4",
  movflags="+dash+skip_sidx+skip_trailer+frag_custom",
  frag_duration=10,
  %audio(
    codec="aac",
    channels=2,
    ar=44100,
    b="32k",
  )
)

let ngr.outputs.aac_midfi = %ffmpeg(
  format="mp4",
  movflags="+dash+skip_sidx+skip_trailer+frag_custom",
  frag_duration=10,
  %audio(
    codec="aac",
    channels=2,
    ar=44100,
    b="64k",
  )
)

let ngr.outputs.aac_hifi = %ffmpeg(
  format="mp4",
  movflags="+dash+skip_sidx+skip_trailer+frag_custom",
  frag_duration=10,
  %audio(
    codec="aac",
    channels=2,
    ar=44100,
    b="128k",
  )
)

def ngr.outputs.hls_segment_name(~position, ~extname, stream_name)
  timestamp = int_of_float(time())
  duration = 2
  "#{stream_name}_#{duration}_#{timestamp}_#{position}.#{extname}"
end

output.file.hls(
  persist_at="state.config",
  playlist="radio.m3u8",
  segments=5,
  segment_duration=2.0,
  segments_overhead=5,
  segment_name=ngr.outputs.hls_segment_name,
  "/ngradio/hls/aac",
  [
    ("aac_lofi", ngr.outputs.aac_lofi),
    ("aac_midfi", ngr.outputs.aac_midfi),
    ("aac_hifi", ngr.outputs.aac_hifi),
  ],
  ngr.radio
)

Then connect harbor.input and restart (via sending SIGTERM), ls fully shuts down and restarts cleanly.

toots commented 2 years ago

Thanks for reporting. Could you try with this build once it finishes: https://github.com/savonet/liquidsoap/actions/runs/3094064733 ? The issue could be fixed with it.

brenc commented 2 years ago

Sure thing. Will this ever end up on Docker Hub?

toots commented 2 years ago

Yeah there were a bunch of different things that hit the CI but it's coming up right now: https://github.com/savonet/liquidsoap/actions/runs/3100710130

brenc commented 2 years ago

Just tried savonet/liquidsoap:rolling-release-v2.1.x and the issue is still there.

toots commented 1 year ago

Ha damn. Do you have any log of the issue with the latest rolling-release-v2.1.x?

brenc commented 1 year ago

Just tried it again with rolling-release-v2.1.x and it looks like #2584 is back, so I couldn't really test this (buffer empties, input keeps getting disconnected).

brenc commented 1 year ago

Something hit me while I was on a walk recently: I don't think I actually tried the latest rolling release. I don't often use images tagged like this so I totally forgot that you have to manually pull to get the latest image. This is probably why #2584 was still there.

I just pulled the latest rolling-release-v2.1.x and #2584 was definitely not there, but ls still crashes when harbor is connected and never fully shuts down.

toots commented 1 year ago

Thanks for testing! I'm gonna have another pass at this one!