erlyaws / yaws

Yaws webserver
https://erlyaws.github.io
BSD 3-Clause "New" or "Revised" License
1.29k stars 267 forks source link

Improve finding of private ports #294

Closed weisslj closed 7 years ago

weisslj commented 7 years ago

Since df1d7bf Yaws does not really open port 0 anymore, but tries to open one from 49152 onwards. This leads to "badbind" errors when e.g. executing multiple separate common test runs which use Yaws repeatedly.

With this patch port 0 is opened instead, the port number retrieved, and closed again. This still is not completely safe, as the port could be taken by another application in the short time Yaws tries to listen on it again, but in practice I could not reproduce badbind errors.

This approach leaves all advantages of df1d7bf, but also respects the local range for dynamic ports if specified, see http://stackoverflow.com/a/924337.

Here is a test program which illustrates the problem. With this patch it runs continuously, without it stops with "badbind":

#!/bin/bash

NUM_CT_CONCURRENT=4

trap "exit" INT TERM
trap "kill 0" EXIT

sim_testcase () {
    echo "ct run $1, testcase $2"
    output=`erl -pa ebin -noshell \
        -eval 'yaws:start_embedded("www/testdir", [{port, 0}], [], "'$1'")' \
        -eval 'timer:sleep(1000)' \
        -eval 'yaws:stop()' \
        -eval 'init:stop()' \
        2>&1`
    retval=$?
    if [[ $retval -ne 0 ]] ; then
        case $output in
            *badbind*) echo "---> BADBIND <---" ;;
            *) echo "$output" ;;
        esac
    fi
    return $retval
}

sim_ct_run () {
    n=$(($2+1))
    if sim_testcase $1 $n ; then
        sim_ct_run $1 $n
    else
        kill 0
    fi
}

set -m
(
    for tag in `seq 1 $NUM_CT_CONCURRENT` ; do
        sim_ct_run $tag 0 &
    done
    wait
)
weisslj commented 7 years ago

Please disregard this pull request until I have figured out why the test case failed. I will push a fix to this branch as soon as possible.

capflam commented 7 years ago

Hi,

The error is a bug with the testsuite. It happens on Travis only, because VMs can be really slow probably. I don't found time to work on it for now. But this is unrelated with your patch. I guess that by restarting the job, it will pass.

weisslj commented 7 years ago

Hi @capflam,

I also believe that it is a bug with the testsuite, I am currently trying to find it. Without Travis about 2 out of 100 runs of make test trigger badbind.

capflam commented 7 years ago

Hi @weisslj,

I've finally merged your pull request. I had no time to find the bug in the testsuite. It's annoying but not critical. And your pull request was good enough to be merged. Thanks.

weisslj commented 7 years ago

OK, thanks! I spent some time after my comment to find out why the test suite fails, without success.