metaeducation / rebol-issues


Rebol crashes when opening the 128th port #1422

Open rebolbot opened 14 years ago

rebolbot commented 14 years ago

Submitted by: PeterWood

I ran the "One-Line" TCP Scanner under R3. It completed, but reported every port as open (only port 80 is actually open on localhost). When I ran it a second time, Rebol crashed with System Error 1412.

This is easily repeatable by launching R3 and running the TCP Scanner twice.

tcp://localhost:27
27 is open
tcp://localhost:28
REBOL System Error:
REBOL System Error #1412: REBOL System Error

Program terminated abnormally.
This should never happen.
Please contact www.REBOL.com with details.

CC - Data [ Version: alpha 95 Type: Bug Platform: Mac OSX Category: Ports Reproduce: Always Fixed-in:none ]

rebolbot commented 14 years ago

Submitted by: PeterWood

This problem does not occur with R3-A94 under Windows XP.

rebolbot commented 14 years ago

Submitted by: Carl

Networking is different in R3. See http://www.rebol.net/wiki/TCP_Port_Open_Issue

Open allocates the socket and binds TCP. You must call open again to open the actual connection.

So if you use:

  close open open probe join tcp://localhost: n

it will work.

rebolbot commented 14 years ago

Submitted by: PeterWood

Thanks. I'll try to take a proper look at using the double open and run a few tests.

I did a quick check on Ubuntu (libc6) using A96 and could duplicate the 1412 crash on opening the 128th TCP port.

rebolbot commented 14 years ago

Submitted by: Carl

There's a design problem. The OPEN fails and an error signal is sent from the host (external) to the kernel (internal), but for some reason it never gets processed (the code was designed to be asynchronous), so the event queue overflows. This needs to be fixed.

Code example:

repeat n 200 [
    if not error? try [
        close open open join tcp://localhost: n
    ][
        print ["port" n "is open"]
    ]
]

rebolbot commented 13 years ago

Submitted by: abolka

Testcase added to the test suite (1232e9b).

rebolbot commented 9 years ago

Submitted by: fork

The problem here is that a Lookup_Socket() runs, which invokes Signal_Device():

    https://github.com/rebol/rebol/blob/25033f897b2bd466068d7663563cd3ff64740b94/src/os/dev-net.c#L319

Signal_Device()--in turn--calls RL_Event()...which seems to think that calling Append_Event() will set some kind of signal to process the queued event:

    https://github.com/rebol/rebol/blob/25033f897b2bd466068d7663563cd3ff64740b94/src/core/a-lib.c#L490

However, no signal is set. And even if a port signal were set, the only thing that might react to it and prevent overflow in Do_Signals() is #ifdef'd out:

    https://github.com/rebol/rebol/blob/25033f897b2bd466068d7663563cd3ff64740b94/src/core/c-do.c#L783

In the Atronix build, the number of events just seems to have been bumped up:

    https://github.com/zsx/r3/blob/e3d845297ebe0ad5d2dc541d3385f1c740b06c12/src/core/p-event.c#L89

I'm testing my own workaround, which sets a signal and then calls Awake_System() from Do_Signals() if it is set. I'm not clear on the ramifications, but thought I would mention that this is not a case of error propagation. Rather, clients of Append_Event() wrongly assume that a signal will be set.
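
To make the failure mode concrete, here is a minimal C sketch of it. This is not the Rebol source: EventQueue, append_event(), append_event_signalled(), do_signals() and simulate_opens() are hypothetical names invented for illustration, and QUEUE_CAPACITY is an arbitrary small number, not the real queue size. It only models the idea that events appended without a "pending" flag are never drained, so a bounded queue eventually overflows, while setting the flag lets the periodic signal check empty it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary illustrative capacity; NOT the real interpreter's queue size. */
#define QUEUE_CAPACITY 8

/* Hypothetical stand-in for the interpreter's bounded event queue. */
typedef struct {
    int events[QUEUE_CAPACITY];
    size_t count;
    bool signal_pending; /* hypothetical "events are waiting" flag */
} EventQueue;

/* Append without setting any flag: models the behaviour described above.
 * Returns false when the queue is full (the overflow case). */
bool append_event(EventQueue *q, int event) {
    assert(q != NULL);
    if (q->count >= QUEUE_CAPACITY) return false;
    q->events[q->count++] = event;
    return true;
}

/* Append and set the flag: loosely models the workaround idea. */
bool append_event_signalled(EventQueue *q, int event) {
    bool ok = append_event(q, event);
    if (ok) q->signal_pending = true;
    return ok;
}

/* Stand-in for the evaluator's periodic signal check. It drains the queue
 * only when the flag is set, and returns how many events were drained. */
size_t do_signals(EventQueue *q) {
    assert(q != NULL);
    if (!q->signal_pending) return 0;
    size_t drained = q->count;
    q->count = 0;
    q->signal_pending = false;
    return drained;
}

/* Simulate n device events, one per attempted open. Returns the 1-based
 * index of the first event that overflowed the queue, or 0 if none did. */
size_t simulate_opens(size_t n, bool signalled) {
    EventQueue q = {0};
    for (size_t i = 1; i <= n; i++) {
        bool ok = signalled ? append_event_signalled(&q, (int)i)
                            : append_event(&q, (int)i);
        if (!ok) return i;
        do_signals(&q); /* runs every iteration, but acts only if flagged */
    }
    return 0;
}
```

In this model, simulate_opens(200, false) returns 9 (the first event past the assumed capacity of 8), while simulate_opens(200, true) returns 0 because each event is drained before the next one arrives. Raising the capacity, as the Atronix build appears to do, only moves the index at which the unsignalled case fails.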