bitwiseworks / libc

LIBC Next (kLIBC fork)
9 stars 4 forks source link

A file stream associated with a socket is not flushed on exit #139

Closed komh closed 4 months ago

komh commented 4 months ago

Hi/2.

A file stream fdopen()ed from a socket seems not to be flushed on exit, especially stdout. Because of this, buffered outputs are lost. However, this works correctly with a pipe.

Here is a testcase. If commenting out fflush( stdout ) line, a parent cannot receives any output from a child.

socketpair4.c.txt

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <process.h>

#include <io.h>
#include <fcntl.h>

#include <sys/socket.h>

#define STDIN   0
#define STDOUT  1
#define STDERR  2

#define CHILD_MAGIC "__CHILD_MAGIC__"

static int setcloexec( int fd )
{
    int ret;

    ret = fcntl( fd, F_GETFD );
    if( ret != -1 )
        ret = fcntl( fd, F_SETFD, ret | FD_CLOEXEC );

    return ret;
}

static int parent( int inwr, int outrd )
{
    FILE *fpin;
    FILE *fpout;
    char line[ 256 ];

    fpin = fdopen( inwr, "w");
    fprintf( fpin, "Hello");
    fclose( fpin );

    fpout = fdopen( outrd, "r");
    fgets( line, sizeof( line ), fpout );
    fclose( fpout );

    fprintf( stderr, "Parent: received: [%s]\n", line );

    return 0;
}

static int child( void )
{
    char line[ 256 ];

    fgets( line, sizeof( line ), stdin );
    fprintf( stderr, "Child: received: [%s]\n", line );

    printf("World!!!");

    /* Without fflush()ing, the above output is only buffered, and not written
     * to a socket
     */
    fflush( stdout );

    return 0;
}

int main( int argc, char *argv[])
{
    int socksin[ 2 ], socksout[ 2 ];
    int savedin, savedout;
    int ret;

    if( argc > 1 && strcmp( argv[ 1 ], CHILD_MAGIC ) == 0)
        return child();

    socketpair( AF_LOCAL, SOCK_STREAM, 0, socksin );
    socketpair( AF_LOCAL, SOCK_STREAM, 0, socksout );

    setcloexec( socksin[ 0 ]);
    setcloexec( socksin[ 1 ]);

    setcloexec( socksout[ 0 ]);
    setcloexec( socksout[ 1 ]);

    savedin = dup( STDIN );
    dup2( socksin[ 0 ], STDIN );

    savedout = dup( STDOUT );
    dup2( socksout[ 1 ], STDOUT );

    spawnl( P_NOWAIT, argv[ 0 ], argv[ 0 ], CHILD_MAGIC, NULL );

    dup2( savedout, STDOUT );
    close( savedout );

    dup2( savedin, STDIN );
    close( savedin );

    close( socksin[ 0 ]);
    close( socksout[ 1 ]);

    return parent( socksin[ 1 ], socksout[ 0 ]);
}
SilvanScherrer commented 4 months ago

sounds like https://github.com/bitwiseworks/libcx/issues/31 did you try with libcx added to the build stage?

komh commented 4 months ago

Ooooops, it's the same problem and the same solution. ^^ BTW, why didn't you implement this in LIBCn ?

SilvanScherrer commented 4 months ago

iirc because it was easier in libcx

komh commented 4 months ago

Then, do you have no plan to implement in LIBCn ? It's a more general solution.

komh commented 4 months ago

I've investigated this, and found the cause.

Here is the patch.

0001-startup-flush-streams-explicitly-in-_CRT_term.patch

From 971bef07ea65d2f23ec6824de020e260c9397f99 Mon Sep 17 00:00:00 2001
From: KO Myung-Hun <komh@chollian.net>
Date: Thu, 20 Jun 2024 00:02:58 +0900
Subject: [PATCH] startup: flush streams explicitly in _CRT_term()

This fixes that buffered data of streams associatd with a socket are
lost.

This problem occurs because _CRT_term() calls weak terminators only if
its reference count is zero. However, all kLIBC programs are linked to
libc DLL dynamically. As a result, the count is not zero even if a
program ends, because libc DLL is not terminated, yet. That is,
_CRT_term() is not called yet in its _DLL_InitTerm(). Calling order is
like:

    1. _CRT_init() in _DLL_InitTerm() of libc
    2. _CRT_init() in crt0.s of a program
    3. _CRT_term() in exit() of a program
    4. Exit list routines registered by DosExitList()
    5. _CRT_term() in _DLL_InitTerm() of libc

After all, a program closes all the opened sockets in exit list of TCPIP
on step 4, but buffered data of streams are not flushed yet, because
_CRT_term() does not call __crtexit1__ set vector at this time. Flushing
is done when libc DLL is terminated on step 5. However, sockets are not
valid any more. As a result, buffered data of streams are lost.

And both exit() and abort() should flush streams, and they both calls
_CRT_term(). Therefore, flushing in _CRT_term() is proper. In addition,
it is not harmful even if calling flushall() too many.

    modified:   src/lib/startup/startup.c
---
 src/emx/src/lib/startup/startup.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/src/emx/src/lib/startup/startup.c b/src/emx/src/lib/startup/startup.c
index 50ef590b..50396a74 100755
--- a/src/emx/src/lib/startup/startup.c
+++ b/src/emx/src/lib/startup/startup.c
@@ -93,6 +93,18 @@ void _CRT_term(void)
 {
     LIBCLOG_ENTER("\n");
     int32_t cRefs = __atomic_decrement_s32(&gcCRTReferences);
+
+    /*
+     * Flush streams here explicitly not by the __crtexit1__ set vector.
+     * Otherwise streams associated with a socket can lose their buffered data,
+     * because the below weak terminators including _exit_streams() are called
+     * when all the imported DLLs are terminated. That is, all the opened
+     * sockets are closed before being flushed. Therefore buffered data of
+     * streams are lost. And both exit() and abort() should flush streams, so
+     * here is the proper place to flush streams.
+     */
+    flushall();
+
     if (cRefs == 0)
     {
         /*
-- 
2.42.0
dmik commented 3 months ago

Great job on tracking it! It really makes sense to move it to LIBCn. Back then (in 2017) we preferred not to touch kLIBC because there was no LIBCn IIRC (we forked in 2018). I'll check and commit.

dmik commented 3 months ago

I've tested it all and seems to work good. The patch was lacking a include (a warning), so I had to commit a modified version of it. I also had to change LF → CRLF, please bear that in mind for future patches, all LIBC is CRLF for historical reasons.

dmik commented 3 months ago

BTW, I somehow have a feeling that this infamous git push problem (where it reports annoying errors requiring 10x reties before it succeeds, see https://github.com/bitwiseworks/git-os2/issues/7) has gone now. But it needs more testing of course. At least two pushes to LIBCn and LIBCx worked at the first attempt here now.

komh commented 3 months ago

Thanks!

StevenLevine commented 3 months ago

Hi all,

Should there be new builds in netlabs experimental for this change and the associated libcx change?

Thanks,

Steven

--

"Steven Levine" @.***> Warp/DIY/BlueLion etc. www.scoug.com www.arcanoae.com www.warpcave.com

SilvanScherrer commented 3 months ago

just stay tuned. it will happen very soon.