Closed masinter closed 1 year ago
Submitted PR interlisp/maiko#128 to address the d_reclen
issue.
i spent some time with gdb documentation and stepping through main in gdb but couldn't get by the execvp Changes to compile in cygwin3 branch
You appear to have merged your cygwin3 branch into master without telling anyone?
See https://jeffkreeftmeijer.com/git-rebase/ for an explanation of what you probably wanted to do (git rebase, to re-apply your local branch changes over a copy of the updated master branch, without merging your experiment back into master)
time for me to go back through the git tutorials
The current state of this is that it crashes on master
, right?
Do you have a stack trace?
So, that's stack corruption it seems ...
Looking at the code, we see something that seems dangerous:
* sys_size is sysout size in megabytes */
int sysout_loader(char * sysout_file_name, int sys_size)
{
int sysout; /* SysoutFile descriptor */
IFPAGE ifpage; /* IFPAGE */
char *fptovp_scratch; /* scratch area for FPTOVP */
#ifdef BIGVM
/* 1 "cell" per page for 256MB in 512 byte pages */
unsigned int fptovp[0x80000]; /* FPTOVP */
#else
DLword fptovp[0x10000]; /* FPTOVP */
#endif /* BIGVM */
long fptovp_offset; /* FPTOVP start offset */
That's quite a decently sized stack allocation there for FPTOVP
.
You could try telling Cygwin to give it a bigger stack size via adding this to the gcc command line:
-Wl,--stack,33554432
Alternatively, you could malloc
that and make sure it gets freed in the exit path (or just turn it into a malloc
and leak it to see if that makes the crash go away...).
Now it compiles ok and doesn't crash and runs the opcode tests in test.vm, but doesn't recognize keypress or mouse clicks (same problem as with running under WSL1)
Which fix did you apply to prevent the crash?
I take back what I said:
I think I just compiled it with gcc instead of clang. I wasn't sure now to apply either.
I also added -Wl,--stack,33554432
to the gcc line
yes, fixes segv on startup. Now it's just hanging because it doesn't get mouse clicks or keystrokes
Good news. Presumably that's after recompiling without the 32MB stack option. I'm trying to come up with a principled way of getting the X events processed.
Steve reported having problems with running on WSL2 in his configuration -- screen comes up but no characters recognized. Is it possible to fork a little process to act as a keystroke forwarder that waits for the keyboard and sends events whenever it gets a keystroke or mouse move?
We could use some documentation on what WSL2 (or cygwin) supports for polled or interrupt driven I/O, and then we can try to fix it. I don't have access to a WSL2 system (or one with cygwin), so there's not much I can do, or suggestions I can make to those who do, without some input...
Hello,
cygwin apparently does not support asynchronous i/o or a virtual timer, as the callback routines int_io_service
resp. int_timer_service
in timer.c
are never called, so for example when pressing the right mouse button, the menu appears but then the system is stuck. (networking does not work either, as incoming packets are not noticed)
As work-around in my dodo-nethub-support branch of Maiko, i added simple count-down mechanism to the instruction dispatcher (dispatch
in xc.c
): the compile variable CYGWIN_TIMER_ASYNC_EMULATION_INSNS_COUNTDOWN
defines the interval as number of instructions for simulating an external i/o interrupt, so the interrupt checking machinery is fired in more or less regular intervals.
This variable is set in makefile-cygwin.x86_64-x
(and only there) to 20000. To prevent compile errors, enabling async i/o in ether.c
for the Nethub socket is made dependent of the presence of O_ASYNC
(which is missing in Cygwin)
So i have now a working version of Maiko running natively on Windows10 as Cygwin executable which can do networking with Dodo XNS services and uses a local X11 server (VcXsrv 1.20.14.0). The Maiko process behaves normally in my understanding, the overall CPU load of the system is around 1-3% when the Medley environment is idle and goes up to ~15% (100% on about 1 out of 8 logical CPUs) when the right mouse button is pressed in Medley. So even if i/o is checked all 20.000 instructions, only minimal CPU resources are consumed when the Medley system is idle and only one CPU goes to 100% while a context menu is shown.
However networking is a bit "lethargic" compared to async i/o on Linux, but it is usable. Maybe using a lower value for CYGWIN_TIMER_ASYNC_EMULATION_INSNS_COUNTDOWN
could make it more snappy, but this value must be adjusted to the local hardware anyway...
Greetings, Hans
The configuration information that is specific to all instances of an OS like cygwin should be incorporated in the inc/maiko/platform.h -- we have carefully removed most of the system specific defines from the makefile fragments and I'd rather not move backwards there, if you don't mind.
Since there are likely to be other systems where simulating the async timer would be useful, I would suggest that the #define not be CYGWIN_... but rather MAIKO_EMULATE_TIMER_INTERRUPTS (or similar) and look at how to integrate it should other systems need it.
Oddly, cygwin defines FASYNC rather than O_ASYNC -- not sure what's going on there.
Can you check if current cygwin supports ITIMER_REAL rather than ITIMER_VIRTUAL — I don’t have any Windows boxes to check this on, but chatter from about 17 years ago suggests that starting with cygwin version 1.5.16 it supported the realtime timer.
On Oct 1, 2022, at 2:13 PM, devhawala @.***> wrote:
Hello,
cygwin apparently does not support asynchronous i/o or a virtual timer, as the callback routines int_io_service resp. int_timer_service in timer.c are never called, so for example when pressing the right mouse button, the menu appears but then the system is stuck. (networking does not work either, as incoming packets are not noticed)
As work-around in my dodo-nethub-support https://github.com/devhawala/maiko/tree/dodo-nethub-support branch of Maiko, i added simple count-down mechanism to the instruction dispatcher (dispatch in xc.c): the compile variable CYGWIN_TIMER_ASYNC_EMULATION_INSNS_COUNTDOWN defines the interval as number of instructions for simulating an external i/o interrupt, so the interrupt checking machinery is fired in more or less regular intervals.
This variable is set in makefile-cygwin.x86_64-x (and only there) to 20000. To prevent compile errors, enabling async i/o in ether.c for the Nethub socket is made dependent of the presence of O_ASYNC (which is missing in Cygwin)
So i have now a working version of Maiko running natively on Windows10 as Cygwin executable which can do networking with Dodo XNS services and uses a local X11 server (VcXsrv 1.20.14.0). The Maiko process behaves normally in my understanding, the overall CPU load of the system is around 1-3% when the Medley environment is idle and goes up to ~15% (100% on about 1 out of 8 logical CPUs) when the right mouse button is pressed in Medley. So even if i/o is checked all 20.000 instructions, only minimal CPU resources are consumed when the Medley system is idle and only one CPU goes to 100% while a context menu is shown.
However networking is a bit "lethargic" compared to async i/o on Linux, but it is usable. Maybe using a lower value for CYGWIN_TIMER_ASYNC_EMULATION_INSNS_COUNTDOWN could make it more snappy, but this value must be adjusted to the local hardware anyway...
Greetings, Hans
— Reply to this email directly, view it on GitHub https://github.com/Interlisp/medley/issues/103#issuecomment-1264482821, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB6DAWMY5GHPKW3ARV4FFZLWBCSPDANCNFSM4VF54H7A. You are receiving this because you commented.
Here some intermediate results:
ITIMER_REAL
for setitimer
breaks the program startup: the connection to the Nethub is opened, the main X-window opens and the UI is shown, but then the programs dies very quickly without further information (no console output at all); i tried it with in normal mode as well as elevated mode (i.e. with or without admin rights, the Windows equivalent to running as root or with setuid
), so no idea what's wrongITIMER_REAL
cannot replace ITIMER_VIRTUAL
(may have worked up to Windows-XP resp. before Windows-Vista)FASYNC
is present in the cygwin libs, but does not help, incoming TCP/IP packets keep being unnoticedMeaning the work-around simulating timer and i/o interrupts will be needed for the cygwin version.
So next things will be incorporating the Addr68k/NativeAlignment{24} changes and moving cygwin-specific dependencies to platform.h
I suspect that running with ITIMER_REAL may be causing an interrupt before the rest of the system is ready for it. I'll check out the situation with ITIMER_REAL on other systems and see if it's debuggable there.
I also notice that looks as though you're making the ether fd async before the async signal handling is set up -- it can't be made to generate interrupts before int_io_init() is called -- which in the USE_DLPI case of raw ethernet is where the I_SETSIG is applied to the ether fd.
OK -- the issue is probably that if you change to ITIMER_REAL you have to have a signal handler for SIGALRM instead of a handler for SIGVTALRM... did you change that too?
Yes... with ITIMER_REAL and SIGALRM it works the same on macOS as it does with ITIMER_VIRTUAL/SIGVTALRM.
well ... not on cygwin: the active section of int_timer_init()
is as follows in my local test version, but defining USE_ITIMER_REAL
in platform.h
for cygwin lets ldex
abort the startup (not defining it works ok):
struct itimerval timert;
struct sigaction timer_action;
timer_action.sa_handler = int_timer_service;
sigemptyset(&timer_action.sa_mask);
timer_action.sa_flags = SA_RESTART;
#ifdef USE_ITIMER_REAL
int intAlarm = SIGALRM;
int intTimer = ITIMER_REAL;
printf("** using ITIMER_REAL for timer\n");
#else
int intAlarm = SIGVTALRM;
int intTimer = ITIMER_VIRTUAL;
#endif
if (sigaction(intAlarm, &timer_action, NULL) == -1) {
perror("sigaction: SIGVTALRM");
}
/* then attach a timer to it and turn it loose */
timert.it_interval.tv_sec = timert.it_value.tv_sec = 0;
timert.it_interval.tv_usec = timert.it_value.tv_usec = TIMER_INTERVAL;
setitimer(intTimer, &timert, NULL);
for "making the ether fd async": i removed that from connectToHub()
as it also happens in the SUBRs ether_resume()
and ether_ctrlr()
, which both should be called after the Lisp system is up (and it still works of course)
You changed all the instances?
timer.c: if (sigaction(SIGALRM, &timer_action, NULL) == -1) {
timer.c: perror("sigaction: SIGALRM");
timer.c: sigaddset(&signals, SIGALRM);
timer.c: sigaddset(&signals, SIGALRM);
timer.c: sigaddset(&signals, SIGALRM);
timer.c: sigaddset(&signals, SIGALRM);
timer.c:/* Error handling routine for SIGALRM. Called when any */
timer.c:/* Set up the signal handler for SIGALRM, to catch TIMEOUTs: */
timer.c: if (sigaction(SIGALRM, &timer_action, NULL) == -1) {
timer.c: perror("sigaction: SIGALRM");
unixfork.c: sigaddset(&signals, SIGALRM);
unixfork.c: sigaddset(&signals, SIGALRM);
Inspecting the code, I believe the timer emulation code is going to fail if you compile with OPDISP.
no, not all instances, but i think there was no need for:
timer.c
/ int_timer_init()
: this is the code snippet in my previous post (ok, the perror()
still says "SIGVTALRM", but it's only a log message)timer.c
/ int_block()
: sigaddset
is already done for both SIGVTALRM
and SIGALRM
, so nothing changed heretimer.c
/ int_unblock()
: sigaddset
is already done for both SIGVTALRM
and SIGALRM
, so nothing changed heretimer.c
/ int_file_init()
: sigaddset
is only done for SIGALRM
, so nothing changed hereunixfork.c
/ fork_Unix()
: sigaddset
is already done for both SIGVTALRM
and SIGALRM
, so nothing changed hereThese were the only instances of the string "ALRM" that i found (except for comments or strings) in those files.
for the OPDISP: is this still used?
Scanning with grep
, it seems that is was used for the sunos-variants (Makefiles in bin/legacy
) and possibly with DOS, but it is not defined in platform.h
or in any Makefiles in bin
.
OPDISP with the threaded dispatch is new. It improves the performance.
OPDISP is not turned on by default yet, when we do turn it on, that could be done for systems other than Cygwin.
BTW -- where do you want bug reports for Dodo services submitted? The ECHO XIP packets that are returned by, for example, the file server implementation, don't match the echo packets it receives (missing the data, 2 bytes shorter than the one original sent)
See the commit log for
commit b7e9529322269553d6c7bdd4741677270959cf09
Author: Bruce Mitchener <bruce.mitchener@gmail.com>
Date: Sat Jan 23 03:28:16 2021 +0700
We replaced the assembler fast dispatch with gcc's (and clang's) computed goto extension.
bug reports please directly to issues in the Dodo project.
for OPDISP dispatch: if there is no common code segment where all instructions come through (i get lost each time i try to follow the #define
and goto
maze in xc.c
), then timer emulation may be excluded from fast dispathing. No problem, Maiko seems fast enough for me as it is now...
Missing include for <netinet/in.h>
prevents it compiling on FreeBSD.
clang -m32 -std=gnu99 -fno-strict-aliasing -c -O2 -g -DMAIKO_ENABLE_NETHUB -I/usr/local/include -DXWINDOW -DRELEASE=351 -I../inc/ -I../include/ ../src/ether.c -o ../freebsd.386-x/ether.o
../src/ether.c:260:22: error: variable has incomplete type 'struct sockaddr_in'
struct sockaddr_in hubaddr;
^
../src/ether.c:260:10: note: forward declaration of 'struct sockaddr_in'
struct sockaddr_in hubaddr;
^
1 error generated.
added the include and just created the merge request Dodo Nethub support #445
Running with cygwin works fine now. It would probably be useful for running on older Windows systems.
I am closing this issue -- is it worth a mention in documentation or a special build ?
Currently in maiko/cygwin2 branch:
*-*-cygwin*) echo cygwin ;;
len== strlen(dp->d_name)
would have been better (in dsk.c )now it compiles, but dies when run: