ghaerr / elks

Embeddable Linux Kernel Subset - Linux for 8086
Other
1.02k stars 108 forks source link

NEC PC-98 enhancements to ELKS #1238

Closed ghaerr closed 2 years ago

ghaerr commented 2 years ago

Continued conversation from #1047.

tyama501 commented 2 years ago

Hello @ghaerr ,

I modified bios_disk_rw() and simplified bios1B.S as follows. https://github.com/tyama501/elks/commit/ce55cce26d305cc78d5fee0aef1cced5a8919fcf

With this modification, I could read/write the partition > cylinder 1023.

So the next step is to modify hd_drive_map?

Thank you!

ghaerr commented 2 years ago

Hello @tyama501,

I modified bios_disk_rw() and simplified bios1B.S as follows.

Nice simplification of bios1B.S. I think, once hd_drive_map[] is modified, then your current changes to bioshd.c can also be simplified, so that hard-coded 0xA0, 0x90, 0x30 won't be required, but looked up calling map_drive.

After the full modifications, I assume in line 173 you won't be comparing against an IBM PC BIOS value of 0x80, but instead using the PC-98 PDA value. This could possibly be changed by calling map_drive at a later point.

So the next step is to modify hd_drive_map?

Yes, I think so.

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

I think I am almost done except for getting boot partition. Here is my modified code. https://github.com/tyama501/elks/compare/master...merge_pc98

It seems I need to modify the following part of drivers/block/genhd.c to set boot_partition which will be used in bioshd_conv_bios_drive. Do you have any idea what is the right way to set for this "if" condition?

if (ROOT_DEV == (0x80 | minor >> hd->minor_shift)) {
    sector_t boot_start = SETUP_PART_OFFSETLO | (sector_t) SETUP_PART_OFFSETHI << 16;
    if (start == boot_start)
        boot_partition = minor;
}

boot_partition

fd is bootable. boot_fd

ghaerr commented 2 years ago

Hello @tyama501,

I think I am almost done except for getting boot partition. Here is my modified code. https://github.com/tyama501/elks/compare/master...merge_pc98

Your changes look very nice. Perhaps use "<< 9" instead of "* 512" on lines 184, etc, as the emitted code should be smaller.

In line 186, (sector - 1) could result in 0xFFFF if sector == 0, you may need to say (sector - 1) & 0xFF?

Do you have any idea what is the right way to set for this "if" condition?

I think it should just use 0xA0 which is the PC-98 BIOS value for hard disk, rather than IBM PC's 0x80. Thus just the first line would change to:

if (ROOT_DEV == (0xA0 | minor >> hd->minor_shift)) {

This could possibly be done with a #define BIOS_HD_DRVNUM 0xA0 (or 0x80) for PC.

This should hopefully work, since it appears from your first screenshot above that BIOS drive 0xA1 is resulting in root device 0x320, when it should be 0x321.

ghaerr commented 2 years ago

Also, the following code is not really correct:

#ifdef CONFIG_ARCH_PC98
    for (minor = 0; minor < NUM_DRIVES; minor++) {
    if (biosdrive == hd_drive_map[minor]) break;
    }
    ...

It has two issues: first, it calculates "minor" by looping through up to 8 times, while minor is only valid when 0 <= minor <= 3. Thus, minor probably needs to be "anded with 3" outside the loop. The problem is that minor is only really useful for hard drive SCSI mapping, so perhaps only loop through 4 times.

Secondly, there is the possibly that a match isn't found (unlikely), in which case minor would be out of range.

tyama501 commented 2 years ago

Thank you @ghaerr ,

I modified like this but it seems there are a couple of problems.

ifdef CONFIG_ARCH_PC98

printk("\nROOT_DEV %x, minor %d, shift %d, start %ld\n", ROOT_DEV, minor, hd->minor_shift, start);
printk("OFFSETLO %d, OFFSETHI %d\n", SETUP_PART_OFFSETLO, SETUP_PART_OFFSETHI);
if (ROOT_DEV == (0xA0 | minor >> hd->minor_shift)) {

else

if (ROOT_DEV == (0x80 | minor >> hd->minor_shift)) {

endif

    sector_t boot_start = SETUP_PART_OFFSETLO | (sector_t) SETUP_PART_OFFSETHI << 16;
    if (start == boot_start)
        boot_partition = minor;
}

(1) SCSI_ID = 1 for boot device and there is no SCSI_ID = 0 device

ROOT_DEV = 0xA1 and (0xA0 | minor >> hd->minor_shift) = 0xA0 so it cannot set boot_partition

boot_partition3

(2) SCSI_ID = 1 for boot device and there is also SCSI_ID = 0 device

ROOT_DEV = 0xA1 and (0xA0 | minor >> hd->minor_shift) = 0xA1 so it can set boot_partition but root_device is incorrect. ( It should be 0x321)

boot_partition2

(3) SETUP_PART_OFFSETLO/SETUP_PART_OFFSETHI

Where the SETUP_PART_OFFSETLO/SETUP_PART_OFFSETHI values are set? Although 200 is correct value for the first partition but I couldn't find where it is set.

BTW, I purchased PC-9801BX and waiting to arrive. It has intel 486 and can use IDE hard drive, 1.44MB floppy. Although the IDE hard drive has been removed so I need to get one, we can consider supporting IDE hard drive which has device address 0x80 later.

ghaerr commented 2 years ago

Hello @tyama501,

I see, I was incorrect earlier. I'm thinking the code should read something like the following, which would use the BIOS drive mapping table to map the minor number to the BIOS PDA, which would then be compared to the boot PDA:

if (ROOT_DEV == (hd_drive_map[minor >> hd->minor_shift])) {

In this case, when ROOT_DEV (the boot BIOS drive ID) is 0xA1, it would associate that with the minor number that matches its map location, and things should work. (hd_drive_map will have to be declared external and not static). It seems this should work for both IBM PC and PC-98 without modification.

Where the SETUP_PART_OFFSETLO/SETUP_PART_OFFSETHI values are set? Although 200 is correct value for the first partition but I couldn't find where it is set.

These are set in the last lines of code in the FAT boot loader sect_boot_fat.h (part_offset, +2, previously stored as sect_offset, +2).

tyama501 commented 2 years ago

Hello @ghaerr ,

The modification worked for the (1) case. It could boot from 0xA1.

boot_A1

The (2) still have wrong root_dev which is 0x341. I added debug message and found it comes from the code below.

bioshd.c kdev_t INITPROC bioshd_conv_bios_drive(unsigned int biosdrive) { ... if ((biosdrive & 0xF0) == 0xA0) partition = boot_partition; ... return MKDEV(BIOSHD_MAJOR, (minor << MINOR_SHIFT) + partition); (<----- Here minor = 1, and partition = 33 (0x21)) }

Should the boot_partition be 0x01 instead of 0x21?

boot_A1_2

Thank you.

ghaerr commented 2 years ago

Hello @tyama501,

Should the boot_partition be 0x01 instead of 0x21?

Yes, it seems that you may have found a general bug in the ELKS boot partition identification code. The 'boot_partition' variable should only be the partition number (0-7), not the full minor number plus partition number. Thus, probably the following would fix this:

        sector_t boot_start = SETUP_PART_OFFSETLO | (sector_t) SETUP_PART_OFFSETHI << 16;
        if (start == boot_start)
            boot_partition = minor & 0x7;

I am sorry this has taken so long to figure out, but it would seem perhaps nobody has booted from other than partition 1 on a second drive before?

Should this get everything working, go ahead and clean up and submit a PR. I can then test on IBM PC.

Thank you!

tyama501 commented 2 years ago

Thank you @ghaerr ,

It's OK. I could boot with SCSI ID 0/1/2/3.

I will clean the code and make PR.

Thank you!

tyama501 commented 2 years ago

Add codes for bios_disk_rw, Simplified int1B, Modified bootcode for PC-98 #1251

Thank you for the merging!

tyama501 commented 2 years ago

Hello @ghaerr ,

I found you have added nice basic.

PC-98 has graphic routine called LIO In ROM which originally used for BASIC.

I have written ASM codes for LSI-C before to use these directly as following. https://github.com/tyama501/CLIO98

Is it interesting if I rewrote these codes with gcc-ia86 asm and use from your basic?

ghaerr commented 2 years ago

Hello @tyama501,

Is it interesting if I rewrote these codes with gcc-ia86 asm and use from your basic?

Sure, why not?! It seems everyone can have fun with an old-fashioned basic interpreter. I know I have :)

I have written ASM codes for LSI-C before to use these directly as following.

Perhaps you can use an approach similar to the INT 1Bh C->ASM interface so that most of the code stays in C, rather than ASM. You could also pass the interrupt number itself and use self-modifying code to change the INT to what it needs to be (A7h, ACh, etc). Or keep it all in assembly.

Having graphics on PC98 (and IBM PC) would be fun. If you want to jump into this, I would suggest that you keep the BASIC statements such as PLOT and DRAW (in this case) compatible with the original Sinclair BASIC, by reading the manual, implementing the BASIC parsing in basic.c, but calling out to the specific host implementation using new host_xxx routines only, like it does now for all external interfacing. This keeps the core interpreter very portable.

@cocus has already added code to play with buttons and lights on his board, as you may have seen. I am thinking that instead of keeping that code in host.c, we could create platform-specific .c files like host-8018x.c and host-pc98.c for hardware interfacing. The Makefile could then control which files get linked in.

I plan on continuing to make a few changes to make basic.c more compatible with some Sinclair BASIC, so that we might be able to run more of the text and graphics demos available from https://github.com/ZXDunny/SpecBAS-Demos. However, those demos use a non-Dartmouth BASIC method fo string manipulation (see Chapter 8 in manual), which is different than the LEFT$/RIGHT$/MID$ method which is already implemented. So we have already departed from strict compatibility from Sinclair BASIC.

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

Thank you for the information.

already added code to play with buttons and lights on his board, as you may have seen.

Wow, great video.

It seems the manual web page has security warning from mcafee web advisor and I couldn't see. So I searched another website. Maybe I can reference these site.

Thank you!

https://en.wikipedia.org/wiki/Sinclair_BASIC#Interpreters_for_the_ZX_Spectrum_family

https://worldofspectrum.org/ZXBasicManual/

ghaerr commented 2 years ago

Yes those references look great. I have reviewed the various Sinclair documentation I've found and will update the elkscmd/basic/README.md with a list of implemented commands and functions, as well as differences between our basic and Sinclair, and a list of so-far unimplemented commands and functions. We can then use that as our reference for syntax.

For graphics modes, we will need commands to enter and exit graphics mode(s). I will attempt to add them to the README.md as well.

On another note, I tested your PC-98 changes for determining boot partition on IBM PC, and the ifdef in genhd.c will not be needed, we can use your new code. I will change this in a small update PR with the basic readme.

tyama501 commented 2 years ago

Hello @ghaerr ,

I got PC-9801BX(intel 486) and I could boot it with FD1440 so I attached movie. This is the first time I tried FD1440 in the real PC. The drive can read 1440/1232 so maybe I will think how to probing the disk after I play with "basic" :)

The PC also support IDE hard drive but it has been removed so I need to get one.

https://user-images.githubusercontent.com/61556504/162612668-766c19e6-5dd9-4951-ad24-8b09c71fe4f8.mp4

Thank you!

ghaerr commented 2 years ago

Hello @tyama501,

Thanks for the movie, super nice ELKS is working well on your new system!

The PC also support IDE hard drive but it has been removed so I need to get one.

Your recent work on keeping physical device address (PDA) in hd_drive_map should work well for IDE hard drive, with the following changes possibly needed:

after I play with "basic" :)

I've added all the new functions I planned on, so you should be good to go for adding graphics commands/functions if you like. Our documentation now shows possible syntax for unimplemented commands and functions. It shows a (non-standard) MODE command that will probably be needed in ELKS to switch between text and graphics modes, whose BIOS number will likely be different between PC-98 and IBM PC.

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

Thank you for the comments.

I have started to add host-pc98.c as follows but it seems I got memory problem. https://github.com/tyama501/elks/commit/4116b4354f37e669f9359e20f6eab5daabb8bf45

LIO needs about 5.2KByte working memory but it seems failed to allocate. ELKS_basic_lio

If I decreased to 4KByte it can be allocated but If I tried PLOT x,y the stack overflow occurred. Are they too big memory to handle for ELKS?

ELKS_basic_lio_segmentation

ghaerr commented 2 years ago

Hello @tyama501,

LIO needs about 5.2KByte working memory but it seems failed to allocate. If I decreased to 4KByte it can be allocated but If I tried PLOT x,y the stack overflow occurred. Are they too big memory to handle for ELKS?

Your program seems fine - the problem is that the standard heap allocation for ELKS programs is 4K (with an additional 4K default for stack). This should be easily fixed by adding -maout-heap=8192 (or similar, depending on requirements) to the Makefile link line. You can see this is done for many other programs in elkscmd/*/Makefile. A stack value can be increased or decreased from the default using -maout-stack=.

You can also modify a programs heap or stack request at ELKS runtime by using chmem -h 8192 /bin/basic. Running chmem /bin/basic will show the current allocation, with 0 meaning default.

Thank you!

ghaerr commented 2 years ago

I have started to add host-pc98.c as follows

Looks good. I can see we're going to need a better way of adding commands and functions to basic.c/basic.h. I am working on adding READ/DATA/RESTORE, and you've got PLOT and LIO98INI, using conditional compilation. The interpreter wants to have a sequential list in basic.h, so that would mean conditional compilation would always have to come last, which will likely turn into a problem.

I'll try to think up a better answer for this, perhaps adding the ability for the tokenizer to use seperate (conditionally compiled) token tables. Even with my recent work on READ/DATA etc, I'm finding having to match the token name/position/number with basic.h a bit tedious.

Another discussion point, do you like the idea of having a separate LIO98INI statement, or should basic just automatically initialize this when compiled in? Does this command change the screen into graphics mode? If so, perhaps we should use a generic "MODE [n]" or "GRAPHICS [n]" statement, which sets one of several possible graphics modes. On the IBM PC, there are a number of possible BIOS graphics modes.

Thank you!

tyama501 commented 2 years ago

Thank you @ghaerr ,

After -maout-heap=8192, the PLOT worked at least for the emulator. ( For real PC bios I might need to add some interrupt handler that is called from inside the bios )

ELKS_basic_lio_plot

Does this command change the screen into graphics mode? If so, perhaps we should use a generic "MODE [n]" or "GRAPHICS [n]" statement, which sets one of several possible graphics modes.

I am not sure it is called graphic mode but I used int A1/A2/A3 to set 640x400 screen, views, 16 colors. Maybe we can move these to "MODE [n]" or "GRAPHICS [n]".

Thank you!

tyama501 commented 2 years ago

Oh, is there plan to add "INK" command?

Since there is no INK command I used white color.

ghaerr commented 2 years ago

I am not sure it is called graphic mode but I used int A1/A2/A3 to set 640x400 screen, views, 16 colors. Maybe we can move these to "MODE [n]" or "GRAPHICS [n]".

Yes, go ahead and rename your LIO98INI command to MODE for the time being. It can be (later?) enhanced to take a number argument, which would be used to switch between the "normal, text" mode and perhaps several graphics modes. We could use MODE 0 to mean normal text, and any other number to be system specific. (IBM PC has BIOS text modes of 3 and 7, for instance, and other numbers could be used to select various resolutions and/or colors). You can use whatever numbering you would like for selecting the PC-98 graphics mode(s).

On the IBM PC, BASIC will probably have to switch back to MODE 0 before exiting, so that the shell prompt can be seen. This can all be worked out in more detail later.

Oh, is there plan to add "INK" command?

It looks like it is needed, if you like go ahead and implement "INK ", used to select the foreground drawing color.

For back porting into the source tree, I will try to devise a separate token array for these graphics commands, which will make it easier for you to maintain this work, as other non-graphics commands are added to the original token array. If each new token is associated with a separately numbered TOKENGR_xxx and associated hostgr_xxx, then we won't have to ifdef each of the calls in the main interpreter. Instead, the second token array will always be compiled in, and the interpreter will call out to each hostgr_xxx routine, which will be stubbed out in host-stubs.c for those systems that do not include graphics (like the IBM PC initially). The TOKENGR_xxx numbering could start at a fixed value, say 128, to keep things portable.

tyama501 commented 2 years ago

Hello @ghaerr ,

On the IBM PC, BASIC will probably have to switch back to MODE 0 before exiting, so that the shell prompt can be seen. This can all be worked out in more detail later.

Actually PC-98 text vram plane and graphic vram plane are separated and text is overlaid on the graphic so there is no text mode. Maybe we need to clear graphic plane when exiting basic.

I will rename to MODE but setting interrupts and allocating memory should be done only once so I will check if it has been done.

I also need to add intC5 interrupt handler since the real PC LIO Bios is calling this periodically. The stack overflow I mentioned before was occurring because I was using real PC Bios data and there was no handler. (NekoProject emulator has the host LIO function so after I removed the data it worked)

I think I have read somewhere that this handler is prepared to handle escape sequence like Ctrl-C but maybe just iret is ok for now.

Do you think implementing the handler can be done with attribute((interrupt)) void interrupt_handler(struct interrupt_frame* frame){} according to the following? If there is more good way please tell me.

https://wiki.osdev.org/Interrupt_Service_Routines#GCC_.2F_G.2B.2B

Thank you.

ghaerr commented 2 years ago

Hello @tyama501,

text is overlaid on the graphic so there is no text mode. Maybe we need to clear graphic plane when exiting basic.

That's ok, I think better to just leave screen as-is when exiting. At least, only MODE command should change screen. For PC-98, looks like no changes will be required. We can worry about IBM PC at time someone implements that.

I think I have read somewhere that this handler is prepared to handle escape sequence like Ctrl-C but maybe just iret is ok for now.

I'm not following you on the ^C issue - the ELKS kernel should handle all keyboard I/O, and we need to be careful that anything added to BASIC does not interfere with what kernel expects to be handling.

I also need to add intC5 interrupt handler since the real PC LIO Bios is calling this periodically.

That should be fine, as long as interrupt handling is only related to LIO and/or BIOS callbacks, which are never handled by kernel. Of course, should BASIC exit or crash without resetting that interrupt vector the system will be in an unstable state. This means an atexit() handler should probably be used to ensure vectors are reset.

Do you think implementing the handler can be done with attribute((interrupt)) void interrupt_handler(struct interrupt_frame* frame){} according to the following?

No - I seem to remember the big problem with using C for an actual interrupt handler is that GCC always assumes that SS=DS on interrupt entry, as it sometimes generates code using (BX) or (BP) to access C variables, and each use DS and SS register respectively. Much better to write your interrupt handler in ASM, and add to project. You will need to be very careful in handler if accessing C variables for the same reason. The handler cannot make any assumptions about any register contents, and should not switch stacks, as the interrupt may occur even when BASIC is switched out by kernel and not running. My recomendation is to push as few registers as possible, and IRET as soon as possible. We can talk more about this after I understand exactly what handler has to do.

Also, when interrupt vectors are set or restored, CLI/STI should be performed for safety. We are running as user program, not as kernel, so things are definitely messier.

Thank you!

cocus commented 2 years ago

Regarding the BASIC keywords, I checked and I think we can split them to a different source file, but the problem is the actual parsing of these keywords. For instance, we can extend the keyword array somehow (like making each source file to declare an array that lands on a specific program segment of the elf, and let the base code traverse thru all of them on runtime), but it would be quite inconvenient to do this for the parsing of arguments. It might be doable of we expect an api from the platform it'll be compiled against, some sort of hooks into the appropriate base functions of the basic interpreter. Worst case scenario would be to flood the source with ifdefs.

I'm still a little bit curious about the video initialization stuff on the pc98 and why you can't re-initialize if needed. As @ghaerr mentioned, if this is somehow used, it'll have to revert back to the previous non-overlaid graphics on top of text mode when breaking or exiting the interpreter

ghaerr commented 2 years ago

I checked and I think we can split them to a different source file, but the problem is the actual parsing of these keywords.

Agreed. I've come to the conclusion that it likely isn't worth the effort to try to split the parsing apart, and that the keywords can stay compiled in with no ifdefs, as well. We don't have enough complexity to justify writing all that code. Frankly, the interpreter is just beginning to be actually useful. (Although I've also realized there is no such thing as a truly portable BASIC program).

Worst case scenario would be to flood the source with ifdefs.

Actually, I think the simplest and best answer is to no use ifdefs at all, for the keywords nor the parsing (or the host_xxx routines). Have the parsed command call the host_xxx function, which will be outside basic.c. The Makefile can then control whether a stub function is called (which could display an error or not implemented), or the platform-specific function used. This way, the interpreter recognizes and tokenizes all possible keywords, which would allow those BASIC programs to at least be read in and LISTed. Having different versions of the interpreter not even recognize ELKS-specific platform command and function variations will lead to a mess.

cocus commented 2 years ago

Actually, I think the simplest and best answer is to no use ifdefs at all, for the keywords nor the parsing (or the host_xxx routines). Have the parsed command call the host_xxx function, which will be outside basic.c. The Makefile can then control whether a stub function is called (which could display an error or not implemented), or the platform-specific function used. This way, the interpreter recognizes and tokenizes all possible keywords, which would allow those BASIC programs to at least be read in and LISTed. Having different versions of the interpreter not even recognize ELKS-specific platform command and function variations will lead to a mess.

Oh, I think stubbing the functions not used on the other platforms would make a lot more sense. Not from the actual user perspective, but oh well!

ghaerr commented 2 years ago

Not from the actual user perspective, but oh well!

What should the stubbed function do? Display an error or ignore? On or the other would effectively be the same as if the command/function were not compiled in, from the user perspective.

cocus commented 2 years ago

What should the stubbed function do? Display an error or ignore? On or the other would effectively be the same as if the command/function were not compiled in, from the user perspective.

Yes, stating that keyword isn't valid should be enough

tyama501 commented 2 years ago

Hello @ghaerr , @cocus ,

I'm not a native English speaker so maybe I need some time to understand the discussion.

So the conclusion for now is not to use ifdef but use stubbed function?

I also thinking move codes of LIO98INIT to MODE1 and use MODE0 to free memory after I implemented the interrupt handler.

Thank you.

ghaerr commented 2 years ago

Hello @tyama501,

So the conclusion for now is not to use ifdef but use stubbed function? I also thinking move codes of LIO98INIT to MODE 1 and use MODE 0 to free memory after I implemented the interrupt handler.

Yes on both counts. Thank you!

I am also curious as to how fast the graphics drawing is using BASIC. Is the square from your example drawn very quickly?

tyama501 commented 2 years ago

No, it is slow.

There is GET/ PUT function. If we capture region to memory with GET, PUT may be faster.

ghaerr commented 2 years ago

Hello @tyama501,

It would be interesting to know whether most of the speed problem is within the BASIC interpreter, or the graphics draw routine(s). I picked this particular interpreter because it tokenized input and was able to run using a small amount of fixed memory. The interpreter should be fairly fast, but there may be ways to greatly increase the speed - for instance by allowing computation in integers in some cases instead of floats. The PLOT function has to convert from float to integer for every operating, and all other calculations are done in (software) float.

If we capture region to memory with GET, PUT may be faster.

Is this a PC-98 graphics issue? I wasn't able to read any of the documentation on PC-98 LIO.

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

I will take video with the real PC so you can judge whether the drawing is slow or not but somehow I'm still have trouble setting the interrupt table for iret asm code.

I wil try set it from C code later.

tyama501 commented 2 years ago

Hello @ghaerr ,

I still suffering with STACK OVER MINSTACK when I used real LIO Rom and I'm not sure it is related to the interrupt related issue.

I traced kernel stack but it seems ok so this STACK is different with ks right?

I increased to -maout-stack=16384 but it seems no difference.

Is there anyway to debug?

STACK_OVER_MINSTACK

STACK_OVER_MINSTACK_2

Somehow I couldn't link .global asm fucntion so I wrote C code for temporary.

void int1c_handler(void) { printf("int1C\n"); asm volatile ("iret"); }

// Set interrupt handler for 0x1C
//int1c_set();
int1c_h_p = int1c_handler;
intvec = (unsigned long __far *) 0x00000314;
*intvec = (unsigned long) ((unsigned long __far *) int1c_h_p);

Thank you.

tyama501 commented 2 years ago

I mean intC5 not int1C... image

ghaerr commented 2 years ago

Hello @tyama501,

I traced kernel stack but it seems ok so this STACK is different with ks right?

Yes, we are talking about the application stack here, not the kernel stack. The kernel stack is switched to during a hardware or software interrupt, otherwise the application runs with it's own stack.

A document describing the application heap and stack is at https://github.com/jbruchon/elks/blob/master/Documentation/text/binformat.txt (first diagram):


 +------------+--------------------------------------------------+
 | data + bss |  heap ---> |                          <--- stack |
 +------------+--------------------------------------------------+
 ^            ^            ^                                     ^
 0x0  current->t_enddata  current->t_endbrk                  current->t_endstack

I increased to -maout-stack=16384 but it seems no difference.

The message "STACK OVER MINSTACK by XXX BYTES" means that the stack pointer is below the minimum reserved stack amount (set in -maout-stack= or otherwise 4096) by XXX bytes. Normally this means too much stack has been used, and the stack value needs to be increased. I think in your case this means that SP is somehow being set to an incorrect value by either INT C5h or our LIO98INI routine. It could mean that the BIOS is using excessive stack without telling anyone. This is already happening on PC-98 BIOS disk I/O.

Your STRACE shows that the heap allocation of 6144 bytes by the sbrk system call, is being granted. What is the value of your -maout-heap= option? This needs to be set to at least 6144, 10240 would be much better. You can use chmem /bin/basic to display both the stack and heap default values. After the sbrk returning success with 0, the STACK OVER MINSTACK is displayed, which is checked by the kernel routine stack_check, likely at the next timer interrupt. Thus, it seems that the malloc succeeded, but then something afterwards is setting SP to a bad value. It is also possible that the INT C5H routine is using up lots of stack, which then, during the next clock tick, is causing the kernel message. Perhaps set the process stack value much larger, to 20-32K and see if this still happens. Is there documentation on INT C5H anywhere?

The binformat.txt documentation link above shows that the heap grows up from the data segment (DS) end value, and the stack is set at a fixed value at start, and grows down. Thus, there is a fixed amount of heap+stack available. When the stack pointer goes into the application's BSS or DATA area, STACK OVERFLOW is printed. Wen the stack is below (meaning greater, more stack used) than the minimum stack reserved, STACK OVER MINSTACK is printed. When an sbrk is attempted that goes into the minimum stack reserved area, "sys_brk fail: brk over by XXX bytes" is printed.

You might comment out the INT C5H call and see whether this still persists. The remainder messages in the first screenshot show that the application stack is completely corrupted, as the application ends up printing its startup banner again; it has effectively crashed.

Thank you!

ghaerr commented 2 years ago

Another quick thought is to ensure that DS gets properly saved and restored across your __asm__ volatile ("mov %0,%%ds;"...) call. Disassembly of host-pc98.o using ia16-elf-objdump -D -r -Mi8086 will show that.

tyama501 commented 2 years ago

Hello @ghaerr ,

Thank you for the advice.

What is the value of your -maout-heap= option?

-maout-heap=6144 I needed to allocate about 5.2KB work area.

Is there documentation on INT C5H anywhere?

I have read this on a technical book before but I could only find the article below.

http://argus.sblo.jp/article/868180.html

It says for STOP key.

You might comment out the INT C5H call and see whether this still persists.

This STACK issue was happening before I add INTC5H. I thought this can be solved by adding it but it didn't.

Another quick thought is to ensure that DS gets properly saved and restored

I will try this later.

Thank you!

ghaerr commented 2 years ago

-maout-heap=6144

Try increasing to 10240, as malloc has some overhead and is requesting 6144 as seen in your STRACE log. It worked, but that leaves zero margin between heap and stack.

I have read this on a technical book before but I could only find the article below.

That article shows all registers being saved before INT call, perhaps you may have to do this also. C requires that DS, ES, BP, SI and DI be saved, and may at times consider other registers to be valid as well. This could possibly be fixed using the third parameter to the _asm volatile call to specify all registers changed, not sure. That article also shows SS and SP being saved and popped, (which is very strange as it would perform a stack switch!!), but perhaps that is required.

tyama501 commented 2 years ago

Thank you @ghaerr ,

Only saving DS, ES, BP, SI and DI worked! https://github.com/tyama501/elks/commit/592e5fdac35429280638cd94064983bfa357219b

I started to merge your changes and removed ifdefs. I could build this. https://github.com/tyama501/elks/commit/8d521a37463aeca7eff4f0dd6477f8a9fd177e08

How the Makefile should be? Now like this.

I am using both host-stubs and host-pc98.

ifeq ($(CONFIG_ARCH_8018X), y) OBJS += host-8018x.o else OBJS += host-stubs.o endif

ifeq ($(CONFIG_ARCH_PC98), y) OBJS += host-pc98.o endif ...

basic: $(OBJS) basic.h host.h $(LD) $(LDFLAGS) -maout-heap=10240 -maout-stack=6144 -o basic $(OBJS) $(LDLIBS)

Thank you!

ghaerr commented 2 years ago

Hello @tyama501,

Only saving DS, ES, BP, SI and DI worked!

Great news. So it was a register overwrite problem then, rather than actually stack related.

I am using both host-stubs and host-pc98.

I think probably best to copy host-stubs.c functions into your host-pc98.c, and only use host-pc98.c for your build. Then also copy stub functions from host-pc98.c (host_mode and host_plot) back into host-stubs.c for the normal build, which will call empty stub functions for MODE, PLOT and any other platform-specific commands.

@cocus has recommended that the non-implemented stub functions display "MODE not implemented" etc in host-xxx.c files. This can be done later on the IBM PC and 8081X builds though. @cocus will also have to copy stub functions into his host-8018x.c file to get his build working again. This method involves some copying, but should allow each platform maintainer to have a single platform-specific file for their implementation. I can fix up the host-stubs.c file for IBM PC after your PR.

How the Makefile should be?

Probably something like this:

ifeq ($(CONFIG_ARCH_IBMPC), y)
OBJS += host-stubs.o
endif

ifeq ($(CONFIG_ARCH_8018X), y)
OBJS += host-8018x.o
endif

ifeq ($(CONFIG_ARCH_PC98), y)
OBJS += host-pc98.o
HEAP=-maout-heap=10240
endif
...

basic: $(OBJS) basic.h host.h
$(LD) $(LDFLAGS) $(HEAP) -o basic $(OBJS) $(LDLIBS)

I recommend removing your -maout-stack=6144 directive as it should not be required to allocate special stack space for your build; the heap value of 10240 will cover your malloc.

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr ,

OK! Good. I will do that.

Maybe I will add graphic clear function to host_cls in host-pc98.c.

Thank you.

cocus commented 2 years ago

@cocus will also have to copy stub functions into his host-8018x.c file to get his build working again.

Yes, no worries. For the 8018x port we'll discuss the functions available there and not related to my particular board, but rather the generic 80c18x's.

Only saving DS, ES, BP, SI and DI worked! tyama501@592e5fd

Maybe this is a question for @ghaerr, but I think this could crash the machine because the pointer of the interrupts point to the basic's app cs:ds could become invalid if the app exits non gracefully, doesn't the OS have a better mechanism to do this from the userspace?

I'm not sure about the PC98 specifics, but maybe the STOP key initialization and handling could be done on the kernel and process it as a signal, so that apps could consume it. Except if it's tied to the graphics in which case it might not make sense to do it.

ghaerr commented 2 years ago

Hello @cocus & @tyama501,

I think this could crash the machine because the pointer of the interrupts point to the basic's app cs:ds could become invalid if the app exits non gracefully

Definitely - if the application exits due to an unhandled signal or other reason, and the (not yet coded) atexit routines which would reset the interrupt vectors not get executed, the interrupt handlers could be overwritten by the next executed application and the system crash as a result.

maybe the STOP key initialization and handling could be done on the kernel and process it as a signal, so that apps could consume it.

I had been thinking that the kernel is the proper place to handle any interrupt callbacks. Adding your idea of handling the STOP key as a signal or keystroke is even better. This could be done as an add-on to the PC-98 console initialization code.

Except if it's tied to the graphics in which case it might not make sense to do it.

This could be handled with a separate PC-98 config option, if needed.

@tyama501, what exactly is the function of the STOP key, and its associated handler?

Also, on another issue, I'm thinking that in your LIO98INIT routine, rather than using malloc to dynamically allocate memory, why not just use a statically-allocated buffer? The entire rest of BASIC is designed such that no dynamic allocations are used, and this would further simplify the routine, especially since it's never freed anyways.

What is the function of the 5200 byte buffer, is there any documentation on it?

Thank you!

tyama501 commented 2 years ago

Hello @ghaerr , @cocus ,

Thank you for your suggestions!

I modified to use static-allocated buffer and rename functions to mode and plot. I am planning to add draw and color. https://github.com/tyama501/elks/compare/8d521a37...a8552de

LIO was used in the original N88-Basic but I'm not sure how the working memory are used except the parameters are set at the lower address. I also removed int1C handler for now since I don't know what should be handled. It seems int1C is called when graphic routine took to much time. (The original N88-Basic might be checking STOP key to stop running program in the handler but I'm not sure. ) So far it seems working without it.

BTW is there a way to stop running program without killing basic?

ghaerr commented 2 years ago

Hello @tyama501,

Thank you for your suggestions!

Thank you for your changes, they look quite nice. Things seem much simpler and ready for an initial commit when you are ready.

https://github.com/tyama501/elks/compare/8d521a37...a8552de

Super. I am going to recommend a small change to not using uint16_t for host_mode, just int for better portability. It is better to keep with straight int and long when possible (outside the interpreter) as we also have the capability to run BASIC on Linux or macOS host for easier testing (unfortunately not for graphics though!). I will indicate these changes directly on your branch.

I am planning to add draw and color.

Super. Would you like to commit this as a PR first, and then add enhancements afterwards? I am OK with commit with a couple small changes mentioned above.

So far it seems working without it.

We can worry about INT 1Ch and full understanding of need for 5200 byte buffer at a later time, if needed.

BTW is there a way to stop running program without killing basic?

This is on my list, and much needed. I could start work on it after your commit so as to not interfere.

Since INT 1Ch is now not pointed at application memory, and int Axh vectors pointing only to ROM, are you concerned about exiting BASIC after graphics mode? Graphics mode stays on at all times on PC-98, correct? Or does it switch to another mode when executing "MODE 1" in BASIC?

Thank you!

ghaerr commented 2 years ago

For some reason, I could not comment on your branch directly. Here are the small changes I'd like to see:

In host.h:

void host_mode(int mode);

In basic.c:

int parse_MODE() {
    getNextToken();
    int val = expectNumber();
    if (val) return val;    // error
    if (executeMode) {
        int mode = (int)stackPopNum();
        host_mode(mode);
    }
    return 0;
}
ghaerr commented 2 years ago

Also @tyama501, you will have to add do-nothing stub functions host_mode and host_plot to host-stubs.c in order to not break the standard CONFIG_ARCH_IBMPC build, in your PR.