stardot / b-em

An opensource BBC Micro emulator for Win32 and Linux
http://stardot.org.uk/forums/viewtopic.php?f=4&t=10823
GNU General Public License v2.0
112 stars 57 forks source link

80x86 IDIV instruction bugs with 8-bit operand #161

Closed hoglet67 closed 2 years ago

hoglet67 commented 2 years ago

The test case I'm using is +/- 14 IDIV8 +/- 3

   10 INPUT A$
   20 MODE 0
   30 PRINT "IDIV8 TEST: ";A$'
   40 DIM code% 100
   50 P%=code%
   60 [
   70 .test
   80 IDIV BL
   90 RETF
  100 ]
  110 FOR I%=0 TO 3
  120   IF I%<2 A%=14 ELSE A%=-14
  130   IF I%=0 OR I%=2 B%=3 ELSE B%=-3
  140   Z%=USR(test) AND &FFFF
  150   PRINT
  160   PRINT "Dividend  = "~A%
  170   PRINT "Divisor   = "~B%
  180   PRINT "Quotient  = "~(Z% AND &FF)
  190   PRINT "Remainder = "~((Z% DIV &100) AND &FF)
  200 NEXT

Screenshot from 2021-10-31 10-25-52

Here's the expected result: matchbox_idiv8

Here's the current result from B-EM: Screenshot from 2021-10-31 10-22-59

I'm not sure much software uses the IDIV instruction.

See also the PiTubeDirect issue, where I've just fixed IDIV8/16 both being broken. https://github.com/hoglet67/PiTubeDirect/issues/127

Dave

SteveFosdick commented 2 years ago

Dave, do you agree https://github.com/stardot/b-em/tree/sf/issue161 fixes this. It passes your test case.

hoglet67 commented 2 years ago

A couple of thoughts...

  1. I think the dividend should be AX (16 bits) not AL (8 bits). It's only passing in my test case because the numbers are small (which was rather silly of me....) The comment /IDIV AL,b/ is incorrect, the instruction is actually /IDIV AX,b/

  2. The code checks for the division by zero exception, but not the overflow exception, where the quotient lies outside -128 to +127. The other form of IDIV (that does 32 bit DIV/MOD 16 bit) also lacks this overflow check.

  3. You are relying on the C compiler truncating towards 0 when doing signed division. This is the behaviour with C99, but is not guaranteed with C89. I'm not sure if this matters to you. As long as all platforms are using C99 then it should be safe.

SteveFosdick commented 2 years ago

Ok, leaving the overflow question for a moment, I have modified the test program to also use some bigger numbers:

AUTO
INPUT A$
DIM code% 100
P%=code%
[
.test
IDIV BL
RETF
]
MODE 0
PRINTTAB(2,1);"IDIV8 TEST: ";A$'
PROCtest(2,3,14,3)
PROCtest(2,8,14,-3)
PROCtest(2,13,-14,3)
PROCtest(2,18,-14,-3)
PROCtest(30,3,1400,127)
PROCtest(30,8,1400,-127)
PROCtest(30,13,-1400,127)
PROCtest(30,18,-1400,-127)
PRINTTAB(0,23);
END
DEFPROCtest(X%,Y%,A%,B%)
Z%=USR(test) AND &FFFF
PRINTTAB(X%,Y%);"Dividend  = "~A%
PRINTTAB(X%,Y%+1);"Divisor   = "~B%
PRINTTAB(X%,Y%+2);"Quotient  = "~(Z% AND &FF)
PRINTTAB(X%,Y%+3);"Remainder = "~((Z% DIV &100) AND &FF)
ENDPROC

As I don't have the option of running it on MatchBox at the moment I have tried it on DOSBox instead. Interestingly, that does not seem to have as many lines in it's emulated MODE 0 as the real hardware so there isn't room on the screen for the assembly listing too, hence the change of order of the lines at the start of the test program, but here are the results: dosbox Now running on the BBC Master 512 having not changed any code from my previous attempt: b-em So that does at least show the current fix failing with a 16bit dividend. From what you say about the different C standards, the other approach would be to do the division as unsigned and then adjust the sign afterward - it may not even be any more code.

SteveFosdick commented 2 years ago

Ok, working with unsigned numbers turned out to be a bit trickier than I expected to I have substantially copied PiTube direct here. The relevant commit is: https://github.com/stardot/b-em/commit/8dc64c41d2b73eb5dcd405e5116a3e53b898ebbd

b-em_pitube

hoglet67 commented 2 years ago

This looks good to me.

I'll try the extended test on the Matchbox Co Pro later. We should also post it on the forum and see if we can get it run on a read 80186 Co Pro.

One final thought: there is a secod version if IDIV that takes a 32-bit dividend (in DX/AX) , a 16 bit divisor and returns the 16-bit quotion in AX and the 16-bit remainder in DX. This is still using signed arithmetic, but I'm not sure this is actually a problem.

FYI, here's the DOSBOS source for IDIV8 which uses signed division in C.

SteveFosdick commented 2 years ago

Does the 16bit version also produce a divide exception when the quotient won't fit into 16 bits?

This will be a little harder to test with a test case written in BASIC as the default vector for INT0 points into DOS+ which terminates the BASIC interpreter and prints its own message.

We could probably extend the bit written in assembler to handle that bit, i.e. point the vector back to the test code if we know how much to wind the stack back before returning to BASIC.

SteveFosdick commented 2 years ago

Ok, here is a test program for dealing with the exceptions. My 8086 assembler is very rusty but I got there in the end:

   10 MODE0
   20 INPUT"Machine under test";A$
   30 DIM code% 100
   40 P%=code%
   50 [
   60 .except
   70 ADD SP,6
   80 MOV AX,&FFFF
   90 RETF
  100 .test
  110 PUSH ES
  120 PUSH AX
  130 MOV AX,0
  140 MOV ES,AX
  150 MOV WORD PTR ES:[0],except
  160 MOV WORD PTR ES:[2],CS
  170 POP AX
  180 POP ES
  190 IDIV BL
  200 RETF
  210 ]
  220 PRINT"Press any key for safe tests...";
  230 G%=GET
  240 CLS
  250 PRINTTAB(2,1);"IDIV8 TEST: ";A$
  260 PROCtest(2,3,14,3)
  270 PROCtest(2,8,14,-3)
  280 PROCtest(2,13,-14,3)
  290 PROCtest(2,18,-14,-3)
  300 PROCtest(30,3,1400,127)
  310 PROCtest(30,8,1400,-127)
  320 PROCtest(30,13,-1400,127)
  330 PROCtest(30,18,-1400,-127)
  340 PRINTTAB(0,23);"Press any key for exceptions...";
  350 G%=GET
  360 CLS
  370 PRINTTAB(2,1);"IDIV8 TEST: ";A$
  380 PROCtest(2,3,3,0)
  390 PROCtest(2,8,3000,3)
  400 PRINTTAB(0,23);
  410 END
  420 DEFPROCtest(X%,Y%,A%,B%)
  430 Z%=USR(test) AND &FFFF
  440 PRINTTAB(X%,Y%);"Dividend  = "~A%
  450 PRINTTAB(X%,Y%+1);"Divisor   = "~B%
  460 IFZ%=&FFFFTHENPROCexcept ELSEPROCresult
  470 ENDPROC
  480 DEFPROCexcept
  490 PRINTTAB(X%,Y%+2);"Quotient  = EXCEPTION"
  500 PRINTTAB(X%,Y%+3);"Remainder = EXCEPTION"
  510 ENDPROC
  520 DEFPROCresult
  530 PRINTTAB(X%,Y%+2);"Quotient  = "~(Z% AND &FF)
  540 PRINTTAB(X%,Y%+3);"Remainder = "~((Z% DIV &100) AND &FF)
  550 ENDPROC
SteveFosdick commented 2 years ago

And here is a version of the test program that does the 16-bit IDIV instruction. This shows the version in B-Em failing to raise the exception.

   10 MODE0
   20 DIM code% 100
   30 DIMR% 2
   40 P%=code%
   50 [
   60 .except
   70 ADD SP,6
   80 MOV AX,&FFFF
   90 RETF
  100 .setup
  110 PUSH ES
  120 PUSH AX
  130 MOV AX,0
  140 MOV ES,AX
  150 MOV WORD PTR ES:[0],except
  160 MOV WORD PTR ES:[2],CS
  170 POP AX
  180 POP ES
  190 RET
  200 .test8
  210 CALL setup
  220 IDIV BL
  230 RETF
  240 .test16
  250 CALL setup
  260 IDIV BX
  270 MOV WORD PTR [R%],DX
  280 RETF
  290 ]
  300 INPUT'"Machine under test";A$
  310 CLS
  320 PRINTTAB(2,1);"IDIV8 TEST: ";A$
  330 PROCtest(2,3,14,3)
  340 PROCtest(2,8,14,-3)
  350 PROCtest(2,13,-14,3)
  360 PROCtest(2,18,-14,-3)
  370 PROCtest(30,3,1400,127)
  380 PROCtest(30,8,1400,-127)
  390 PROCtest(30,13,-1400,127)
  400 PROCtest(30,18,-1400,-127)
  410 PROCready("8-bit exceptions")
  420 PRINTTAB(2,1);"IDIV8 TEST: ";A$
  430 PROCtest(2,3,3,0)
  440 PROCtest(2,8,3000,3)
  450 PROCready("16-bit safe tests")
  460 PRINTTAB(2,1);"IDIV16 TEST: ";A$
  470 PROCtest16(2,3,1400,300)
  480 PROCtest16(2,8,1400,-300)
  490 PROCtest16(2,13,-1400,300)
  500 PROCtest16(2,18,-1400,-300)
  510 PROCtest16(30,3,467376,300)
  520 PROCtest16(30,8,467376,-300)
  530 PROCtest16(30,13,-467376,300)
  540 PROCtest16(30,18,-467376,-300)
  550 PROCready("16-bit exceptions")
  560 PRINTTAB(2,1);"IDIV16 TEST: ";A$
  570 PROCtest16(2,3,467376,0)
  580 PROCtest16(2,8,467376,3)
  590 PRINTTAB(0,23);
  600 END
  610 DEFPROCready(P$)
  620 PRINTTAB(0,23);"Press any key for ";P$;"...";
  630 G%=GET
  640 CLS
  650 ENDPROC
  660 DEFPROCtest(X%,Y%,A%,B%)
  670 Z%=USR(test8) AND &FFFF
  680 PROCresult(Z%,Z% AND &FF,(Z% DIV 256) AND &FF)
  690 ENDPROC
  700 DEFPROCtest16(X%,Y%,A%,B%)
  710 D%=A%DIV65536
  720 Z%=USR(test16) AND &FFFF
  730 PROCresult(Z%,Z%,!R%)
  740 ENDPROC
  750 DEFPROCresult(Z%,Q%,R%)
  760 PRINTTAB(X%,Y%);"Dividend  = "~A%
  770 PRINTTAB(X%,Y%+1);"Divisor   = "~B%
  780 IFZ%=&FFFFTHENPROCexcept ENDPROC
  790 PRINTTAB(X%,Y%+2);"Quotient  = "~Q%
  800 PRINTTAB(X%,Y%+3);"Remainder = "~R%
  810 ENDPROC
  820 DEFPROCexcept
  830 PRINTTAB(X%,Y%+2);"Quotient  = EXCEPTION"
  840 PRINTTAB(X%,Y%+3);"Remainder = EXCEPTION"
  850 ENDPROC
SteveFosdick commented 2 years ago

Some results from B-Em after the exception was added: 16bitsafe 16bitex And from DOSBox: dosbox_safe dosbox_excep

SteveFosdick commented 2 years ago

So the exception test is being fooled by the negative values which are sign-extended as 32 bit. Fixing that gives: b_em-fixed

hoglet67 commented 2 years ago

Great work on the test program Steve.

I've just re-run this on PiTubeDirect and on Matchbox and both give the same results as DOSBox above.

I just spotted this interesting corner case on DOSBOX-x that was fixed 3 days ago: https://github.com/joncampbell123/dosbox-x/commit/9be458ba0689d8c60c962808ead4cb11b99c6740

No idea what the 80186 behaviour would be in this case.

But it made me realize the overflow tests in PiTubeDirect are likely incorrect:

The 8-bit test is currently:

  // Check for overflow in the 8-bit quotient
  if (d1 & 0xFF00)
  {
    intcall86(0);
    return;
  }

The legal range of the quotient is -127 ... +127 but at this point the code is working with unsigned numbers, so really the test should be 0..127:

  // Check for overflow in the 8-bit quotient
  if (d1 & 0xFF80)
  {
    intcall86(0);
    return;
  }

Or maybe more readably:

  // Check for overflow in the 8-bit quotient
  if (d1 > 127)
  {
    intcall86(0);
    return;
  }

The same issue exists in IDIV, with the range +- 32767

Do you agree?

This is the bug that just keeps on giving!!!!

Dave

hoglet67 commented 2 years ago

The 80186 extends the range of negative numbers allowed as a quotient to include 8000H and 80H for the IDIV instruction

image

So this suggests the legal quotient range is -128 to +127

SteveFosdick commented 2 years ago

Following the diversion of dividing 0xFD28000 by 0x1FA5 to get 0x8000 I thought I'd see what a modern x86 processor does in this case. So I wrote this little assembler snippet to be assembled by gas:

    .text
    .globl  idiv_test
idiv_test:
    mov %rdi,%rax
    mov %rdi,%rdx
    shr $16,%rdx
    mov %rsi,%rbx
    idiv %bx
    shl $16,%rdx
    or %rdx,%rax
    ret

where confusingly the gas syntax has the two operands the opposite way around from Intel syntax. Then this C program to call it:

#include <stdio.h>
#include <stdint.h>

extern uint32_t idiv_test(int32_t dividend, int16_t divisor);

static void run_test(int32_t dividend, int16_t divisor)
{
    uint32_t res = idiv_test(dividend, divisor);
    unsigned quotient = res & 0xffff;
    unsigned remainder = (res >> 16) & 0xffff;
    printf("quotient=%d (%08X)\nRemainder=%d (%08X)\n", quotient, quotient, remainder, remainder);
}

int main(int argc, char **argv)
{
    run_test(0xFD28000-1, 0x1FA5);
    run_test(0xFD28000, 0x1FA5);
    return 0;
}

Linking these into a final program and running it gives:

$ ./idiv16 
quotient=32767 (00007FFF)
Remainder=8182 (00001FF6)
Floating point exception (core dumped)

so it seems while 0x8000 is a valid result on the more modern x86 family processors, it throws an exception if the sign of the value is changed, i.e. if the result really should have been +32768.

SteveFosdick commented 2 years ago

I have found a bug with the assembler routine - I was forgetting to mask out the upper bits of RAX before merging in DX. Here's the revised version:

    .text
    .globl  idiv_test
    .file "idiv16_asm.s"
idiv_test:
    mov %rdi,%rax
    mov %rdi,%rdx
    shr $16,%rdx
    mov %rsi,%rbx
    idiv %bx
    shl $16,%rdx
    and $65535,%rax
    or %rdx,%rax
    ret

I also had the idea of transcribing the test program to C to see what happens on modern hardware for the other test cases. This turned out to be fun and games as recovering from the SIGFPE does not appear to be possible. Using setjmp/longjmp to go from the signal handler back into a function to run the next test leaves the signal disposition reset to default so the next SIGFPE terminates the program. Calling sigaction after the longjmp re-invokes the signal hander. Returning from the signal handler re-runs the instruction causing the exception! So I had to use fork to run the tests:

#include <errno.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

extern uint32_t idiv_test(int32_t dividend, int16_t divisor);

static void run_test(int32_t dividend, int16_t divisor)
{
    static const char fmt[] = "%-10s %08X %10d\n";
    pid_t pid = fork();
    if (pid == 0) {
        printf(fmt, "Dividend", dividend, dividend);
        printf(fmt, "Divisor", divisor, divisor);
        uint32_t res = idiv_test(dividend, divisor);
        int32_t quotient = res & 0xffff;
        if (quotient & 0x8000)
            quotient |= 0xffff0000;
        int32_t remainder = (res >> 16) & 0xffff;
        if (remainder & 0x8000)
            remainder |= 0xffff0000;
        printf(fmt, "Quotient", quotient, quotient);
        printf(fmt, "Remainder", remainder, remainder);
        exit(0);
    }
    else if (pid == -1)
        fprintf(stderr, "idiv16: unable to fork: %m");
    else {
        int wstat;
        while (waitpid(pid, &wstat, 0) != pid && errno == EINTR)
            ;
        if (WIFSIGNALED(wstat)) {
            if (WTERMSIG(wstat) == SIGFPE)
                fputs("** EXCEPTION **\n", stdout);
            else
                fprintf(stderr, "idiv16: unexpected signal %s\n", strsignal(WTERMSIG(wstat)));
        }
        putchar('\n');
    }
}

int main(int argc, char **argv)
{
    /* expected to give results */
    run_test(1400,300);
    run_test(1400,-300);
    run_test(-1400,300);
    run_test(-1400,-300);
    run_test(467376,300);
    run_test(467376,-300);
    run_test(-467376,300);
    run_test(-467376,-300);
    run_test(0xFD28000-1, 0x1FA5);
    /* expected to cause an exception */
    run_test(467376,0);
    run_test(467376,3);
    run_test(0xFD28000, 0x1FA5);
    return 0;
}

and the results:

Dividend   00000578       1400
Divisor    0000012C        300
Quotient   00000004          4
Remainder  000000C8        200

Dividend   00000578       1400
Divisor    FFFFFED4       -300
Quotient   0000FFFC      65532
Remainder  000000C8        200

Dividend   FFFFFA88      -1400
Divisor    0000012C        300
Quotient   0000FFFC      65532
Remainder  0000FF38      65336

Dividend   FFFFFA88      -1400
Divisor    FFFFFED4       -300
Quotient   00000004          4
Remainder  0000FF38      65336

Dividend   000721B0     467376
Divisor    0000012C        300
Quotient   00000615       1557
Remainder  00000114        276

Dividend   000721B0     467376
Divisor    FFFFFED4       -300
Quotient   0000F9EB      63979
Remainder  00000114        276

Dividend   FFF8DE50    -467376
Divisor    0000012C        300
Quotient   0000F9EB      63979
Remainder  0000FEEC      65260

Dividend   FFF8DE50    -467376
Divisor    FFFFFED4       -300
Quotient   00000615       1557
Remainder  0000FEEC      65260

Dividend   0FD27FFF  265453567
Divisor    00001FA5       8101
Quotient   00007FFF      32767
Remainder  00001FA4       8100

Dividend   000721B0     467376
Divisor    00000000          0
** EXCEPTION **

Dividend   000721B0     467376
Divisor    00000003          3
** EXCEPTION **

Dividend   0FD28000  265453568
Divisor    00001FA5       8101
** EXCEPTION **

Some of these look familiar from what B-Em was doing previously. That makes sense given that there is a comment in B-Em's x86.c that it originally emulated an 80286.

hoglet67 commented 2 years ago

Nice work Steve.

Looking at the currently checked in version of x86.c in sf/issue161, I think the 8-bit IDIV overflow test is still wrong.

The test is:

if (d1 & 0xFF00) {

but it's done at the point where d1 is always positive, so it's effectively checking for a quotient range of -255 -> +255.

I think for the 80186 the legal range is -128 to +127

To achieve this, you need to move the test down in the code, after the sign of d1 has been corrected.

Dave

SteveFosdick commented 2 years ago

David, that doesn't seem to work. I added two new test cases to the test program:

  410 PROCtest(56,3,16447,-128)
  420 PROCtest(56,8,16256,127)

If I have got this right, the first should be valid because the quotient is -128 whereas the 2nd should throw an exception because the correct quotient is +128 and that is greater than the range of a signed 8bit, but it does not. Here is the result (top two in the right hand column): ss4

The committed code is :

                                        if (d1 & 0xFF00) {
                                            log_debug("x86: IDIV8 overflow at %04X:%04X, dividend(AX)=%04X, divisor=%02X", cs>>4, pc, AX, temp);
                                            x86_intcall(0);
                                        }
                                        else {
                                            // Correct the sign of the quotient.
                                            if (sign1 ^ sign2)
                                                d1 = (~d1 + 1) & 0xff;
                                            // Correct the sign of the remainder.
                                            if (sign1)
                                                d2 = (~d2 + 1) & 0xff;
                                            AH = d2;
                                            AL = d1;
                                        }

I changed this to:

                                        // Correct the sign of the quotient.
                                        if (sign1 ^ sign2)
                                            d1 = (~d1 + 1) & 0xff;
                                        // Correct the sign of the remainder.
                                        if (sign1)
                                            d2 = (~d2 + 1) & 0xff;
                                        AH = d2;
                                        AL = d1;
                                        if (d1 & 0xFF00) {
                                            log_debug("x86: IDIV8 overflow at %04X:%04X, dividend(AX)=%04X, divisor=%02X", cs>>4, pc, AX, temp);
                                            x86_intcall(0);
                                        }

In the case that the sign has been flipped, we won't see anything in the higher bits because the sign correction logic has masked them off but the now failing test case is using two positive numbers. I can't extend the bitmask to pick up bit 7 (0xff80) because that would reject all negative quotients so it has to involve the sign of the dividend and divisor.

I believe a positive quotient should result from a dividend and divisor whose sign is the same so this is the opposite to the case where we are correcting the sign. That means the code could become:

                                        // Correct the sign of the quotient.
                                        if (sign1 ^ sign2)
                                            d1 = (~d1 + 1) & 0xff;
                                        else if (d1 & 0xff80) {
                                            log_debug("x86: IDIV8 overflow at %04X:%04X, dividend(AX)=%04X, divisor=%02X", cs>>4, pc, AX, temp);
                                            x86_intcall(0);
                                        }
                                        // Correct the sign of the remainder.
                                        if (sign1)
                                            d2 = (~d2 + 1) & 0xff;
                                        AH = d2;
                                        AL = d1;

But that shows very clearly that we are not checking for the exception in the case that the sign of the dividend and divisor is opposite giving a negative quotient. Should that case test with a different mask, perhaps?

SteveFosdick commented 2 years ago

That corrected code does, however, give the expected exception for the case where the quotient is +128: ss5

hoglet67 commented 2 years ago

Yeh, sorry my brain wasn't working correctly when I made that suggestion....

The pattern you use in IDIV16 should work:

So something like:

if ((!(d1 & 0x80) && (d1 & 0xff00)) || ((d1 & 0x80) && (d1 & 0xff00) != 0xff00)) {

Which I think is equvalent to:

if (d1 > 0x7F && d1 < &ff80) {
hoglet67 commented 2 years ago

FYI, this is what I've ended up with in PiTubeDirect to catch the overflow cases. https://github.com/hoglet67/PiTubeDirect/commit/6b14320cd475c561e887cf4fc6dd2c4ee1b5d5bc

SteveFosdick commented 2 years ago

Dvaid, thanks. I have included that. I still don't think that resolves the issue of what to do when dividing a negative dividend by a negative divisor gives a quotient of +32768. More modern processors throw an exception but we could still do with learning if an 80186 would. I will post the latest test program and see if someone has a real co-pro to run it on.

hoglet67 commented 2 years ago

I think the case you are talking about is &C000000 / &FFFF8000

The link I posted earlier suggests a quotient of +32768 should throw an exception on the 80186: https://github.com/stardot/b-em/issues/161#issuecomment-960668332

I just tried the Matchbox and it does throw an exception in that case.

I don't think I'm aware of any case from any 80x86 where a wrong answer is returned fom IDIV instead of throwing an overflow exception. Are you?

It would definitely be worth posting the program on stardot and getting someone like BeebMaster to run it on a real 80186.

SteveFosdick commented 2 years ago

Thanks. Daniel has run a test on real hardware which behaves as you expected. I have pushed a version that now matches that and copies the detection from PiTubeDirect.

I am puzzled as to why my original test case doesn't throw the exception, though, but that is mainly for interest as real hardware also doesn't.