dmsc / emu2

Simple x86 and DOS emulator for the Linux terminal.
GNU General Public License v2.0

Support for Truename (Int 21/AH=60h) #50

Open tsupplis opened 2 years ago

tsupplis commented 2 years ago

Would you be open to adding this function? I am happy to provide a trial implementation and tests.

AH = 60h
DS:SI -> ASCIZ filename or path
ES:DI -> 128-byte buffer for canonicalized name

Return:
CF set on error
AX = error code
02h invalid component in directory path or drive letter only
03h malformed path or invalid drive letter
ES:DI buffer unchanged
CF clear if successful
AH = 00h or 3Ah (DOS 6.1/6.2 for character device)
AL = destroyed (00h or 2Fh or 5Ch or last character of current directory on drive)
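
To make the contract above concrete, here is a minimal host-side sketch in portable C of the kind of canonicalization the call performs. It is not emu2's implementation. The function name `dos_truename`, its parameters, and the single-current-directory model are assumptions made for illustration. It models only the success case and error code 03h (for a result that does not fit in 128 bytes or a `..` that climbs above the root). It ignores 8.3 truncation, wildcards, character devices, network paths, and JOIN/SUBST.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of emu2): canonicalize a DOS path.
 * def_drive: default drive letter; cwd: current directory on that drive,
 * without the drive letter (e.g. "\\SRC"). Other drives are assumed to be
 * at their root. Returns 0 on success, 3 on error; `out` is left unchanged
 * on error, mirroring "ES:DI buffer unchanged". */
int dos_truename(const char *path, char def_drive, const char *cwd,
                 char *out, size_t out_size)
{
    char work[256];   /* cwd + path, before normalization */
    char res[128];    /* DOS result buffer is 128 bytes */
    size_t len;

    assert(path != NULL && cwd != NULL && out != NULL);

    char def = (char)toupper((unsigned char)def_drive);
    char drive = def;
    if (isalpha((unsigned char)path[0]) && path[1] == ':') {
        drive = (char)toupper((unsigned char)path[0]);
        path += 2;
    }

    /* Relative paths on the default drive are resolved against cwd. */
    int absolute = (path[0] == '\\' || path[0] == '/');
    int use_cwd = !absolute && drive == def;
    int n = snprintf(work, sizeof work, "%s\\%s", use_cwd ? cwd : "", path);
    if (n < 0 || (size_t)n >= sizeof work)
        return 3;

    res[0] = drive; res[1] = ':'; res[2] = '\0';
    len = 2;

    for (char *p = work; *p; ) {
        while (*p == '\\' || *p == '/')
            p++;                              /* skip separators */
        char *start = p;
        while (*p && *p != '\\' && *p != '/')
            p++;
        size_t clen = (size_t)(p - start);
        if (clen == 0 || (clen == 1 && start[0] == '.'))
            continue;                         /* empty or "." component */
        if (clen == 2 && start[0] == '.' && start[1] == '.') {
            if (len == 2)
                return 3;                     /* ".." above the root */
            while (res[len - 1] != '\\')
                len--;                        /* drop the last component */
            res[--len] = '\0';                /* and its backslash */
            continue;
        }
        if (len + 1 + clen + 1 > sizeof res)
            return 3;                         /* result would not fit */
        res[len++] = '\\';
        for (size_t i = 0; i < clen; i++)
            res[len++] = (char)toupper((unsigned char)start[i]);
        res[len] = '\0';
    }
    if (len == 2) {                           /* bare root: "X:\" */
        res[len++] = '\\';
        res[len] = '\0';
    }
    if (len + 1 > out_size)
        return 3;
    memcpy(out, res, len + 1);
    return 0;
}
```

Under these assumptions, calling `dos_truename("c:test\\foo", 'C', "\\src", buf, sizeof buf)` fills `buf` with `C:\SRC\TEST\FOO`. The result is built in a local buffer and copied only on success so that the caller's buffer stays untouched on failure.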
dmsc commented 2 years ago

Hi!

Sure, I can implement it if you have a program that uses it, so that it can be tested.

Have Fun!

tsupplis commented 2 years ago

Test tool compiled with Borland C++ 2.0.

test.zip

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <dos.h>

int main(int argc, char ** argv) {
    union REGS regs;
    struct SREGS segs;
    char buffer[128+1];
    int rc;

    if(argc<2) {
        fprintf(stderr,"ERR: wrong command line\n");
        exit(-1);
    }
    memset(&regs,0,sizeof(regs));
    memset(&segs,0,sizeof(segs));
    regs.h.ah=0x60;
    regs.x.si=FP_OFF(argv[1]);
    segs.ds=FP_SEG(argv[1]);
    regs.x.di=FP_OFF(buffer);
    segs.es=FP_SEG(buffer);
    fprintf(stderr,"INF: calling int\n");
    intdosx(&regs,&regs,&segs);
    fprintf(stderr,"INF: cf=%04X\n",regs.x.cflag);
    fprintf(stderr,"INF: ax=%04X\n",regs.x.ax);
    if(!regs.x.cflag) {
        fprintf(stderr,"INF: %s\n",buffer);
    } else {
        fprintf(stderr,"INF: failed\n");
    }
    return 0;
}
./emu2 test.com a:\\test\\foo
INF: calling int
INF: cf=0000
INF: ax=0000
INF: a:\test\foo
./emu2 test.com c:\\test\\foo
INF: calling int
INF: cf=0000
INF: ax=0000
INF: c:\test\foo
./emu2 test.com c:\\test\\foolllllllllllllllll\\aaaaaaa\\bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
INF: calling int
INF: cf=0001
INF: ax=6000
INF: failed
EMU2_CWD=src ./emu2 test.com c:test\\foo
INF: calling int
INF: cf=0000
INF: ax=0000
INF: c:\src\test\foo
EMU2_CWD=src ./emu2 test.com test\\foo
INF: calling int
INF: cf=0000
INF: ax=0000
INF: c:\src\test\foo
EMU2_CWD=src ./emu2 test.com foo
INF: calling int
INF: cf=0000
INF: ax=0000
INF: c:\src\foo
EMU2_CWD=src ./emu2 test.com c:\\test\\foo
INF: calling int
INF: cf=0000
INF: ax=0000
INF: c:\test\foo
EMU2_CWD=src ./emu2 test.com d:\\foo
INF: calling int
INF: cf=0000
INF: ax=0000
INF: d:\foo

There are differences with the DOS version:

dmsc commented 2 years ago

Hi!

  • The value of output AH changes (the value returned is 0x5C, as in DOS 3.3)

Tested in DOS 3.3, the return value for AL is sometimes 0x53, but not always: [image]

Also, DOS 3.3 keeps the trailing backslash if given: [image]

This is why I would like to have an actual program that uses this call, so we can verify what result old DOS programs that used it expect.

In the meantime, I added two commits to the pull request, with minor fixes.

Have Fun!

tsupplis commented 2 years ago

Thank you very much, I am working on the samples. My main focus is binary self-identification in compiler runtimes, C in particular (argv[0] translation), following this recipe: https://stackoverflow.com/questions/53570837/full-path-to-self-in-dos-executable Old compilers use either a hardcoded name or a name extracted from the PSP. I will go through the inventory of compilers I use.

dmsc commented 2 years ago

Hi!

> My main focus is binary self-identification in compiler runtimes, C in particular (argv[0] translation), following this recipe: https://stackoverflow.com/questions/53570837/full-path-to-self-in-dos-executable Old compilers use either a hardcoded name or a name extracted from the PSP. I will go through the inventory of compilers I use.

The answer to that question is correct: you should extract the program name from the environment (it comes just after all the environment variables). This is actually fully documented in the DOS reference. From "dosref33", about function 4Bh, code 0:

The loaded program also receives an environment, a series of ASCIZ strings of the form parameter=value (for example, verify = on). The environment must begin on a paragraph boundary, be less than 32K bytes long, and end with a byte of 00H (that is, the final entry consists of an ASCIZ string followed by two bytes of 00H). Following the last byte of zeros is a set of initial arguments passed to a program containing a word count followed by an ASCIZ string. If the call finds the file in the current directory, the ASCIZ string contains the drive and pathname of the executable program as passed to Function 4BH. If the call finds the file in the path, it concatenates the filename with the path information. (A program may use this area to determine whence it was loaded.)

So the program name should be obtained from the environment, and the path extracted from there. Using Truename is not useful for this, and at least neither the Borland nor the Microsoft runtime does that.
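
The layout quoted above can be walked with a few lines of portable C. The following is only a sketch operating on a copy of the environment block in an ordinary byte buffer. The function name `env_program_path` is made up for this example, and a real DOS program would first have to reach the block through the environment segment stored in its PSP. The encoding of a completely empty environment is not handled specially here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: given a copy of a DOS environment block, return a
 * pointer to the program path stored after the variables, or NULL if the
 * block is malformed or carries no path. */
const char *env_program_path(const unsigned char *env, size_t size)
{
    size_t i = 0;

    /* Skip the NAME=value ASCIZ strings; an empty string ends the list. */
    while (i < size && env[i] != 0) {
        const unsigned char *end = memchr(env + i, 0, size - i);
        if (end == NULL)
            return NULL;                 /* unterminated string */
        i = (size_t)(end - env) + 1;
    }
    if (i >= size)
        return NULL;
    i++;                                 /* skip the final 00 byte */

    /* Next comes a little-endian word count of the strings that follow. */
    if (i + 2 > size)
        return NULL;
    unsigned count = (unsigned)env[i] | ((unsigned)env[i + 1] << 8);
    i += 2;
    if (count == 0 || i >= size)
        return NULL;

    /* The first string is the path of the executable; require a NUL. */
    if (memchr(env + i, 0, size - i) == NULL)
        return NULL;
    return (const char *)(env + i);
}
```

For a block holding `PATH=C:\DOS`, `COMSPEC=C:\COMMAND.COM`, the terminating zero, a count of 1 and `C:\TEST\TEST.COM`, the function returns a pointer to `C:\TEST\TEST.COM`. The bounds checks are there so that a truncated block yields NULL rather than a read past the buffer.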

I extracted the "dosref33" file from the Microsoft Bookshelf and converted it to HTML using a custom program. The full file is attached: dosref33.zip

tsupplis commented 2 years ago

Actually, you are correct, and I found contradictory notes on it being internal. Should we drop it?

dmsc commented 2 years ago

Given that the work is mostly done, I will push an implementation, but I will return capitalized results and drop the setting of the AX register.