FDOS / freecom

FreeDOS Command Shell (command.com)
http://www.freedos.org/
GNU General Public License v2.0
158 stars 38 forks source link

Extended command line parameter bug #95

Closed shidel closed 6 months ago

shidel commented 6 months ago

Hi,

While doing some testing of other things, I've discovered a serious command line parameter bug. I've attached the simple program I was using for my other tests when I discovered the issue.

( For this test, I'm running FreeCOM under DOSBox Staging. If I just use the DOSBox command shell, the command line is simply truncated at 126 characters with a trailing CR )

Start a fresh instance of FreeCOM and insure that the last byte in the PSP for the command line [CS:0xff] is NOT 0x0d . Like in this photo the state was 0x00.

Screenshot_20240510_070630

Now provide and exhaustively long command line for the test program and you will see the byte remains unchanged.

Screenshot_20240510_070800

Shortening the parameters passed to the test program to exactly 126 bytes, FreeCOM will update the final character in the PSP to 0x0d.

Screenshot_20240510_072404

Subsequent parameter strings greater than 126 characters will leave the last PSP character as 0x0d.

Per all of the the documentation regarding the parameter string in the PSP, it must absolutely be terminated by a Carriage Return character. A program which does not use the Length byte and only terminates the string with 0x0d would have compatibility issues and could crash or lockup.

Furthermore, the length byte is 0x7f when there is over 127 characters provided for the programs parameters.

That seems wrong to me.

Because, it would cause programs that strictly rely on the length byte to end up including the CR in their string. Those programs would not be getting the full parameter string anyway.

However, some programs use the length byte to overwrite the CR with a NULL. This would cause the first byte of the program to be overwritten with the NULL. But, that code would have already be executed and it is extremely unlikely that a Jump to CS:0x100 exists in the program.

More seriously, a program may copy the PSP string using the length byte to a fixed 127 byte ASCIIZ buffer which would be a buffer overflow for the program.

On the plus side, the incorrect length byte does not effect any of my code. Those never use the length byte and terminate on 0x0d or 0x00. However i guess if another value like 0x65 was left in the last position in the PSP, some of them would continue thinking it was processing the command line until it reached an 0x0d or 0x0a which would result in a program thrown "invalid parameter" or "file not found" error.

BUGTEST.zip

boeckmann commented 6 months ago

Code affecting this: https://github.com/FDOS/freecom/blob/1d4f639e0bdcf7821d12d2bc773f99537682037f/lib/exec.c#L143-L156 MAX_EXTERNAL_COMMAND_SIZE is defined as 126.

I deduce the following from the source (have to verify it by testing):

If the else branch is taken (command line may be 126 bytes long), 0x0d, 0x00 gets appended, so this adds to 128 bytes. One longer than it should be and so possibly [0x100] of the executable gets overwritten by the binary zero.

In case command line is greater than 126 bytes, there is a off by one error. Idea is to set PSP[0xff] to 0x0d and PSP[0x80] to 127, for compatibility. But PSP[0x100] (not PSP anymore) is set to 0x0d, leaving PSP[0xff] as it is. The expression &buf[1] + buf[0] is 0x81 + 0x7f = 0x100.

andrewbird commented 6 months ago

Hi @shidel, you mentioned documentation for the parameter string in PSP, but I couldn't find anything but the wikipedia entry https://en.wikipedia.org/wiki/Program_Segment_Prefix, so perhaps I could trouble you to point me to it please? Many thanks!

shidel commented 6 months ago

Hi @shidel, you mentioned documentation for the parameter string in PSP, but I couldn't find anything but the wikipedia entry https://en.wikipedia.org/wiki/Program_Segment_Prefix, so perhaps I could trouble you to point me to it please? Many thanks!

I’m rather sure @boeckmann has fixed the problem. But to answer your question…

It’s just another way of referring to the parameters passed to a program from the command line. The wiki page in your link calls it “command line”. The length byte for the string is at offset 0x80 and the string is from 0x81 to 0xff.

personally I think calling that string “command line” is misleading. It does not contain the entire command line. It only has the parameters passed to the program.

ecm-pushbx commented 6 months ago

It is also called command line tail.

andrewbird commented 6 months ago

Hi shidel, thanks, yes I saw the PR. I'm interested in adding some command line tests to the Dosemu test suite, and I thought that maybe some documentation existed somewhere other than Wikipedia and this paragraph in RBIL

for 4DOS and Windows95, the command tail may be more than 126
      characters; in that case, the length byte will be set to 7Fh (with
      an 0Dh in the  127th position at offset FFh), and the first 126
      characters will be stored in the PSP, with the entire command line
      in the environment variable CMDLINE; under at least some versions
      of 4DOS, the byte at offset FFh is *not* set to 0Dh, so there is no
      terminating carriage return in the PSP's command tail.

Edit: oops wrong attributation

boeckmann commented 6 months ago

That command line passing under DOS is a minefield because of edge cases like the one @andrewbird posted above. Savest would be to always append 0x0d, 0x00 to prevent buggy programs from wreaking havoc. But that would restrict the length of the command line tail to 125 bytes. Could that be considered to be an option for FreeCOM? It still would have the ability to pass command line via CMDLINE env variable. But not every program has the ability to handle CMDLINE env. So probably it is better to leave it as-is.

andrewbird commented 6 months ago

Savest would be to always append 0x0d, 0x00 to prevent buggy programs from wreaking havoc.

Did you mean 0x00, 0x0d, as from what I'm seeing after testing a few DOSes is that each individual arg is terminated with 0x00 and then after all args are processed a 0x0d is appended (buffer limit respected of course). For example

TEST02____ 1234567890 123456

Generates this

54:45:53:54:30:32:5f:5f:5f:5f:00:31:32:33:34:35:36:37:38:39:30:00:31:32:33:34:35:36:00:0d

Personally I like the idea of being safe and consistent by limiting the command line tail to 126 bytes + 0x0d, where the those 126 bytes could be made up of single arg of 125 chars + 0x00 or two args of 99 chars + 0x00, 25 chars + 0x00 etc.

boeckmann commented 6 months ago

No, I mean 0x0d followed by 0x00. To my knowledge the command line tail is normally copied byte by byte to the PSP, leading to space 0x20 as the separator of the arguments.

How is the output you posted generated? Is it copied directly from the PSP? Is the PSP processed before outputting the bytes? A C runtime startup may alter the command line bytes as preparation to hand them over via argv to main(), replacing 0x20 by 0x00.

shidel commented 6 months ago

Screen Shot 2024-05-25 at 8 57 18 AM

@boeckmann at present, FreeCOM always appends an 0x00 when there is sufficient room in the PSP for the command line. It will leave it out if the parameter string exceeds 126 characters. I think that is a reasonable compromise.

@andrewbird you are definitely seeing a command line tail that has been modified by your program. Here is a link to latest version of the program I made that discovered the issue. It is written in assembly and performs no modifications to the parameters passed to it. It always displays the entire block allocated for the string. It also displays up to 26 bytes into the program area (over 0xff) if the CR is missing. It also displays the CMDLINE environment variable when present for command line parameters are over 126 characters.

Screen Shot 2024-05-25 at 9 19 09 AM

shidel commented 6 months ago

Ugh, I just had a though about this stuff. So I ran a quick check using my Pentium Pro. Which led to me finding a new issue. I'll start a new thread on it in a few minutes. I just need to fire up a VM to take a screen shot. :-(

ecm-pushbx commented 6 months ago

@andrewbird you are definitely seeing a command line tail that has been modified by your program. Here is a link to latest version of the program I made that discovered the issue. It is written in assembly and performs no modifications to the parameters passed to it. It always displays the entire block allocated for the string. It also displays up to 26 bytes into the program area (over 0xff) if the CR is missing. It also displays the CMDLINE environment variable when present for command line parameters are over 126 characters.

Screen Shot 2024-05-25 at 9 19 09 AM

Do you have the sources for this tool too?

shidel commented 6 months ago

@andrewbird you are definitely seeing a command line tail that has been modified by your program. Here is a link to latest version of the program I made that discovered the issue. It is written in assembly and performs no modifications to the parameters passed to it. It always displays the entire block allocated for the string. It also displays up to 26 bytes into the program area (over 0xff) if the CR is missing. It also displays the CMDLINE environment variable when present for command line parameters are over 126 characters. Screen Shot 2024-05-25 at 9 19 09 AM

Do you have the sources for this tool too?

Yes. It is a relatively simple test program that is located under a new library I am currently working on for that project and future projects. In the sources it is simply called test0001.asm

andrewbird commented 6 months ago

@andrewbird you are definitely seeing a command line tail that has been modified by your program.

It's a very minimal C program that's compiled by ia16-gcc, here's the salient part.

#include <i86.h>                                                                
#include <stdio.h>                                                              
#include <stdlib.h>                                                             
#include <string.h>                                                             

struct mypsp {                                                                  
  char pad[0x80];                                                               
  unsigned char len;                                                            
#define PSP_MAXCMDLINE 0x7f                                                     
  char buf[0x7f];                                                               
  unsigned char code[2];                                                        
} __attribute__((packed));                                                      

int main(int argc, char *argv[])                                                
{                                                                               
  struct mypsp __far *x = MK_FP(_psp, 0);                                       

  if (argc < 2) {                                                               
    printf("Need at least one argument\n");                                     
    return 1;                                                                   
  }                                                                             

  if (x->code[0] == 0x00 || x->code[0] == 0x0d ||                               
      x->code[1] == 0x00 || x->code[1] == 0x0d) {                               
    printf("%.6s (code following PSP was corrupted) [%02x:%02x]\n", x->buf, x->code[0], x->code[1]);
    return 1;                                                                   
  }                                                                             

#if defined(FIRST30)                                                            
  {                                                                             
    // Print the first 30 bytes                                                 
    int i;                                                                      

    printf("%.6s ", x->buf);                                                    
    printf("(% 3u)[%02x", x->len-1, x->buf[0]);                                 
    for (i = 1; i < 30; i++)                                                    
      printf(":%02x", x->buf[i]);                                               
    printf("]\n");                                                              
    return 0;                                                                   
  }                                                                             
#endif

<snip>

Here is a link to latest version of the program I made that discovered the issue. It is written in assembly and performs no modifications to the parameters passed to it.

That's probably where I'm going wrong by not using assembly, I suspect that it's probably the C compiler's startup code that's messing with the command tail.

So please disregard any assertions I've made in this thread, I'll have to start again, sigh...

boeckmann commented 6 months ago

@boeckmann at present, FreeCOM always appends an 0x00 when there is sufficient room in the PSP for the command line. It will leave it out if the parameter string exceeds 126 characters. I think that is a reasonable compromise.

@shidel Yes, it is the edge case where the zero does not fit into the PSP which made me think if it is perhaps better to restrict to 125 bytes and instead always append the zero. But dropping the zero this seems to be common behaviour since decades, so its probably perfectly fine.