dosemu2 / fdpp

FreeDOS plus-plus, 64bit DOS
GNU General Public License v3.0
196 stars 17 forks source link

Share file open semantics not identical to MS-DOS 6.22 #135

Closed andrewbird closed 4 years ago

andrewbird commented 4 years ago

I've been testing the share file open semantics, initially on MS-DOS 6.22 with FAT, and can confirm that the matrix in RBIL is correct. In all there are 225 tests and I do find that FDPP's internal share (and FreeDOS 1.20 obviously) does have some differences from MS-DOS 6.22. I ran the tests on travis here https://travis-ci.com/github/andrewbird/dosemu2/builds/160893660 and you can see they each have 10 divergences.

The DOS test program opens a file with the primary share and open modes, then spawns a copy of itself that tries to open it again with the secondary share and open modes. It then compares the result with the expected one and emits pass or fail.

These are the differences with current FDPP to MS-DOS 6.22 / RBIL

 FAIL:('SH_COMPAT', 'R', 'SH_DENYWR', 'R', 'DENY')[secondary allowed]
 FAIL:('SH_COMPAT', 'R', 'SH_DENYNO', 'R', 'DENY')[secondary allowed]
 FAIL:('SH_DENYWR', 'R', 'SH_COMPAT', 'R', 'INT24')[secondary allowed]
 FAIL:('SH_DENYWR', 'W', 'SH_DENYRD', 'R', 'ALLOW')[secondary denied]
 FAIL:('SH_DENYWR', 'W', 'SH_DENYNO', 'R', 'ALLOW')[secondary denied]
 FAIL:('SH_DENYWR', 'RW', 'SH_DENYNO', 'R', 'ALLOW')[secondary denied]
 FAIL:('SH_DENYRD', 'W', 'SH_DENYRD', 'W', 'ALLOW')[secondary denied]
 FAIL:('SH_DENYNO', 'R', 'SH_COMPAT', 'R', 'INT24')[secondary allowed]
 FAIL:('SH_DENYNO', 'R', 'SH_COMPAT', 'W', 'INT24')[secondary allowed]
 FAIL:('SH_DENYNO', 'R', 'SH_COMPAT', 'RW', 'INT24')[secondary allowed]

So as an example the first line shows:

I've added the test program and batch file to just do the failing tests here t2.zip. It should be run with a FAT image installed at drive D:

Note that RBIL does show that some combinations are supposed to result in an int24, but I never see any of those and the secondary open is denied, so maybe it's possible that DJGPP is handling them?

stsp commented 4 years ago

Wow, great work! How about also rename/delete tests? As for int24, you can find the lines starting with call_intr(0x24 in share.c. You can use fdloudprintf() in fdpp code, and it will print to terminal and to dosemu log (with -D+f). That way you can find out if share.exe at least tries to activate int24. Also the code in share.c is small and simple, so can you think of a fix? :)

andrewbird commented 4 years ago

Yes rename/delete tests would be good, as I'd like to see if having an open file does deny another process deleting it. I wonder if there's any document that details what the expected behaviour is, or maybe it's a matter of testing all options against MS-DOS 6.22?

Also the code in share.c is small and simple, so can you think of a fix? :)

Yep I'll have a look, as fdloudprintf() looks to be very useful in determining whether the incorrect action is caused by the exceptions or the general rule in share.c.

I'll test out the int24 stuff if I get the first part fixed.

One thing just occurred to me though, perhaps I'm wrong in comparing it against DOS 6.22 as the source mentions DOS 7. RBIL has DOS 7 supporting another field for access time update, but maybe the other fields changed?

stsp commented 4 years ago

Seems like they did. Eg "Deny None" on R had "2 C C" and on DOS7: "Y Y Y Y". Strange.

andrewbird commented 4 years ago

Okay thanks. I've written a revised test based on RBIL's definition for DOS 7 before doing any changes. The results for FDPP are better, but still not quite correct (only 6 failures now)

Test PP-DOS-GIT FAT DOSv3 share open v7                                          ... FAIL (one or more subtests)
    (t=('SH_COMPAT', 'W', 'SH_DENYNO', 'R', 'ALLOW'))                            ... FAIL
    (t=('SH_COMPAT', 'RW', 'SH_DENYNO', 'R', 'ALLOW'))                           ... FAIL
    (t=('SH_DENYWR', 'W', 'SH_DENYRD', 'R', 'ALLOW'))                            ... FAIL
    (t=('SH_DENYWR', 'W', 'SH_DENYNO', 'R', 'ALLOW'))                            ... FAIL
    (t=('SH_DENYWR', 'RW', 'SH_DENYNO', 'R', 'ALLOW'))                           ... FAIL
    (t=('SH_DENYRD', 'W', 'SH_DENYRD', 'W', 'ALLOW'))                            ... FAIL

I'm not quite sure whether RBIL's definition is for PC-DOS 7.00 or Win95, but I'd guess at the latter (or both!). Will add a Win95 testcase and test against it later.

stsp commented 4 years ago

process deleting it. I wonder if there's any document that details what the expected behaviour is, or

I would guess that delete is equal to Deny None/W. But this is just a wild guess. :)

stsp commented 4 years ago

Also RBIL says:

        deleting a file which is currently open may lead to filesystem
          corruption.  Unless SHARE is loaded, DOS does not close the handles
          referencing the deleted file, thus allowing writes to a nonexistant
          file.
        under DR DOS and DR Multiuser DOS, this function will fail if the file
          is currently open

I think we should do the same. Thee is no need to check locking, if that will only lead to an fs corruption. We just need to fail the call anyway, whether or not the file is locked.

andrewbird commented 4 years ago

We just need to fail the call anyway, whether or not the file is locked.

I assume you mean for MFS? Failing on an open file is good for a single invocation of Dosemu (including my test case), but I think we still need some variation on the current lock based implementation of share to protect against multiple Dosemu instances (the clipper test case).

For now I'm just concentrating on the share/FAT, as the same test (share v6) against MFS was very bad. There are comments in mfs.c that apparently show the true share semantics, they aren't even close to those when running on FAT, and I'm sure we will need to store the primary share/open modes in order to correctly determine secondary access. Your xattr idea sounds good, although I'm a little concerned about leaving stale ones behind if Dosemu crashes out etc.

stsp commented 4 years ago

(including my test case), but I think we still need some variation on the current lock based implementation of share to protect against multiple Dosemu instances (the clipper test case).

But what share does protect the delete operation? In the one in fdpp, I can't find anything about delete. Also do you mean the delete semantic should differ between fat and redirector? And why failing the delete would not fix the clipper test case?

concerned about leaving stale ones behind if Dosemu crashes out etc.

Yes, given that xattr has a very limited space, stalled entries are a concern indeed.

andrewbird commented 4 years ago

But what share does protect the delete operation? In the one in fdpp, I can't find anything about delete.

It's on my list of things to test in MS-DOS 6.22 share.exe, but no I didn't see anything to prevent deletion in FDPP/FreeDOS share code.

Ahh I see now you were talking about share/FAT, I apologise for going off topic about MFS in FDPP ticket. So in that case failing to delete/rename any open file would be fine. You don't have to worry about two dosemu instances (the clipper test) because you should never open the same FAT image file in both at the same time.

(the following probably belongs in https://github.com/dosemu2/dosemu2/issues/270, not FDPP ticket) In the MFS scenario with its internal share the clipper delete testcase is difficult. Consider the first Dosemu instance opens the file, and a second instance wants to rename or delete it. The second instance can't know the first has the file open unless it's placed a lock on the file. That's how my odd open/getlk/close then delete came about. I had also considered that we might be able to encode the primary open/share modes using different byte offsets, but for share we need those locks to be exclusive i.e. write and that requires the unix file be opened r/w or w. I also need to retest the DOS share/locking on readonly files, as it seems to me that unix locks placed on readonly files can only be non-exclusive, and fcntl(F_GETLK) only returns the first conflicting lock, so I wonder if the protection is ineffective in that case.

stsp commented 4 years ago

But is delete file only affected by deny modes, or also by the region locks, http://www.ctyme.com/intr/rb-3016.htm Do we have a test for 5c, does it currently work properly?

I had also considered that we might be able to encode the primary open/share modes using different byte offsets

You mean, some artificial regions that never collide with 5c? This is interesting IMHO only if you can solve it combinatorially. :) Can you choose the offsets and lengths the way that they actually collide only when needed? And yes, these should only be write locks, but maybe with the read locks at fixed offsets you can store some additional info... Interesting. :) But even if you manage to solve the combinatorial part, I wonder who will understand such code later.

andrewbird commented 4 years ago

Do we have a test for 5c, does it currently work properly?

Yes I recently added a simple test for locking twice which uses this djgpp function. It seems to work okay, but I only tested with a R/W opened file, I'll need to retest with RO file (see last paragraph). As yet I haven't created a modified test to try the primary process lock, secondary process delete/rename procedure.

You mean, some artificial regions that never collide with 5c?

Yes that's what is done at present to implement share deny mode, see share() which uses a lock of a single byte at offset 0x100000000LL which I believe is meant to be beyond the range DOS can place a lock on a file.

Can you choose the offsets and lengths the way that they actually collide only when needed?

Not sure, but I'd want to minimise the number of locks placed and tests to interrogate.

And yes, these should only be write locks

The problem I see with that is that it means all files opened by dosemu would have to be opened in Unix for write or read/write, but don't we currently translate DOS file perms <-> Unix? This is why I'm going to test if share currently works on Unix read only files

stsp commented 4 years ago

So... if you use an OFD locks, and don't want to collide with record locks, and want W perm, then maybe simply open one extra fd for locks? Not one extra fd per DOS fd, but one per mfs-opened file. Seems like files are already ref-counted by sft_handle_cnt(sft).

andrewbird commented 4 years ago

then maybe simply open one extra fd for locks?

That's a nice way of avoiding the DOS opened the file for read issue! I still have the issue with Unix permissions though, say if the file was '-r--r--r--' I wouldn't be able to open it for write in order to place the lock.

Anyway before I go off down a rabbit hole, I'll work on the FDPP FAT/share issue.

stsp commented 4 years ago

permissions though, say if the file was '-r--r--r--' I wouldn't be able to open it for write

The only thing I can think of here, is to map the group permissions to DOS, instead of the user permissions, leaving user permissions to RW always. We already have the if (read_only(drives[drive])) checks in OPEN_EXISTING and friends, so you'll only need to extend that check to see if the file itself is read-only on a group level...

This may sound crazy from the first glance, yes. But look at that differently: no matter what DOS does, from the linux side you'll always be able to write to your files. And that seemingly makes some sense. :)

stsp commented 4 years ago

This will probably help also with xattrs, or whatever else weird thing you try.

andrewbird commented 4 years ago

Does anybody (@jschwartzenberg @stsp) have a copy to hand of share.exe that comes with Windows 95. [Version 4.00.950], as I need to validate the RBIL share (v7) description? It's not unpacked on the installation disks, and Win95 doesn't want to install on a network(mfs) drive. I tried installing Win95 onto an HD image, but after the first floppy disk the installation stops.

andrewbird commented 4 years ago

No worries, I've extracted the .CABs by hand and now have it.

andrewbird commented 4 years ago

Interestingly Windows 95. [Version 4.00.950] has RBIL's v6 semantics. I am only able to test in real DOS mode, but I suppose it could be different in a windowed environment...

stsp commented 4 years ago

You mean, opposite to freedos share?

andrewbird commented 4 years ago

No, freedos share doesn't match either RBIL's v6 or v7 definition, it has 10 and 6 failures respectively. MS-DOS 6.22 & 7.00 both match RBIL's v6. As yet I've not found anything that matches RBIL's v7.

stsp commented 4 years ago

Hm, wasn't there a 7.1 or even 8.0 DOS (like with WinME)?

andrewbird commented 4 years ago

Yep 7.1 and 8.0, though I did read some where that share was support by a vxd on those. I'll investigate 7.10 first, as there must be a reason RBIL calls it v7.

andrewbird commented 4 years ago

I can't find share.exe on the win95b iso (within the ,CABs). This article suggests that it's part of a .VXD instead. So testing that is likely to be more difficult for me.

stsp commented 4 years ago

Isnt it safest to just stick to v6 then? And smaller tables. :)

stsp commented 4 years ago

permissions though, say if the file was '-r--r--r--' I wouldn't be able to open it for write

How about putting the DOS r/o attr to xattrs?

andrewbird commented 4 years ago

Isnt it safest to just stick to v6 then?

Yes, that's fine by me.

How about putting the DOS r/o attr to xattrs?

Not sure there's a good solution here, that would be unexpected behaviour if Dosemu didn't follow unix file perms.

I wondered if we might do something like this with your write fd used purely for locking idea:

   fd = open(name, usual perms)
   if (fd >= 0) {
      fdlock = open(name, RW);
      if (fdlock < 0) {
         if (fstat(fd, &st) < 0)
            goto done;
         old_perms = st.st_mode & 07777;
         if (fchmod(fd, old_perms | S_IWUSR) < 0) {
            warn("couldn't add temporary owner write bit");
            goto done;
         }
         fdlock = open(name, RW);
         if (fchmod(fd, old_perms) < 0)
            warn("couldn't restore original file perms")
      }
done:
      if (fdlock < 0) {
         warn("couldn't set exclusive lock on file");

So there's a little window where we give ourselves write perms, open the file for our lock, then restore. It's probably bad practice, but I don't see another way around it?

Edited: removed fragment :-)

stsp commented 4 years ago

Not sure there's a good solution here, that would be unexpected behaviour if Dosemu didn't follow unix file perms.

It will follow unix perms, I just suggest perms to be RW. :) Lets see what we have:

By using xattrs, we make the mapping really sane, one, and two: we make it much similar to vfat_ioctl(), which also gets them from "something else". So we know that technique works, because right now it works for fatfs.

OTOH what you propose, is not only a race. Another problem with chmod is "read implies write", i.e. even the R/O open results in a dentry write! What if we have an RO fs?

So, while I think using xattr here is fine, someone have to verify if it is. But for sure I can say that chmod is not fine. Of course we can try the heavy-handed lockdb approaches or alike, but why not to try the simple things first?

andrewbird commented 4 years ago

I wrote a open file delete test for share on FAT. With MS-DOS 6.22 it behaved as I expected, that is: ShareMode, OpenMode, ExpectedResult

        ("SH_COMPAT", "R" , "DENY"),
        ("SH_COMPAT", "W" , "DENY"),
        ("SH_COMPAT", "RW", "DENY"),

        ("SH_DENYRW", "R" , "DENY"),
        ("SH_DENYRW", "W" , "DENY"),
        ("SH_DENYRW", "RW", "DENY"),

        ("SH_DENYWR", "R" , "DENY"),
        ("SH_DENYWR", "W" , "DENY"),
        ("SH_DENYWR", "RW", "DENY"),

        ("SH_DENYRD", "R" , "DENY"),
        ("SH_DENYRD", "W" , "DENY"),
        ("SH_DENYRD", "RW", "DENY"),

        ("SH_DENYNO", "R" , "DENY"),
        ("SH_DENYNO", "W" , "DENY"),
        ("SH_DENYNO", "RW", "DENY"),

But with FDPP Git, and FreeDOS 1.20, all deletions succeeded when they should have been denied.

stsp commented 4 years ago

Then this is an fdpp bug.

stsp commented 4 years ago

Ah, we are already in fdpp. :) Was confused with "dosemu2" in an url.

stsp commented 4 years ago

Is it a business of share.exe, right?

stsp commented 4 years ago

Does MS-DOS call maybe 0xa2 of int2f/10h before deletion?

andrewbird commented 4 years ago

Is it a business of share.exe, right?

Yes, without share MS-DOS 6.22 will happily delete open files, see


C:\>set LFN=n

C:\>d:

D:\>c:\shardelt primary SH_COMPAT R DENY
FAIL:('SH_COMPAT', 'R', 'DENY')[secondary allowed]

D:\>c:\share

D:\>c:\shardelt primary SH_COMPAT R DENY
PASS:('SH_COMPAT', 'R', 'DENY')[secondary denied]

Does MS-DOS call maybe 0xa2 of int2f/10h before deletion?

I only see one int2f/10 call, and I don't know the 0xa2 you speak of?

$ grep 'int 0x2f, ax=0x10' test.log
int 0x2f, ax=0x1000
stsp commented 4 years ago

int 0x2f, ax=0x1000

This is just an installation check - 0 in AL. 0xa2 in AL means access check, etc, see handler2f() in share.c. So you say it only works with share, but at the same time you log no share calls but the install check? Puzzled.

andrewbird commented 4 years ago

Interesting that RBIL only lists the int 0x2f, ax=0x1000, I wonder if the rest is implementation dependant? I seem to remember that the share.{com,exe} used has to match the kernel. I also notice that the MS-DOS share test runs about twice as fast as FDPP/FreeDOS 1.20. Perhaps there's some internal shortcut that avoids the interrupt and speeds things up.

stsp commented 4 years ago

I wonder if the rest is implementation dependant?

Maybe so, but if you log no share calls at all, that would barely explain anything. :)

that the MS-DOS share test runs about twice as fast as FDPP/FreeDOS 1.20.

IIRC freedos does something very bad with fatfs on start, like reading 65536 sectors or something like that. I know that because the initial versions of fdpp had their farptr caches completely polluted, and even dir worked like 20 seconds. I had to optimize fdpp as hell exactly because of this nice bug (or is it a feature). Maybe we should now finally dig it and fix the fatfs impl?

andrewbird commented 4 years ago

I tried the same test on FreeDOS 1.20, and although there are more installation checks, I still see no calls to a2 etc. Perhaps the logging is not capturing all the interrupt calls?

$ grep 'int 0x2f, ax=0x10' test.log
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
int 0x2f, ax=0x1000
stsp commented 4 years ago

With freedos you won't see them, as its obviously doesn't work properly in freedos. But we need to see them in MS-DOS to confirm the claim that its a share.exe business.

andrewbird commented 4 years ago

So I'm not really understanding why you expect to see int2f/0x10a? calls on MS-DOS 6.22 as RBIL doesn't have them listed, only the installation check. The code from FreeDOS share now in FDPP has this statement.

                        /* These subfuctions are nonstandard, but are highly    
                           unlikely to be used by another multiplex TSR, since  
                           our multiplex Id (0x10) is basically reserved for    
                           SHARE.  So we should be able to get away with using  
                           these for our own purposes. */                       

Since my test above showed that MSDOS 6.22 will happily delete an open file when share isn't loaded and will deny its deletion when it is, I surmise that the DOS version of share must use another mechanism to access the share code once it's confirmed to be loaded.

I think we don't see those calls on either FDPP or FreeDOS because currently the share tables aren't checked for delete or rename(presumably), as you suggested earlier.

andrewbird commented 4 years ago

I added a travis test in dosemu draft PR https://github.com/dosemu2/dosemu2/pull/1133

stsp commented 4 years ago

I am not saying that we should see exactly that call, but some share call must be seen? Do you mean that delete is not affected by any locking, but just by the mere fact that the file is open? And do DOS checks things on its own, but only if share is installed? Almost a conspiracy theory... but nothing is impossible in dos. :)

stsp commented 4 years ago

If you think only the installed check is needed, then can you test with MS-DOS but share.exe from freedos? It will reply to installation check, and we'll see if deletes are blocked or not.

andrewbird commented 4 years ago

Do you mean that delete is not affected by any locking, but just by the mere fact that the file is open?

From what I'm seeing if the file is open and share is loaded all requests to delete it are denied, no matter what share mode was used when it was opened. As yet I haven't tried anything with locking, but I think it's a moot point as the file must be open to apply the lock, but I'm not sure if DOS removes all locks on close?

And do DOS checks things on its own, but only if share is installed?

It's quite possible that having share loaded just changes the behaviour of delete/rename etc, especially if I'm right that any open file is protected, as there's no need to consult the share access tables.

can you test with MS-DOS but share.exe from freedos? It will reply to installation check, and we'll see if deletes are blocked or not.

Nice idea, I'll try it. I think though that the share.exe TSR init is probably changing a kernel flag (at some well known location) rather than calling the 'share installation check', because I only see one call to the check, and I try to delete 15 files.

andrewbird commented 4 years ago

So I tried using share.com from FreeDOS 1.20 with MS-DOS 6.22 and the delete was not denied with the file open.


C:\>d:

D:\>rem c:\share.exe

D:\>c:\share.com
SHARE installed.

D:\>c:\shardelt primary SH_COMPAT R DENY
FAIL:('SH_COMPAT', 'R', 'DENY')[secondary allowed]

and grepping through the log shows the installation check was called only once. That's very early on in the log file and immediately afterwards you can see interrupt vectors being set so I think that's FreeDOS's preinstallation check, rather than MS-DOS's check at file deletion.

grep 'int 0x2f, ax=0x10' test.log
int 0x2f, ax=0x1000
andrewbird commented 4 years ago

This is an interesting read https://www.geos-infobase.de/ND_DOCS/236.HTM

stsp commented 4 years ago

But I never said the installation check would happen on every deletion! :) Likewise I didn't say it should be a2. But I think some share call should be done. Especially if just replying to 0x1000 is not enough.

andrewbird commented 4 years ago

Undocumented DOS says this pg192 image

pg 490 image

pg 491 image

stsp commented 4 years ago

OK, I see, so we can't see its internal calls. :( But we can invent our own.

andrewbird commented 4 years ago

But we don't need to, the FreeDOS's share code has them implemented, as you said(int2f/0x10a?). I think we've confused things a little, as all we needed to confirm was that loading share changed the delete behaviour on MS-DOS 6.22, which I think we've done. The mechanism it used was an interesting aside, but not really necessary to the cause. I coupled up the FDPP Delete to check in the same way as Open, see #135 which fixes both delete with path and fcb. I suspect a similar thing could be done with Rename.