php / php-src

The PHP Interpreter
https://www.php.net

directory functions doesn't support IO_REPARSE_TAG_LX_SYMLINK reparse points (Unix/WSL symlinks) #15588

Open llde opened 3 weeks ago

llde commented 3 weeks ago

Description

The built-in functions is_file and is_dir (extension "standard") don't work properly if the path I pass to them contains a Unix symlink (an IO_REPARSE_TAG_LX_SYMLINK reparse point).

Before checking whether the path points to a file or a directory, these functions try to get the real path using the function tsrm_realpath_r in zend_virtual_cwd.c, which traverses the path components backwards, opening every path it encounters. If one of these paths has the FILE_ATTRIBUTE_REPARSE_POINT attribute, the function tries to resolve the reparse point with the FSCTL_GET_REPARSE_POINT DeviceIoControl call. It breaks here because the returned tag, IO_REPARSE_TAG_LX_SYMLINK, isn't handled, so the function bails out.
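To make the failure mode concrete, here is a minimal sketch of a tag dispatch in C. It is not the php-src code: the helper name, the enum, and the choice of which tags count as "handled" are assumptions for illustration. The tag values are the ones from the Windows SDK headers.

```c
#include <assert.h>
#include <stdint.h>

/* Reparse tag values as defined in the Windows SDK (winnt.h). */
#define TAG_MOUNT_POINT 0xA0000003u /* IO_REPARSE_TAG_MOUNT_POINT */
#define TAG_SYMLINK     0xA000000Cu /* IO_REPARSE_TAG_SYMLINK */
#define TAG_LX_SYMLINK  0xA000001Du /* IO_REPARSE_TAG_LX_SYMLINK */

typedef enum { ACTION_RESOLVE, ACTION_FAIL } tag_action;

/* Hypothetical helper: what a resolver does with a given reparse tag.
 * Assumption: only Windows symlinks and mount points are understood;
 * every other tag aborts path resolution. */
static tag_action action_for_tag(uint32_t tag)
{
    assert(tag != 0); /* 0 is not a valid reparse tag in this sketch */
    switch (tag) {
    case TAG_SYMLINK:
    case TAG_MOUNT_POINT:
        return ACTION_RESOLVE; /* read the target and keep resolving */
    default:
        return ACTION_FAIL;    /* unknown tag: resolution stops here */
    }
}
```

With a dispatch like this, action_for_tag(TAG_LX_SYMLINK) falls into the default branch, so any path that passes through such a link cannot be resolved, and is_dir reports false even though the directory exists.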

This is a serious issue under Wine, where all native symlinks are reported as IO_REPARSE_TAG_LX_SYMLINK. I confirmed that reporting the symlink as IO_REPARSE_TAG_SYMLINK instead fixes the issue. That is a risky change on the Wine side, however, because the two symlink types can be implemented differently on the filesystem, and I'm not sure the difference is always transparent to the application.

tsrm_realpath_r should support IO_REPARSE_TAG_LX_SYMLINK (support should be more trivial than for IO_REPARSE_TAG_SYMLINK). I also find it strange that these functions traverse the path in this way. At first glance it doesn't seem to add any value, at least for paths that contain symlinks but don't end in one.

I'm bound to Wine here because I'm using a Windows-only application that uses PHP for various operations, and the way the application starts PHP doesn't allow me to replace the Windows build with a native Linux one.

Resulted in this output (call done inside the interactive shell):

php > $path = "C:\\users\\lorenzo\\Scrivania\\Fanbox";
php > $ret =  is_dir($path);
php > var_dump($ret);
bool(false)

Scrivania is a symlink to Z:\home\lorenzo\Desktop

But I expected this output instead:

php > $path = "C:\\users\\lorenzo\\Scrivania\\Fanbox";
php > $ret =  is_dir($path);
php > var_dump($ret);
bool(true)

PHP Version

PHP 8.2.7

Operating System

ArchLinux (WINE Staging 9.15)

cmb69 commented 3 weeks ago

Changing to Category:Engine instead of Extension:standard since it's solely about code in zend_virtual_cwd.c.

cmb69 commented 1 week ago

I've prepared a file foo and a symlink bar (ln -s foo bar). Then in a cmd.exe console:

$ type foo
hello

$ type bar
The system cannot access the file.

$ fsutil reparsePoint query bar
Reparse Tag Value : 0xa000001d
Tag value: Microsoft
Tag value: Name Surrogate

Reparse Data Length: 0x7
Reparse Data:
0000:  02 00 00 00 66 6f 6f                              ....foo

So IO_REPARSE_TAG_LX_SYMLINK is reported by fsutil, but Windows doesn't follow that symlink. Therefore, we should not allow following such symlinks by default for security reasons.

Now I just changed IO_REPARSE_TAG_SYMLINK to IO_REPARSE_TAG_LX_SYMLINK in zend_virtual_cwd.c, but that won't work (Failed to open stream: No such file or directory). The problem is that DeviceIoControl(hLink, FSCTL_GET_REPARSE_POINT, NULL, 0, pbuffer, …) doesn't fill pbuffer as it would for IO_REPARSE_TAG_SYMLINK, i.e. pbuffer->SymbolicLinkReparseBuffer is useless, so we would need to access pbuffer->GenericReparseBuffer.ReparseTarget which contains the data shown by fsutil. However, the Microsoft documentation about IO_REPARSE_TAG_LX_SYMLINK appears to be sparse, so I'm not even sure how to interpret the raw reparse data.

llde commented 1 week ago

They apparently have a separate format. You could check the wine-staging ntdll-Junction_Points patchset, like here: https://github.com/wine-staging/wine-staging/blob/master/patches/ntdll-Junction_Points/0024-ntdll-Add-support-for-creating-Unix-Linux-symlinks.patch

In wine headers it's defined as:

struct {
    ULONG Version;
    UCHAR PathBuffer[1];
} LinuxSymbolicLinkReparseBuffer;

You can also choose to ignore this kind of reparse point and treat it like a normal path, avoiding the error path for unknown reparse points. From my tests, this would also be an acceptable solution.

Another way to tackle the initial problem (about is_dir and similar functions) is to avoid traversing the path at all, for these functions. But I'm not sure why it was done this way, maybe as protection for path traversals?

cmb69 commented 1 week ago

Another way to tackle the initial problem (about is_dir and similar functions) is to avoid traversing the path at all, for these functions. But I'm not sure why it was done this way, maybe as protection for path traversals?

Primarily to make open_basedir work, and secondarily to properly fill the realpath_cache. So no, we cannot skip this.
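The open_basedir point can be illustrated with a small C sketch. It is not how php-src implements the check: the helper is hypothetical, the base directory C:\users\lorenzo is an assumed setting, and the comparison is deliberately simplified (case-sensitive, backslash separators only).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `path` equals `base` or lies below it.
 * Simplified: case-sensitive comparison, backslash separators only,
 * and `base` must not end in a separator. */
static bool is_within(const char *base, const char *path)
{
    assert(base != NULL && path != NULL);
    size_t n = strlen(base);
    if (strncmp(base, path, n) != 0)
        return false;
    /* Require a component boundary so "C:\a" does not match "C:\ab". */
    return path[n] == '\0' || path[n] == '\\';
}
```

Checked lexically, C:\users\lorenzo\Scrivania\Fanbox lies inside the assumed base directory. Once the Scrivania symlink is resolved, the path becomes Z:\home\lorenzo\Desktop\Fanbox, which lies outside it. A restriction applied only to the unresolved path could therefore be sidestepped through a symlink, which is why the real path is resolved first.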

In wine headers it's defined as:

struct {
    ULONG Version;
    UCHAR PathBuffer[1];
} LinuxSymbolicLinkReparseBuffer;

Thank you! I'll come up with a PR soon, but it's too late for PHP 8.4 anyway, and I'm still unsure how to support it (maybe via an INI option) for security reasons.
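For reference, the struct quoted above matches the fsutil dump: 02 00 00 00 would be the version and 66 6f 6f the target "foo". The following C sketch decodes such a buffer under that reading. The helper name is hypothetical, and two details are assumptions rather than documented facts: that the version is a little-endian 32-bit value, and that the path is not NUL-terminated, so its length is the reparse data length minus 4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not php-src code. Decodes raw reparse data of an
 * IO_REPARSE_TAG_LX_SYMLINK reparse point, assuming the layout from the
 * Wine header: a 4-byte version followed by the target path bytes.
 * Returns 0 on success, -1 if the data is too short or `out` too small. */
static int lx_symlink_target(const unsigned char *data, size_t data_len,
                             uint32_t *version, char *out, size_t out_size)
{
    assert(data != NULL && version != NULL && out != NULL);
    if (data_len < 4)
        return -1;                      /* no room for the version field */
    size_t path_len = data_len - 4;     /* assumption: no NUL terminator */
    if (path_len + 1 > out_size)
        return -1;                      /* target would not fit in `out` */
    /* Assumption: the version is stored in little-endian byte order. */
    *version = (uint32_t)data[0] | ((uint32_t)data[1] << 8) |
               ((uint32_t)data[2] << 16) | ((uint32_t)data[3] << 24);
    memcpy(out, data + 4, path_len);
    out[path_len] = '\0';
    return 0;
}
```

Fed the seven bytes from the fsutil dump, this yields version 2 and the target foo. In a real implementation the bytes would come from the generic reparse data buffer filled by DeviceIoControl, and the target would still have to be converted from a Unix path before it could be used for further resolution.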