antecedent / patchwork

Method redefinition (monkey-patching) functionality for PHP.
https://antecedent.github.io/patchwork/
MIT License
444 stars 40 forks source link

url_stat implementation misses links, quiet, and stat cache behavior #116

Closed anomiex closed 3 years ago

anomiex commented 3 years ago

I noticed three issues in the implementation of Patchwork\CodeManipulation\Stream::url_stat().

The first is that warnings are only supposed to be silenced when the STREAM_URL_STAT_QUIET flag was passed. I was unable to find a test case illustrating this as a problem though. When I look through PHP's code to see how this is handled, it looks like they literally just skip calling the log function.

The second is that, when the STREAM_URL_STAT_LINK flag is passed, it should behave as lstat() rather than as stat(). This can be illustrated with the following code:

file_put_contents( 'src.dat', str_repeat( "x", 1024 ) );
symlink( 'src.dat', 'link.dat' );

printf( "Stat vs lstat data for link.dat:\n" );
$stat = stat( 'link.dat' );
$lstat = lstat( 'link.dat' );
foreach ( $stat as $k => $v ) {
    if ( ! is_int( $k ) ) printf( " %10s: %-10s %s\n", $k, $stat[$k], $lstat[$k] );
}

When Patchwork is not loaded, this will correctly print differing output for the ino, mode, size, and blocks fields. When Patchwork is loaded, both columns show the data for src.dat.

The third (and the one that got me looking at this in the first place) is that PHP's own 'file' wrapper ignores the stat cache, neither using it nor populating it. This can be illustrated by the following code:

file_put_contents( 'src.dat', str_repeat( "x", 1024 ) );
$dest = tempnam( sys_get_temp_dir(), 'temp' );

copy( 'src.dat', $dest );
printf( "Size of dest after: %d\n", filesize( $dest ) );

PHP will call url_stat() for both the source and destination. Normally the stat cache winds up holding the data from the source file because copy() passes a flag that causes php_stream_stat_path_ex() to not do the stat caching. But Patchwork's url_stat() itself updates the stat cache on every call due to using stat().

Unfortunately PHP gives us no good way to stat a file while ignoring the stat cache, so probably the best we can do is just clear it before and after calling stat() or lstat(). If the cache is supposed to be populated, PHP will re-populate it after our url_stat() returns. If not, we'll at least be avoiding lack-of-cache-invalidation bugs instead of causing them.

Side note: They changed some stat cache behavior in PHP 8.1 (13e4ce38), now neither call in copy() touches the stat cache, and various other calls as well probably. Which means they've introduced different potential lack-of-cache-invalidation bugs that we have little or no hope of trying to remain bug-for-bug compatible with.

As for fixing all this, maybe something like this?

diff --git a/src/CodeManipulation/Stream.php b/src/CodeManipulation/Stream.php
index 115c263..b8dc33e 100644
--- a/src/CodeManipulation/Stream.php
+++ b/src/CodeManipulation/Stream.php
@@ -98,14 +98,21 @@ class Stream

     public function url_stat($path, $flags)
     {
+        $func = ( $flags & STREAM_URL_STAT_LINK ) ? 'lstat' : 'stat';
         $this->unwrap();
-        set_error_handler(function() {});
-        try {
-            $result = stat($path);
-        } catch (\Exception $e) {
-            $result = null;
+        clearstatcache();
+        if ($flags & STREAM_URL_STAT_QUIET) {
+            set_error_handler(function() {});
+            try {
+                $result = call_user_func($func, $path);
+            } catch (\Exception $e) {
+                $result = null;
+            }
+            restore_error_handler();
+        } else {
+            $result = call_user_func($func, $path);
         }
-        restore_error_handler();
+        clearstatcache();
         $this->wrap();
         if ($result) {
             $result[self::STAT_MTIME_ASSOC_OFFSET]++;