ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
32k stars 2.34k forks source link

`std.fs.IterableDir.Walker` doesn't ignore macOS "firmlinks" as it does symlinks, resulting in iterating some dirs twice and sometimes infinite loops #13355

Open hippietrail opened 1 year ago

hippietrail commented 1 year ago

Zig Version

0.10.0-dev.4324+c23b3e6fd

Steps to Reproduce and Observed Behavior

This is probably not a fully minimal example but I'm not fluent in Zig yet:

const std = @import("std");
const al = std.heap.page_allocator;

pub fn main() !void {
    if (std.os.argv.len == 3) {
        try search(
            std.os.argv[1][0..std.mem.len(std.os.argv[1])],
            std.os.argv[2][0..std.mem.len(std.os.argv[2])]);
        std.debug.print("done\n", .{});
    } else {
        std.debug.print("** first arg is path to walk, second is substring to look for in directory names", .{});
    }
}

// ?!T is easier to work with than !?T for some kinds of iterators
fn flipiter(comptime T: type, inp: anyerror!?T) ?anyerror!T {
    return (try inp) orelse null;
}

fn search(path:[]const u8, string:[]const u8) !void {
    const start_path = if (path[path.len - 1] == '/') path[0..path.len-1] else path[0..];

    var idir = std.fs.cwd().openIterableDir(path, .{}) catch |err| {
        std.debug.print("** err: {s} {any}\n", .{path, err});
    };
    defer idir.close();

    var walker = try idir.walk(al);
    defer walker.deinit();

    while (flipiter(std.fs.IterableDir.Walker.WalkerEntry, walker.next())) |err_or_entry| {
        if (err_or_entry) |entry| {
            if (entry.kind == .Directory) {
                if (std.mem.indexOf(u8, entry.basename, string)) |x| {
                    _ = x;
                    if (entry.dir.realpathAlloc(al, entry.basename)) |realpath| {
                        defer al.free(realpath);

                        std.debug.print("> {s}/{s}\n» {s}\n", .{start_path, entry.path, realpath});
                    } else |err| {
                        std.debug.print("** realpath err {?}\n", .{err});
                    }
                }
            }
        } else |err| {
            if (err != error.AccessDenied) std.debug.print("{!}\t{s}/{s}\n", .{err, start_path, walker.name_buffer.items});
        }
    }
}

Run it with a starting path and a substring to search for. It will walk the directory and print out all paths that contain the substring after a > symbol and then its resolved canonical path after a » symbol on the next line.

I can only see the wrong behaviour when running in the root directory / as macOS has several directories from /System/Volumes/Data/x linked to /x using "firmlinks", which is a feature of the APFS filesystem going back to Sierra. Example firmlinked directories.

I can't find Apple official documentation on firmlinks. Apparently they're supposed to be transparent and not affect most code. I also don't know how to get them to show up using ls. The ls commandline I know that will show the most extra info macOS-specific info is ls -alO@ / but that doesn't seem to reveal them unless sunlnk means this and only this:

drwxr-xr-x  20 root  wheel  sunlnk             640 24 Aug 16:59 .
drwxr-xr-x  20 root  wheel  sunlnk             640 24 Aug 16:59 ..
lrwxr-xr-x   1 root  admin  -                   36 24 Aug 16:59 .VolumeIcon.icns -> System/Volumes/Data/.VolumeIcon.icns
----------   1 root  admin  -                    0 24 Aug 16:59 .file
drwxr-xr-x   2 root  wheel  hidden              64 24 Aug 16:59 .vol
drwxrwxr-x  44 root  admin  sunlnk            1408 28 Oct 09:31 Applications
drwxr-xr-x  67 root  wheel  sunlnk            2144 14 Sep 10:42 Library
drwxr-xr-x@  9 root  wheel  restricted         288 24 Aug 16:59 System
    com.apple.rootless     0 
drwxr-xr-x   5 root  admin  sunlnk             160 24 Aug 16:59 Users
drwxr-xr-x   3 root  wheel  hidden              96 25 Oct 15:39 Volumes
drwxr-xr-x@ 38 root  wheel  restricted,hidden 1216 24 Aug 16:59 bin
    com.apple.rootless     0 
drwxrwxr-t   2 root  admin  hidden              64  8 Dec  2020 cores
dr-xr-xr-x   4 root  wheel  hidden            4888 25 Oct 15:39 dev
lrwxr-xr-x@  1 root  wheel  restricted,hidden   11 24 Aug 16:59 etc -> private/etc
    com.apple.rootless     0 
lrwxr-xr-x   1 root  wheel  hidden              25 25 Oct 15:39 home -> /System/Volumes/Data/home
drwxr-xr-x   2 root  wheel  hidden              64  8 Dec  2020 opt
drwxr-xr-x   6 root  wheel  sunlnk,hidden      192 25 Oct 15:39 private
drwxr-xr-x@ 65 root  wheel  restricted,hidden 2080 24 Aug 16:59 sbin
    com.apple.rootless     0 
lrwxr-xr-x@  1 root  wheel  restricted,hidden   11 24 Aug 16:59 tmp -> private/tmp
    com.apple.rootless     0 
drwxr-xr-x@ 11 root  wheel  restricted,hidden  352 24 Aug 16:59 usr
    com.apple.rootless     0 
lrwxr-xr-x@  1 root  wheel  restricted,hidden   11 24 Aug 16:59 var -> private/var
    com.apple.rootless     0 

Sample commandline:

zig run src/walk.zig 2>&1 -- / LLVM |less -S

Actual output:

> /Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
» /Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
> /System/Volumes/Data/Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
» /Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
> /System/Volumes/Data/Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
» /Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
> /System/Volumes/Data/Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
» /Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
> /System/Volumes/Data/Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
» /Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
> /Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
» /Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
> /Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
» /Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
> /Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
» /Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
done

Expected Behavior

Directories should only be listed at their true locations as is the case with symlinks, not also at the locations they're firmlinked to. I believe that would look like this:

> /Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
» /Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
> /Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
» /Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
> /Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
» /Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
> /Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
» /Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
done
hippietrail commented 1 year ago

A more-or-less equivalent Rust program shares the behaviour of Zig, but Swift does not. This makes me think the dir walker code in each language's libraries does in fact need to handle firmlinks specifically and that Swift must do do. This despite claims they should be transparent to existing code.

Here's my Swift code's output:

dirwalker / LLVM
******** / ********
Library/Frameworks/Xamarin.iOS.framework/Versions/15.10.0.5/LLVM
Users/hippietrail/.vscode-insiders/extensions/ms-vscode.cpptools-1.5.1/LLVM
Users/hippietrail/.vscode/extensions/ms-vscode.cpptools-1.12.4-darwin-arm64/LLVM
Applications/Xcode.app/Contents/Applications/Instruments.app/Contents/PlugIns/DTLLVMBinaryAnalysisPlugin.xrplugin
---- all done ----

And the Swift code itself. You can see there's no special code for firmlinks:

import Foundation
import AppKit

let fileManager = FileManager.default

let resKeys : [URLResourceKey] = [.isDirectoryKey, .fileSizeKey, .isSymbolicLinkKey]

let startURL: URL = URL(string: fileManager.currentDirectoryPath)!

guard CommandLine.arguments.count == 3 else {
    print("** usage: dirwalker path string")
    exit(1);
}

let pathArg = CommandLine.arguments[1]
let matchArg = CommandLine.arguments[2]

if let path = pathArg.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed) {
    if let url = URL(string: path) {
        let en = fileManager.enumerator(at: url,
                                        includingPropertiesForKeys: resKeys,
                                        options: [.producesRelativePathURLs],
                                        errorHandler: { (url, error) -> Bool in
            return true }
        )!

        mainloop: for case let fileURL as URL in en {
            do {
                let rv = try fileURL.resourceValues(forKeys: Set(resKeys))
                if let d = rv.isDirectory, d {

                    let filename: String = fileURL.lastPathComponent;

                    if filename.contains(matchArg) {
                        print(fileURL.relativePath)
                    }
                }
            } catch {
                print("** error 2:", error)
            }
        }
    }
}

print("done")
xEgoist commented 1 year ago

I did research this issue a while back, but here are my findings/notes in case it might help someone who wants to implement it or comment their opinion on it:

What's interesting is that MacOS's /bin/realpath does not resolve firmlinks.


Here are way that I think can be used to solve this is:

I am not sure what's the best approach here as there aren't many system firmlinks and it's uncommon to synthesize (if that's even possible).


Note: I could be missing some other info, but this is what I have found.