SUPERCILEX / fuc

Modern, performance focused unix commands
Apache License 2.0
340 stars 8 forks source link

[cpz] Symbolic Links #13

Closed Elvyria closed 1 year ago

Elvyria commented 1 year ago

While copying symbolic link cpz copies the whole directory.

cp (GNU coreutils) 8.32 by default if symlink points to a: Directory - will not follow symbolic link, scream that it needs an -r flag and copy it as is File - will follow symbolic link and copy file itself

Default behavior probably requires the same treatment as https://github.com/SUPERCILEX/fuc/issues/11. Not a critical issue, but default behavior is different and probably not expected.

Version: 1.1.2

SUPERCILEX commented 1 year ago

Thanks for the investigation!

Directory - will not follow symbolic link, scream that it needs an -r flag and copy it as is

BTW, this has nothing to do with symbolic links. cp just follows it, see it's a directory, and then goes, "I don't copy things recursively without the -r flag."


Aside from the deletion goof, the rest of symlink interactions are actually WAI. Here's the idea: symlinks should be handled transparently. That is, you shouldn't need to know whether or not something is a symlink.

For deletion, this means you always delete the symlink itself since if the user thought it was a regular file, that's what they'd expect.

For copying, the story becomes a little more nuanced. When you say "copy a to b," I think by the same logic we should copy what the symlink points to (because that's what you'd expect if you didn't know it was a symlink). This part mirrors cp's behavior. Where it gets interesting is when you encounter a symlink inside "a". Then, copying the symlink itself makes more sense because you're trying to make an exact mirror of a directory and if "a" had symlinks within it then "b" should too. Also, if you try to chase the symlinks then you run into some madness inducing edge cases:

So the TL;DR is that I think the current behavior actually makes sense.

Elvyria commented 1 year ago

Maybe my workflow is just different, but when i copy things I'm usually well aware of what they are, if it's a file, link or a directory. If we take Windows, just as a side example, they have desktop icons, which, from my understanding, are just fancy symlinks, and if user were to copy one of them, i don't think the expectation would be to copy what's underneath. And as you said What happens when the symlink points to something that doesn't exist? Fail the whole copy? for this exact reason i actually don't like what cp does by default for files (which i forgot), when i see something and try to copy it, in my eyes, the only thing that should stay in a way of my copy is IO error (or if i explicitly told it to follow link and it failed which probably falls under IO error anyway).

But that's all just my opinion and of course I'm not enforcing it. How your software behaves by default can be whatever you think best. The bottom line is that it would be really nice to have an option to copy things as i see them (literally).

Aside from that.

Recursion on itself, after a while, fails with: cannot stat 'rec/rec1/rec/rec1...': File name too long Co-recursion fails with: cannot stat 'rec/rec1': Too many levels of symbolic links

SUPERCILEX commented 1 year ago

Hmmm, yeah what you're saying does sound reasonable, but I'm still a little confused. My main question about copying top-level symlinks: how do you prevent bogus links? If you have a relative link (like foo/bar or ../blah), copying the symlink changes its meaning which seems odd to me. At that point, why not explicitly make a new link? Granted, copying symlinks in a directory has the same problem if the symlink jumps out of the directory you're copying, but I feel like that's rarer.

One benefit to going with your approach is that everything becomes consistent (always operate on the symlink, not what it's pointing to). Still, I'm bothered by what I pointed out above.

Elvyria commented 1 year ago

You're right, copying symlink doesn't always make sense. If symlink is relative and I'm aware of that, then there's not much situations when it's reasonable to copy it, but symlinks that i do want to copy are usually not relative, but absolute and so it just saves me time copying path with mouse.

Consistency in behavior is probably the main advantage for me. I see symlinks simply as shortcuts to a certain path, the link should be resolved while something is trying to read and write data to it, but unlike reflinks or hardlinks they don't represent the data underneath by themselves. They can point to a location that tomorrow represents something else than today or location that doesn't even exist at the current moment, but might in a couple user interactions later.

So the explicit flag to copy data underneath instead of a link itself is more preferable to me, but it's just me.

By the way, found an inconsistent behavior in cpz while copying directory:

SUPERCILEX commented 1 year ago

If current directory is a symlink (~/downloads -> /mnt/.../Downloads) it follow all file symlinks under the copied directory and copy respectable files, it will try to follow all directory symlinks and fail with Is a directory (os error 21)

Are you on macOS? I fixed that after yesterday's emergency release: https://github.com/SUPERCILEX/fuc/commit/c9705840f05092627e2441182665de9bf330dc2e.


I'm convinced. :) I'll make it so we never chase symlinks.

Elvyria commented 1 year ago

No, I'm on Void Linux, with glibc 2.36 and symlink points from btrfs to ntfs3 (~/downloads -> /mnt/.../Downloads) mounted filesystem, building from source doesn't resolve the issue for me, considering that i repeated the experiment correctly.

I'm pretty sure that everything build related is up to date, if you can't reproduce and need more info, feel free to ask.

SUPERCILEX commented 1 year ago

Damn, already lol? Someone on Reddit pointed this out, but it was all theory and I didn't believe anyone would run into this in the real world. Anyway, I just need to handle d_type unknown correctly. (Your issue has nothing to do with symlinks, it's ntfs3 not supporting d_type so if you try to copy any nested directories it'll fail.) Will fix too. 👍

Elvyria commented 1 year ago

Good thing i provided you with filesystem types then, that's a bit unexpected. Copying directories with regular files or nested directories doesn't cause any errors and seems to work fine (with both 1.1.2 and c9705840f05092627e2441182665de9bf330dc2e), not sure if this is a useful information. Thanks for your hard work.

SUPERCILEX commented 1 year ago

Ok, everything should be working now as long as I didn't completely misunderstand the issue. :sweat_smile: If you can double check, that'd be very much appreciated. :)

Elvyria commented 1 year ago

It does something weird. I have link L1 that points to A1 (L1 -> A1), when i copy - cpz L1 L2, instead of creating L2 -> A1 it creates L2 -> L1.

std::os::unix::fs::symlink(&from, &to) if copying symlink is not a thing, reading location from from -> path and replacing from with path is the way to go. Which should result in a to -> path

Not sure if 15fa3a7c3675717d5d17251b1d6afec8c6871715 is WIP or not, but it currently doesn't solve behavior with NTFS3, but i checked just in case, you never said that it's solved.

  • If current directory is a symlink (~/downloads -> /mnt/.../Downloads) it follow all file symlinks under the copied directory and copy respectable files, it will try to follow all directory symlinks and fail with Is a directory (os error 21)
  • If current directory is not a symlink (~) it will copy directory hierarchy as is.

Version: 15fa3a7c3675717d5d17251b1d6afec8c6871715

SUPERCILEX commented 1 year ago

I have link L1 that points to A1 (L1 -> A1), when i copy - cpz L1 L2, instead of creating L2 -> A1 it creates L2 -> L1.

Oops, yeah I tired brained that one, thanks for the catch! https://github.com/SUPERCILEX/fuc/commit/76ce56c2e6cee5d3acc200a87bf58d6417113527

it currently doesn't solve behavior with NTFS3

Well shucks, then I'm probably wrong and it has nothing to do with d_type. Could you share the error stack trace so I can see which syscall failed (should be the last line)? I'm thoroughly confused because I don't see how you could get Is a directory when all we're doing is copying the symlink.

SUPERCILEX commented 1 year ago

Could you share the error stack trace so I can see which syscall failed

Oh wait I just remembered that's not a thing in this project. If the bug is still there (can you make sure you're using the built-from-source version?), I'll put up a patch that debug prints each error.

Elvyria commented 1 year ago

I was able to extract some info:

Directory structure that i'm trying to copy (top is not a symlink):

testdir (DIR) ├── A1 (DIR) │ └── F1 (FILE) ├── B1 -> A1(LINK) ├── man (FILE) └── woman -> man (LINK)

Error (lines don't match repository, useless anyway):

Error: An I/O error occurred ├╴ cpz/src/main.rs:67:18 │ ╰─▶ Is a directory (os error 21) ├╴ cpz/src/main.rs:65:17 ╰╴ Failed to copy file: "testdir/B1"

Backtrace (useful part, lines should match):

3: fuc_engine::ops::linux::<impl fuc_engine::ops::IoErr<core::result::Result<T,fuc_engine::Error>> for core::result::Result<T,rustix::backend::io::errno::Errno>>::map_io_err at .../fuc/fuc_engine/src/ops/mod.rs:44:13

4: fuc_engine::ops::copy::compat::copy_regular_file at .../fuc/fuc_engine/src/ops/copy.rs:311:22

5: fuc_engine::ops::copy::compat::copy_one_file at .../fuc/fuc_engine/src/ops/copy.rs:291:38

6: fuc_engine::ops::copy::compat::copy_dir at .../fuc/fuc_engine/src/ops/copy.rs:250:17

7: fuc_engine::ops::copy::compat::root_worker_thread::{{closure}} at .../fuc/fuc_engine/src/ops/copy.rs:187:21

So it tries to copy directory symlink as a file, fails, thinking it's actually a directory, but it's actually a symlink and so that's it. Works fine on btrfs, fails on ntfs3, $PWD is probably irrelevant from testing.

Version: 76ce56c2e6cee5d3acc200a87bf58d6417113527

SUPERCILEX commented 1 year ago

That is so weird. Maybe it's lying about the d_type? Can you cargo install rust-script, stick this in a file (test.rs):

//! ```cargo
//! [dependencies]
//! rustix = { version = "*", features = ["all-apis"] }
//! ```

use rustix::fs::{cwd, Mode, OFlags, openat, RawDir};

fn main() {
    let path = std::env::args_os().nth(1).unwrap();
    let fd = openat(cwd(), path, OFlags::RDONLY | OFlags::DIRECTORY, Mode::empty()).unwrap();

    let mut buf = Vec::with_capacity(8192);
    let mut iter = RawDir::new(fd, buf.spare_capacity_mut());
    while let Some(entry) = iter.next() {
        let entry = entry.unwrap();
        dbg!(&entry);
    }
}

And then run it with rust-script test.rs testdir?

This is what it prints out for me:

[test.rs:16] &entry = RawDirEntry {
    file_name: ".",
    file_type: Directory,
    ino: 45938801,
    next_entry_cookie: 1,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "..",
    file_type: Directory,
    ino: 45938747,
    next_entry_cookie: 2,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "test.rs",
    file_type: RegularFile,
    ino: 45938802,
    next_entry_cookie: 295941449,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "man",
    file_type: RegularFile,
    ino: 45940241,
    next_entry_cookie: 391119687,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "A1",
    file_type: Directory,
    ino: 45940240,
    next_entry_cookie: 395974485,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "woman",
    file_type: Symlink,
    ino: 45940243,
    next_entry_cookie: 397364480,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "B1",
    file_type: Symlink,
    ino: 45940244,
    next_entry_cookie: 407482366,
}
Elvyria commented 1 year ago

Yup, it's lying on ntfs3, but not on btrfs

[test.rs:16] &entry = RawDirEntry {
    file_name: ".",
    file_type: Directory,
    ino: 1397903,
    next_entry_cookie: 1,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "..",
    file_type: Directory,
    ino: 635,
    next_entry_cookie: 16,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "A1",
    file_type: Directory,
    ino: 1562876,
    next_entry_cookie: 104,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "B1",
    file_type: RegularFile,
    ino: 1682749,
    next_entry_cookie: 192,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "man",
    file_type: RegularFile,
    ino: 1705690,
    next_entry_cookie: 280,
}
[test.rs:16] &entry = RawDirEntry {
    file_name: "woman",
    file_type: RegularFile,
    ino: 1749046,
    next_entry_cookie: 1024,
}
SUPERCILEX commented 1 year ago

Ah, ok one last ditch effort:

//! ```cargo
//! [dependencies]
//! rustix = { version = "*", features = ["all-apis"] }
//! ```

use rustix::fs::{cwd, Mode, OFlags, openat, RawDir};

fn main() {
    let path = std::env::args_os().nth(1).unwrap();
    let fd = openat(cwd(), path, OFlags::RDONLY | OFlags::DIRECTORY | OFlags::NOFOLLOW, Mode::empty()).unwrap();

    let mut buf = Vec::with_capacity(8192);
    let mut iter = RawDir::new(fd, buf.spare_capacity_mut());
    while let Some(entry) = iter.next() {
        let entry = entry.unwrap();
        dbg!(&entry);
    }
}

I added OFlags::NOFOLLOW. That's not what it means (If the trailing component (i.e., basename) of pathname is a symbolic link, then the open fails, with the error ELOOP), but maybe ntfs3 will use it?

Elvyria commented 1 year ago

Same story. (cp as well as lf are capable of copying this directory)

Kernel: 6.1.4

SUPERCILEX commented 1 year ago

Ok, should be fixed.

SUPERCILEX commented 1 year ago

And can you double check that the copied symlinks were created correctly too?

Elvyria commented 1 year ago

Copying symlinks works fine. ntfs3 fix didn't help, but (and i probably should've provided this info long ago), directory structure that i get as result looks like this:

.
├── A1 (DIR)
│  └── F1 (FILE)
└── B1 -> ? (LINK)

And it's the same for both 933a4502626197e160a2ebc58b97afd89a369b71 and 7521a0e114e7133117e171715c503bb72bd2c83d, could it be an upstream issue?

SUPERCILEX commented 1 year ago

Alright, yeah I'm going to say ntfs3 is fundamentally broken and give up. A bit of googling shows this isn't just us: https://gitlab.gnome.org/GNOME/nautilus/-/issues/2020