cc-tweaked / CC-Tweaked

Just another ComputerCraft fork
https://tweaked.cc
925 stars 212 forks source link

Absolute and relative paths #1797

Open PrincessRTFM opened 6 months ago

PrincessRTFM commented 6 months ago

I'm going to be honest: path handling in CC is kind of atrocious right now.

The fs and io APIs use absolute paths, the shell API handles relative ones, and only functions that are specifically built to handle relative paths differentiate between absolute-style with a leading / and relative-style without. Any functions that don't simply assume that all paths are absolute. If you call shell.resolve(path) from outside the root, it returns what looks for all the world like a relative path, and if you don't keep track of which paths you've resolved and which you haven't, you'll easily end up mangling them. Right now, unless you know how a specific program handles its paths, it's just not safe to call programs from anywhere but the root, since that's the only place where it doesn't matter either way.

I've been writing a custom path handling API, on top of which I've been further building a custom fs API that specifically recognises absolute paths as beginning with a /, but this is really just a personal band-aid solution for a fundamental problem with CC itself: inconsistent path handling makes it easy for things to break.

I know it would be a fairly significant task, but I'm officially submitting a request that path handling be reworked for consistency, making all APIs support both relative and absolute paths, differentiating based on the presence or absence of a leading /.

I'm aware that, if handled incorrectly, this could break existing (poorly-written) programs that rely on fs and io treating even relative-looking paths as absolute. To that end, perhaps a new pair of APIs could be implemented as a sort of compatibility layer, such as absfs and absio. On the other hand, since running programs from the root makes it irrelevant and that's how most users run their programs, it may not be enough of a break to actually matter.

Wojbie commented 6 months ago

Question for this suggestion. For fs and io apis.. relative path would be relative to.. what? Cause current directory and stuff are shell level concepts and not really applicable to filesystem level.

PrincessRTFM commented 6 months ago

Which is, honestly, also not a great design choice. If a program is running, it has a current directory. That's an attribute of live processes. The filesystem doesn't exist in a vacuum. A running program's active directory shouldn't be limited to a single API. That's most of my problem, actually - it currently is, but that doesn't make sense.

Wojbie commented 6 months ago

What about program executed by different program from rednet message? Or program executed by pastebin run or wget run ? Those don't have a directory they exist in and would basically have relative paths of their parent programs? [That is already currently a issue with shell.resolve(path) and those types of programs]

PrincessRTFM commented 6 months ago

They do, or at least they should. When you run pastebin run or wget run, its current directory is the directory you run it from. It then runs something else, which should inherit the current directory. This is how basically every modern shell works. Everything has a working directory. Any child processes they launch inherit the same working directory. Child processes can't change the working directory of their parents (which is possible in CC, and shouldn't be for the same reason it isn't possible in real shells).

SquidDev commented 6 months ago

Yeah, the way paths are handled within CC has been a long-standing complaint of mine. It's confusing, and very easy to get wrong (indeed, even CC screws this up). Unfortunately, it's not obvious to me how best to fix this.

I think there's two separate issues here:

For fs and io apis, relative path would be relative to what? Cause current directory and stuff are shell level concepts and not really applicable to filesystem level.

This is definitely workaroundable. You can do something similar to OC, and have some hidden global state that stores the current working directory, and set/restore that whenever a program is resumed/yielded.

hugeblank commented 6 months ago

However, I'm not sure we could change fs.combine/fs.getDir without causing issues. I've definitely written code like while path ~= "" do f(path); path = fs.getDir(path) end in the past, which would no longer work if fs.getDir("/foo") returns "/".

I feel like this is just a symptom of the problem rather than a reason not to fix it. Take a gander at code that uses fs.getDir. There seems to be a lot more instances of brute forcing the path to be absolute path = '/'..fs.getDir(...) than code that expects an empty string as root fs.getDir(...) == "". Three I found back to back: image 1 instance that anticipates either "/" or "" (good job squid) 2 instances that brute force the path to root. (the middle one does so earlier, trust)

PrincessRTFM commented 6 months ago

Very few programs bother to put / at the front of paths, so this will break a lot of code.

Only if that code is executing from outside of the filesystem root, to be fair. For all programs, running from / itself means that absolute and relative paths will point to the same place. And I think that the consistency and sanity of relative-looking paths all actually being relative is worth the risk that running old programs from outside / might have issues. I doubt it's overly common in the first place in the first place, at least for non-core programs.

The issues with fs.combine and fs.getDir are bigger problems, because I wouldn't be surprised if that's not an uncommon thing people have done, and that would break no matter where the code is running from. I don't think there's a way to fix it and maintain backwards-compatibility, at least not without some really unfortunate kludge to fs.combine... fs.getDir could be given another argument with a boolean indicating whether you want it to return "/" instead of "", but the only solution I can think of for fs.combine is that if the first argument is "/" then it should return a "really" absolute path. Even that much could be an issue for some programs, not to mention being a bit of a hack.

On the other hand, a new API with the same methods that do operate with properly formed paths would mean a whole second place that needs to be kept updated alongside the first, even if it wouldn't require copying all the code. My existing project defines a path API for handling of string paths, and then a replacement fs API that programmatically wraps lists of methods from the underlying core fs API to just process path arguments before passing them through, but if any new methods are added to fs then I'd need to update it appropriately:

-- Most wrapped functions only need to resolve a single given path
-- and then pass it directly to the underlying fs method
for _, func in ipairs(singlePathArgumentMethods) do
    fslib[func] = function(name, ...) return fs[func](path.cwd(name), ...) end
end

It may be time to decide how important backwards-compatibility is for old CC programs.

AnrDaemon commented 2 months ago

Sorry if I step on somebody's toes, I just found this topic.

A complete solution should include quite a few moments, if you ask me. When shell starts a program, it should pass to it a clone of its configured FS object as well as a clone of its configured STDIO object. These objects should inside themselves contain the copies of references to current working directory (FS) and IO streams, if any (STDIO). The program then could change them as it see fit, but these changes wont affect the parent objects. Once that's done, you get the rest for free.