libsdl-org / SDL

Simple Directmedia Layer
https://libsdl.org
zlib License
9.99k stars 1.84k forks source link

SDL3 filesystem API #8129

Closed icculus closed 5 months ago

icculus commented 1 year ago

People have asked for this for years, and it's probably worth adding it.

The problem

Windows. Windows is the problem.

The solution

We could do this as complicated as we want or just make a simple wrapper over mkdir, opendir, etc.

Relatively simple:

typedef enum SDL_StatPathType
{
    SDL_STATPATHTYPE_REGULAR, /**< a normal file */
    SDL_STATPATHTYPE_DIRECTORY, /**< a directory */
    SDL_STATPATHTYPE_SYMLINK, /**< a symlink */
    SDL_STATPATHTYPE_OTHER /**< something completely different like a device */
} SDL_StatPathType;

typedef struct SDL_Stat
{
    Sint64 filesize;
    Sint64 modtime;
    Sint64 createtime;
    Sint64 accesstime;
    SDL_StatPathType filetype;
} SDL_Stat;

typedef struct SDL_DirectoryHandle SDL_DirectoryHandle;  // opaque

extern DECLSPEC const char * SDLCALL SDL_GetFilePathSeparator(void);  // "\\" or "/" 
extern DECLSPEC void SDLCALL SDL_FileTimeToWindows(Sint64 ftime, Uint32 *low, Uint32 *high);
extern DECLSPEC Sint64 SDLCALL SDL_FileTimeToUnix(Sint64 ftime);
extern DECLSPEC int SDLCALL SDL_StatPath(const char *path, SDL_Stat *statdata);
extern DECLSPEC int SDLCALL SDL_CreateDirectory(const char *path);
extern DECLSPEC int SDLCALL SDL_RemoveDirectory(const char *path);
extern DECLSPEC int SDLCALL SDL_RenameFile(const char *oldpath, const char *newpath);
extern DECLSPEC int SDLCALL SDL_DeleteFile(const char *path);
extern DECLSPEC SDL_DirectoryHandle *SDL_OpenDirectory(const char *path);
extern DECLSPEC char *SDL_ReadDirectory(SDL_DirectoryHandle *dir);
extern DECLSPEC void SDL_CloseDirectory(SDL_DirectoryHandle *dir);

Even this has a problem: there are legit errors one will want to check for without parsing SDL_GetError(), like mkdir() failed because the directory already existed? Or unlink() failed because the file didn't exist? That's totally recoverable by just carrying on. So we'll want to return something other than -1 or 0 for these, but that can be a whole can of worms.

We don't want to get to the point where this is exposes the entire system API (like...file permissions, etc), but in terms of complexity, there might be some value in simplifying a few things (RenameFile making efforts to copy the file transparently if crossing a filesystem boundary, DeleteFile being smart enough to remove a directory, a function that will delete a whole tree instead of individual nodes, etc).

Do we want to mess with symlinks (making them and reading them)? I'm thinking no.

I don't necessarily think we should do an abstract interface like RWops ("look, I exposed my email inbox as a directory listing!"), but certainly PhysicsFS has made use of this idea to expose .zip files and other archives like filesystems, so I won't pretend the idea isn't useful, but just hiding the platform specifics and C runtime here might be a net positive and we call it a day without going wild.

Looking for feedback before I start on this in earnest.

icculus commented 8 months ago

The idea is that implementations can be built out for other things (like android data, zip files, whatever). I don't know if I'll have time to implement an Android thing in this first cut, but offering that in SDL isn't a bad idea!

flibitijibibo commented 8 months ago

Asset bundles might fall under something that can be implemented as higher-level title storage, so if not _FS it could maybe go into the _StorageContainer idea instead?

icculus commented 8 months ago

So just to ramble about operations a little more, here are all the commands you can implement for a FUSE filesystem (basically a userspace app that looks like a mounted Unix filesystem to the rest of the machine), which qualifies as a reasonably complete set of Things You Can Do With A Filesystem.

Some things we're probably NOT going to do:

This leaves almost nothing in the FUSE interface that our basic operations don't cover, except:

Of these...rename is probably the only important operation here.

That being said, we could do a lot of these, with the assumption that most implementations are going to simply return SDL_Unsupported(); But right now I think I'm just going to add rename and stop worrying. This keeps FSops easy to understand and use, and also simple to implement.

We could add a few "reserved" fields that current implementations are expected to set to NULL, for future expansion, but I'd rather we all agree this is the set of operations we are going with for SDL3.

ericoporto commented 8 months ago

Curious, will this API (SDL_FSops) be used elsewhere in SDL itself? Like, say I don't need it directly, can I disable? If I am building SDL from source. Say I don't want to use whatever there is for whatever platform, I would prefer to use my own implementation.

icculus commented 8 months ago

I think we can disable the subsystem completely still, or you can just use the "dummy" backend, which is like five return -1 functions. :)

slouken commented 8 months ago

Curious, will this API (SDL_FSops) be used elsewhere in SDL itself? Like, say I don't need it directly, can I disable? If I am building SDL from source. Say I don't want to use whatever there is for whatever platform, I would prefer to use my own implementation.

Yes, it will be used elsewhere in the SDL API and will be relied on. We've been moving away from making core subsystems disabled, and the filesystem API will be the same. However, in terms of code it's going to be tiny, so it won't have significant impact on the library size.

slouken commented 8 months ago

So just to ramble about operations a little more, here are all the commands you can implement for a FUSE filesystem (basically a userspace app that looks like a mounted Unix filesystem to the rest of the machine), which qualifies as a reasonably complete set of Things You Can Do With A Filesystem.

Some things we're probably NOT going to do:

  • extended attributes
  • symlinks (creation and readlink)
  • device nodes (mknod)
  • separate unlink and rmdir (we have them rolled into one method, like the POSIX "remove" function)
  • hard links
  • permissions (chmod and chown, access())
  • file i/o (this is handed off to SDL_RWops, for which the SDL_FSops can provide a custom implementation)
  • ioctl (strictly speaking this would be RWops anyhow)
  • file locking

I think of all of these, permissions and file locking are the most likely to be wanted in the future. There are a bunch of edge cases around umask not being set properly, and lots of code uses access() as a proxy for stat() + able to open the file. Checking executable permission, etc. File locking is also pretty common for things like exclusive application startup and so forth.

Can I also request directory and file watching? That's super useful in a bunch of contexts, including SDL_steam_virtual_gamepad.c

slouken commented 8 months ago

Making this replaceable is interesting, but where would you pass the interface pointer? Are we adding other APIs that require a filesystem interface? Are we allowing you to set the default filesystem interface?

ericoporto commented 8 months ago

Ahn, it can't be disabled and it can't be replaced? I am mostly afraid of being in a situation similar to this again : https://github.com/icculus/SDL_sound/issues/69

Essentially, if I have my own game package and SDL ends up requiring having some file on disk will I be able to have this file inside my own game package?

ell1e commented 8 months ago

At the risk of being a broken record, I agree with @ericoporto It really should be replaceable with a user-injected interface. If only for the case where somebody wants it to work with all Windows paths in case of malformed UTF16 with wrong UTF32 surrogates that Windows seems to allow, if that isn't supported.

slouken commented 8 months ago

Making this replaceable is interesting, but where would you pass the interface pointer? Are we adding other APIs that require a filesystem interface? Are we allowing you to set the default filesystem interface?

I guess my question is more, should we be doing something like this?

... SDL filesystem API functions ... SDL_FSops SDL_SetFileSystemOps(SDL_FSops ops); SDL_AddFileSystemOps(SDL_FSops ops) (chains existing operations so you can overlay pack files, etc. on top of real filesystem structure)

slouken commented 8 months ago

As a followup question, if we allow replacing filesystem operations, what do we do with udev enumeration and Steam controller description file watching? Do we leave those as native filesystem code always?

icculus commented 8 months ago

@slouken and I talked through some of the remaining design details and deficiencies, and here's a short list of how I'm moving forward.

icculus commented 8 months ago

Oh, and:

icculus commented 8 months ago

all Windows paths in case of malformed UTF16 with wrong UTF32 surrogates that Windows seems to allow, if that isn't supported.

I'll take a reasonable patch to support this after I finish getting the initial code in place, but it's not high enough priority for me to risk implementing this incorrectly (and should it go into the filesystem code? iconv? And what do we expect apps to do when we hand them a malformed UTF-8 string to represent a malformed UTF-16 string that is reported from FindNextFile()?).

icculus commented 8 months ago

Essentially, if I have my own game package and SDL ends up requiring having some file on disk will I be able to have this file inside my own game package?

I don't think it'll be an issue with what we're describing here. The entire problem in SDL_sound is that one supported file format needs external metadata and SDL_sound has no formal way to supply that data beyond an environment variable with a filename. If it had a way to accept it through the abstract interface of SDL_RWops, there'd be no problem.

icculus commented 8 months ago

(As for "can it be disabled and replaced?" ... it's an abstract interface, replace it with any implementation you like at runtime, or just use the system filesystem APIs directly without going through SDL. You aren't going to be forced into using it, afaict.)

slouken commented 8 months ago

As a followup question, if we allow replacing filesystem operations, what do we do with udev enumeration and Steam controller description file watching? Do we leave those as native filesystem code always?

We don't have global filesystem operations, so all the internal filesystem code will remain unchanged.

ell1e commented 8 months ago

And what do we expect apps to do when we hand them a malformed UTF-8 string to represent a malformed UTF-16 string that is reported from FindNextFile()?).

Python and such (Python to my knowledge uses these type of paths by default) expect 99% of apps not to care. Which happens to be true in practice, since it's not obviously malformed. If I remember right with how it works, you still get UTF8 code points that have the usual bit representation, they just happen to be UTF16 surrogate code points that you wouldn't ever have in valid UTF8 and you would only in valid UTF16 and only in valid pairs when for this surrogate encoding they end up being invalid pairs. All code I ever wrote that implements such surrogate encodings does the decode/encode in the functions that wrap the windows filesystem ones, I found the natural place to put this usually right before and right after where it goes into the Windows API.

These broken paths to my knowledge only usually end up as a problem for the apps that aren't aware of them as soon as the app tries to decode/encode to UTF16 or other variants on its own terms. If they just end up passing it back into the filesystem API and/or display it that usually works, since displaying might cause glitches but erroneous file names will cause that nevertheless so it's only a minor issue.

I hope some of this is helpful.

Edit: oh, and I guess if the app started passing these into the Windows FS API directly, bypassing SDL2, that would also cause problems. However, afaik the same would be true if in Python you obtained Win32 API poiners and started calling them, so that's an unavoidable side effect.

icculus commented 8 months ago
  • Cross-process file locking should probably be in FSops (even if it has to be voluntary because the OS doesn't enforce it).
  • File watching in FSops would be super-useful, because it's a mess of platform-specific code across several targets. Things that can't or won't support it can just return -1 (including some platforms through SDL_CreateFilesystem).

Starting to think both of these shouldn't be in FSops, but rather just SDL functions, as almost nothing will be able to implement them in any meaningful way, and the real value in both is the cross-platform abstraction.

slouken commented 8 months ago
  • Cross-process file locking should probably be in FSops (even if it has to be voluntary because the OS doesn't enforce it).
  • File watching in FSops would be super-useful, because it's a mess of platform-specific code across several targets. Things that can't or won't support it can just return -1 (including some platforms through SDL_CreateFilesystem).

Starting to think both of these shouldn't be in FSops, but rather just SDL functions, as almost nothing will be able to implement them in any meaningful way, and the real value in both is the cross-platform abstraction.

I agree.

Does it make sense to have the rest of the functions also have a non-FSops variant so people can just use them in the API without having to create an FSops object? In the normal use case, I just want to use SDL functions to access the filesystem and don't care about a custom FSops object or a base path. If we did that, we would be able to transition the internal filesystem code to use the SDL API, which would be nice for consistency.

icculus commented 8 months ago

I'm not against this; it's just a singleton FSops behind the scenes.

icculus commented 8 months ago

This is in now, in 7ac96e9a000af1756ee0dd5d8ed738ac7cd0e023.

This is actually better because it isn't a singleton behind the scenes. This skips all the path sanitizing stuff and just passes the exact paths to the platform-specific implementations. So you get cross-platform support without any surprises from helper code built on top of it.

icculus commented 8 months ago

I'm also going to rename SDL_CreateFilesystem to SDL_CreateContainedFilesystem or something, to avoid confusion.

slouken commented 8 months ago

Can we simplify the names? CreateDirectory, EnumerateDirectory, StatFile, etc. and then for the FSops version we can just prepend FS: FSCreateDirectory, FSEnumerateDirectory, etc.

Also, can you remind me again why we want the FSops? What functionality is SDL providing that someone can’t easily create in their own application (possibly wrapping around SDL’s native file system and their own pack file code?)

icculus commented 8 months ago

The benefits of having FSops is the same as RWops: a standard interface everyone agrees on to interchange functionality.

If someone wants to make a library that handles a packfile or whatever, they can just vend a FSops to the app.

The idea is we (or someone else) would eventually build out container things for consoles, or cloud services or whatever.

We can dump the built-in one, if having it isolate to a specific subtree isn't useful enough, but we should keep the FSops interface itself for external code to use.

icculus commented 8 months ago

Ugh, I dunno, now I'm looking at this again and I think I'm overcomplicating it. There's like 3 ways to do the same thing in slightly different ways, and it's not super-clear which is the one people should use.

Maybe we revisit this in 3.x if we want more, and just have these functions in 3.2.0, which solve an immediate problem (doing these necessary things in a cross platform way) ...

extern DECLSPEC int SDLCALL SDL_CreateDirectory(const char *path);
extern DECLSPEC int SDLCALL SDL_EnumerateDirectory(const char *path, SDL_EnumerateCallback cb, void *userdata);
extern DECLSPEC int SDLCALL SDL_RemoveFile(const char *path);
extern DECLSPEC int SDLCALL SDL_RenameFile(const char *oldpath, const char *newpath);
extern DECLSPEC int SDLCALL SDL_StatFile(const char *path, SDL_Stat *stat);
extern DECLSPEC void SDLCALL SDL_FileTimeToWindows(Sint64 ftime, Uint32 *low, Uint32 *high);
// plus file locking and file watching functions.

We can revive the rest from the snapshot at 7ac96e9a000af1756ee0dd5d8ed738ac7cd0e023 if we want it later.

madebr commented 8 months ago

Is https://github.com/libsdl-org/SDL/issues/9110 related to a filesystem api? When creating a directory with a relative path, it's always useful to know the working directory and/or useful to change it.

slouken commented 8 months ago

If we decide to add chdir() and getcwd() (and I'm on the fence about whether this is a good idea), we should add it to the filesystem calls above as SDL_GetCurrentWorkingDirectory() and SDL_SetCurrentWorkingDirectory().

However because the working directory varies between runs and runtime environments, and sometimes there's no access to the filesystem in a meaningful way at all (I'm looking at you, Apple TV) I kind of feel like this is a trap, and applications should always work with known paths, e.g. SDL_GetPrefPath().

icculus commented 8 months ago

However because the working directory varies between runs and runtime environments, and sometimes there's no access to the filesystem in a meaningful way at all (I'm looking at you, Apple TV) I kind of feel like this is a trap, and applications should always work with known paths, e.g. SDL_GetPrefPath().

I agree. Platforms that don't have a "current directory" in a meaningful way are going to have huge problems if games come to rely on being able to chdir to the installation or user-writable directory, or chdir for a bit into a mod's subdirectory, etc; that is difficult to clean out of an existing codebase when chdir() suddenly always returns -1. The option there is we emulate it, but then things that don't go through SDL won't work, so it just replaces one problem with another at that point.

flibitijibibo commented 8 months ago

Threw together #9227 with the hopes of adding real-world use cases for low-level filesystem APIs.

nightmareci commented 7 months ago

Will there be a means to get a guaranteed correct path separator string (e.g., SDL_GetFilePathSeparator()), so user code can construct guaranteed valid paths for functions like SDL_IOFromFile()? As of now, there's sort of a hacky way to try and do that, by checking the end of the base path or pref path strings, but that doesn't seem like a good solution. There's already some directory-related functions, so I think it's required to be able to have the current platform's path separator to be sure things will work as expected. Maybe I'm being overly pedantic though, if assuming \ for Windows and / for literally every other platform is always (not mostly!) correct; if such an assumption is exactly correct, it should probably be clearly documented somewhere (SDL_filesystem.h saying '/' on most other platforms isn't strong enough to be sure your code won't be broken on some platform).

But there's also just sweeping all that away, and always supporting at least one path separator, like /, with SDL internally converting it as necessary.

C++'s filesystem API does have the facilities to handle path separators correctly, I'd hope SDL could do the same.

flibitijibibo commented 7 months ago

Another possibility is ripping off Path.Combine which does the concatenation of paths without having to query for the path separator at all, the downside is that it's varargs so it may be a bit bulky to implement.

slouken commented 7 months ago

The SDL path separator is '/'

slouken commented 7 months ago

@icculus, I think this is done, right?

icculus commented 7 months ago

Will there be a means to get a guaranteed correct path separator string

We had a thing for this, but removed it because '/' works everywhere (including Windows!).

If we ever have to eat our words, we could add a thing at the same time as the new platform, or manage converting from '/' internally.

Windows will accept paths with either '\' or '/', including paths that use both in the same string.

I think this is done, right?

Unless @flibitijibibo has anything he wants to add, it's done. New backends for SDL_Storage can wait unless they are life-and-death (and things like, say, turning a .zip file into an SDL_Storage object, should probably always live in external libraries, even if we end up providing them, in the style of SDL_image).

The globbing stuff still hasn't landed, but that's unrelated to this issue and doesn't change SDL_Storage's interface, etc. There is also some half-finished work on file locking and file watching, but that stuff should only work on the real filesystem and won't be part of SDL_Storage, imho, if we add it at all.

flibitijibibo commented 7 months ago

Nothing else from my end! Just need to make sure all the additions work on console and FNA's filesystem needs are good to go.

TylerGlaiel commented 6 months ago

Random request if filesystem stuff is getting added into SDL, tracking when a file changes (for hot-reloading purposes) is something that requires platform-specific code (or weird hacks) to achieve normally, so some stuff in SDL to help with that would be pretty cool actually

SDL_WatchFileForChanges(const char *path);

and then some SDL_EVENT_FILECHANGED to get notified when the file is written

icculus commented 6 months ago

This is on the wishlist, but I got pulled onto other things before I finished it.

icculus commented 5 months ago

Nothing else from my end! Just need to make sure all the additions work on console and FNA's filesystem needs are good to go.

Want to close this and move further discussion back to #6644 so we don't have two competing issues here. Is this good to go on your end now, Ethan?

flibitijibibo commented 5 months ago

Yep, should be good to go - before freezing we should definitely make sure PlayStation works; Xbox and Switch should be fine as-is.