9001 / copyparty

Portable file server with accelerated resumable uploads, dedup, WebDAV, FTP, TFTP, zeroconf, media indexer, thumbnails++ all in one file, no deps
MIT License
571 stars 34 forks source link

Considerations for dirkeys (like filekeys but for directories...) #64

Closed Gremious closed 6 months ago

Gremious commented 10 months ago

Very simple thought, though it might be too much of a pain to implement idk:

I can share https:/myserver.com/photos/friends-only-album/image.jpg?k=1234, where https:/myserver.com/photos/friends-only-album/ goes to a 403 because there's only G permissions set for *.

it would be cool to be able to do https:/myserver.com/photos/friends-only-album/?k=1234 and share a whole folder

Then if a random person goes to https:/myserver.com/photos/friends-only-album/ it will 403, but I can selectively share it with trusted people.

Thoughts:

  1. I imagine if you share a folder, people should be able to open and view the files there - which will probably reveal the filekeys and let said people share those. This is probably fine cause if you're sharing something to anyone, you're supposed to trust them anyway.
  2. It will probably do the same for subdirectories, so ig be careful to not put sensitive file nested deep on some global shareable directory.
  3. If a new file is uploaded there, I guess the key should regenerate? But then that's a pain if it's a folder that is expected to have updates, so probably optional.
9001 commented 10 months ago

Hmm, neat idea! But a bit tricky yeah :)

it could work, but there would be some limitations (which might actually be an advantage) --

I'll probably get this added some time around newyears so that gives us time to think about it :+1:

Gremious commented 10 months ago

the right dirkey, folders would be viewable in the browser, almost exactly as if you have full read-access

makes sense

however, downloading folders as zip-files using a dirkey will not be possible; this will just be too tricky to get right...

Where does the difficulty come from here? I kind of assumed if you can reach and press the zip button, it would just download your selection.

the ability to see dirkeys of subfolders could be configurable, and possibly default-disabled

Does this mean that while you can see the folders, you potentially cannot access them?

since that would be too computationally expensive for the server

Ah, is it because if you e.g. upload 50 files into a folder, it would re-generate a dirkey every single upload? Or is there something else to it?

9001 commented 10 months ago

Does this mean that while you can see the folders, you potentially cannot access them?

That was the idea, but now that you're asking, I guess it would make more sense to just hide all subfolders unless you enable the feature to also allow allow access to subfolders.

Where does the difficulty come from here? I kind of assumed if you can reach and press the zip button, it would just download your selection.

It's thanks to how things are implemented -- when accessing a folder through the web-interface, the browser sends off a request to the webserver component, which checks if the user is allowed access that specific folder (either thanks to account permissions, or by filekey / dirkey). However, when downloading a selection or folder as a zip-file, the webserver instead tells the VFS (the component that's in charge of volumes and permissions) to scan through all subfolders and return a list of all the files it finds, and the VFS keeps making sure that the user has access to view each subfolder it enters.

To add support for dirkeys in there, the webserver would have to tell the VFS that "you're free to skip permission checks below this URL, as long as you don't cross over to another volume!" , which would be a tiny bit of a performance drag, but mainly i'm worried about the opportunity it gives for new and exciting bugs :no_mouth:

Ah, is it because if you e.g. upload 50 files into a folder, it would re-generate a dirkey every single upload? Or is there something else to it?

in a way it's even worse than that :p

There's two ways to handle filekeys; the obvious one would be to store the keys in the database, and then read them from there whenever necessary, but it turned out it was both easier and faster to just regenerate the keys every time they were necessary. Easier, because there's no need to access the database when retrieving files (which can get blocked by a busy upload on a slow HDD), and especially when receiving uploads (all the components which are able to add files to the filesystem/database would also have to update the filekeys database accordingly) -- and it just made it easier to avoid any accidental performance impact for anyone who don't want to use the feature. And faster, because the filekey calculation is simple enough that scanning through the database was actually slower. But that said, it's already using the database for music tags (when that's enabled), so it's not super-bad -- it would, however, have the unfortunate consequence that a file might suddenly become impossible to download for a while, if the database if busy processing uploads. Right now, it just skips listing the music tags in the folder if that happens.

For dirkeys, assuming it would be done with the same approach as filekeys, it would have to generate dirkeys every time someone visits a folder, and (probably) do that by listing the contents of each subfolder to generate the key, which is going to be fairly taxing on both the HDD and the CPU -- unless it would be safe to do key calculation just based on the folder's last-modified timestamp or something, but I'm not sure whether that is going to be secure enough... And by putting dirkeys in the database, I'm guessing each upload/move/delete would be almost twice as slow. But that's just a hunch!

Gremious commented 10 months ago

mainly i'm worried about the opportunity it gives for new and exciting bugs 😶

Hmm, I don't necessarily see anything new that sounds easy to break, but I am entirely ignorant of what the codebase is like, so honestly I trust your judgement here.

"you're free to skip permission checks below this URL, as long as you don't cross over to another volume!"

Cause yeah, to me it sounds samey? Like it will scan through all subfolders and return the same list of all the files it finds, and the VFS can continue re-checking access as normal, just by basically asking if there's a also a dirkey for this directory first, rather than filekey or w/e. But it's probably not that simple...

Actually, also, you said specificaally "Downloading folders as zip-files using a dirkey will not be possible" - is regular selection/file-by-file fine? Or is it all scuffed?


(probably) do that by listing the contents of each subfolder to generate the key

Question, by default, filekeys are generated based on salt + filesystem-path + file-size + inode (if not windows) - what is used on Windows? Same except no inode?

unless it would be safe to do key calculation just based on the folder's last-modified timestamp or something, but I'm not sure whether that is going to be secure enough..

If it was just time (+ path + size) , then it's probably easy to break, especially since uhh

image

copyparty just kinda shows you that information lol. All you'd need to do is is spin up copyparty yourself and make a volume with the same path holding a txt file that size and modification date.

But, with the addition of salt + inode + and if you really want to, some other more-unique identifier that is unknowable to users unless you were genuinely breached (Device number? Machine id? Partition GUID? All of the above?) - I think it's pretty reasonable to use.

(And if some user doesn't trust that still, idk, don't use dirkeys, make ppl accounts or send them files directly via matrix or something. Or maybe find something more security focused than copyparty, I don't think we make super hard security guarantees, it's just nice file host).

9001 commented 9 months ago

sorry, this one isn't working out -- the javascript is proving to be too much of a challenge... there's just too many assumptions that just fall apart once you start passing around dirkeys, since folders never had any additional parameters until now.

I'm fairly confident that everything serverside went alright enough however; the only issue there is the approach to getting thumbnails to work, which is kinda iffy.

on the bright side, seeing all of the changes so far, it's probably a good thing it had to be abandoned -- would have made it harder to maintain stuff down the road.

going to stash the unfinished stuff in another branch just in case!

Question, by default, filekeys are generated based on salt + filesystem-path + file-size + inode (if not windows) - what is used on Windows? Same except no inode?

yup, so the biggest defense there is the salt, which should be alright.

I don't think we make super hard security guarantees, it's just a nice file host).

granted, it's probably missing some of the best-practices regarding general web-app security, but everything should be fully secure regarding unintended access and such... or at least that's the intention :>

9001 commented 6 months ago

a couple more people have been asking for this as well, so gonna give it another shot --

should work for the most part as of 32c912bb16452e7c65e61c5d11b50db51a0aa4b8, so there's a beta build available here: https://ocv.me/copyparty/beta/copyparty-sfx.py

EDIT: also on dockerhub; copyparty/ac:dk

there might be some bugs still, but that should all be on the client-side, so just javascript exceptions and the UI breaking, stuff like that -- let me know if you hit anything unexpected :+1: