File security makes no sense, needs explaining.

mikeyc7m commented 6 years ago

Affected Version

SS 4.2

Description

The new process for file assets is confusing. The documentation about it is overly technical and makes no sense anyway. Someone needs to review the process and update the interfaces and documentation to better explain it.

What we already know is that a page is protected as a "draft" version until it is "published" by adding it to the "live" stage. Even when published, it has permission controls on who can view & edit it. Sweet as. But... Now we have files that can also be in draft and live. So far sounds good eh! But then we have this additional "Publish" capability that removes permission checks and dumps the file in a public folder. Whaaat...? When I'm in the CMS and I browse the file assets, there is a button to "Publish" the file. But what does it do? Does it add to live stage - or does it move to the public folder?

Steps to Reproduce

What do each of these mean for files? writeToStage("Stage") writeToStage("Live") publishFile() protectFile() grantFile() revokeFile()

I have two scenarios, please explain which functions I should use:

PDF files that belong to specific users which should only be accessed by them when they are logged in.
PDF files that belong on the website for anyone to read.

Also, can both scenarios be achieved using only the CMS interface? If so, it needs better labelling.

tractorcow commented 6 years ago

Ah right, yes it is a lot of complex code, but the business logic should be reasonably easy to explain.

If an anonymous user canView() = true, then the file should be public. Otherwise, it should be protected.

This logic is handled by the AssetControlExtension, which does the checking on write, regardless of which stage is being written. What is important is the visibility (canView) of the record after each of the above methods are invoked.

grantFile() just lets the current user read the file, even if protected. There's no real need for revokeFile() TBH.

publish/protectFile() should not be called by user code, but is intended to be invoked by the AssetControlExtension.

In your above example, the only thing you should do is implement a canView() that implements the rules you've described above.

Don't forget that you need to publish files if they are public too; CanView = true won't override the ORM hiding the record on live mode. :D

mikeyc7m commented 6 years ago

um ok... so...

"Publish" still means moving from stage to live.
The function publishFile() is badly named, it should be exposeFile() or unprotectFile() or publicFile() or something less confusing.
Exposing public files is handled... by magic? I still don't get it. Does it dump a file in the public folder when someone is allowed to view it, and then delete the file when they log out? What if someone else tries to see it before it is deleted? I thought maybe it was for assets that had no need for any security?

tractorcow commented 6 years ago

"Publish" still means moving from stage to live.

Yes.

The function publishFile() is badly named, it should be exposeFile() or unprotectFile() or publicFile() or something less confusing.

Absolutely. It should be "make public" or "make protected". A published file can still be protected if it has extra canView() logic above and beyond versioned staging.

Exposing public files is handled... by magic?

The logic is based NOT on individual users, but rather a single "anonymous" user. It is not user specific, it's file specific.

The critical line is in AssetControlExtension::getRecordState()

// Check if canView permits anonymous viewers
return Member::actAs(null, function () use ($record) {
    return $record->canView()
        ? AssetManipulationList::STATE_PUBLIC
        : AssetManipulationList::STATE_PROTECTED;
});

This temporarily logs out any current user, and checks if canView() is true.

If the file is protected, then it will require a session value to be set in order for the current user to view it. That session value is triggered by getURL() on the given file, which invokes grantFile().

That 'magically' permits per-session access to those protected files.

tractorcow commented 6 years ago

actually, @mikeyc7m check this out. This is the original devlist discussion back from when we were brainstorming this feature.

https://groups.google.com/forum/#!searchin/silverstripe-dev/assets|sort:date/silverstripe-dev/PihYhP3ZTN8/BiGDMsfuBAAJ

Maybe younger Damian can explain it better to you than old Damian.

maxime-rainville commented 6 years ago

We got a job coming up to rework how we import files https://github.com/silverstripe/silverstripe-assets/issues/175

This might be a good opportunity to update the doc about how we manage file at the same time.

chillu commented 5 years ago

Yeah, I've been involved in the discussion over the years, and still feel like I only understand about half of what's going on - we've created quite a structure there. These two docs need to improve, and become more use-case and example focused:

https://docs.silverstripe.org/en/4/developer_guides/files/file_migration/ https://docs.silverstripe.org/en/4/developer_guides/files/file_security/

Developers can limit canView() on assets via code. Authors can limit view/edit permissions via the CMS UI. Authors can publish files explicitly in the Asset UI. Authors can publish files implicitly by making them part of a cascading publish. Publishing a file no longer enforces any protections, even though both the code and view/edit permissions in the CMS UI imply that they would. That's a problem, right? I know why we can't enforce those permissions on published files (performance), but we've basically created multiple footguns here.

chillu commented 5 years ago

@clarkepaul Can you have a think on the UX side of this?

~UX Problem A: We allow locking down permissions on a published file (access control has no effect)~ ~UX Problem B: We allow locking down permissions on a folder, but don't warn when it contains already published files (access control has no effect)~ ~UX Problem C: We allow publishing a file with existing access control directly (through asset admin)~ UX Problem D: We allow embedding an access controlled file into other content without declaring it as such ~UX Problem E: We allow publishing a file with existing access control indirectly (through cascading publish)~ ~UX Problem F: When custom code-level access control is implemented, there's no way for developers to prevent authors from publishing files (access control has no effect)~

Update July 2019: OK, misunderstood this, a file can be both published and protected. I've since improved the explanation on the docs. The UX Problem D is an issue, captured in https://github.com/silverstripe/silverstripe-assets/issues/220

It makes me tired just thinking how much effort will go into discussing and implementing fixes for this :(

clarkepaul commented 5 years ago

@newleeland I've added this to our backlog for you to look into. Cheers

chillu commented 5 years ago

Discussed this a bit more with @newleeland. Maybe we should start with some use cases:

Use Case 1: Authors upload file drafts to preview them alongside content drafts, before they publish both at the same time. Only other authors should have access to those file drafts. Current Status: Works as-is
Use Case 2: Editing of files is limited to certain groups of CMS users, regardless of their draft/published state. Current Status: Works as-is
Use Case 3: Developers limit the view permissions of files to certain users outside of the CMS (e.g. "download your financial statement"). Current Status: Broken because CMS users can just publish those files.
Use Case 4: Admins limit the view permissions of files for other CMS users. Very rare? Current Status: ~Broken for the same reasons as Use Case 3.~ Works as-is

The underlying issue here is that we can't separate file publication from file access control. Published files are accessible to everyone who knows the URL. This is because webservers need to serve these files quickly (without going through PHP), which is a constraint that we can't change globally (e.g. routing every file through PHP).

Option 1:

Only allow admins to set "view permissions" (due to large potential impact and level of understanding required)
Add "custom" to options in "view permissions" (defined in code)
Mark folders and files with limited view permissions in the UI
Don't allow selection of folders and files with limited view permissions for content embedding (embed dialog and linking as relationship). This will be tricky to get right.
Throw exception when File->canView(<no user>) returns false on File->publish() (unclear how we'd distinguish that from draft mode)

Option 2:

Remove UI controls for "view permissions" on files and folders (but retain UI controls for "edit permissions" on files and folders)
Add a way for developers to "hard protect" certain folders via path configuration in code.
Don't allow selection of folders and files with "hard protect" view permissions for content embedding (embed dialog and linking as relationship). This will be tricky to get right.
Throw exception when File->canView(<no user>) returns false on File->publish() (unclear how we'd distinguish that from draft mode)
Figure out how you'd upload files into "hard protected" folders without an upload UI

Option 3:

Remove File->canView() database permission support
Remove File->canView() extension hooks. Add a big fat warning in File->canView() that it's only meant as an internal tool to control draft content

Option 4:

Keep functionality, maintain some kind of whitelist of file paths which need to be routed through PHP. Would need to be in the webserver layer (e.g. .htaccess), which would be hard to implement on webservers who encode their routing in global config (e.g. nginx). Assuming that routing every file through a PHP process is out of the question, regardless on how lightweight we make that process of checking a whitelist. It'll still be orders of magnitude slower than a static response (e.g. from a CDN)

Regardless of the option, we should allow devs to run a report of files and folders which are published but have canView database protections applied (which need to be fixed manually). I'm tending towards Option 3, because I just don't see this as a high value feature (view permissions beyond draft files).

@tractorcow @silverstripe/core-team This would be removing a core feature (at least the user facing part), so keen to have your input on this.

newleeland commented 5 years ago

A use case that we would also need to consider is what will happen when an access controlled page when is published with a file within it. Is the file protected or public?

chillu commented 5 years ago

A use case that we would also need to consider is what will happen when an access controlled page when is published with a file within it. Is the file protected or public?

That's undefined behaviour as far as I'm concerned. The concept of ownership so far does not extend to visibility constraints. We are using the concept of "exclusive ownership" in other contexts already, but frankly I think no author would be able to understand what happens if we would apply this to visibility constraints as well

robbieaverill commented 5 years ago

I've only had a quick skim read, but I think I'd vote for option 3 or option 4. Option 3 seems like the best solution if we don't want to route assets through PHP, whereas Option 4 seems best to ensure that the functionality works.

Regarding things being complex for nginx users - it already is since we've implemented a dynamically generated htaccess file for Apache. That doesn't change that much though, so it's less of a pain

clarkepaul commented 5 years ago

Use Case 2: Editing of files is limited to certain groups of CMS users, regardless of their draft/published state. Current Status: Works as-is

Use Case 4: Admins limit the view permissions of files for other CMS users. Very rare? Current Status: Broken for the same reasons as Use Case 3.

Can we simplify these two items to be one unified item. For example "Who can view, edit and use this file within the CMS?" [insert nicer label here]. If someone can edit they can obviously view, one permission relies on the other being set anyway. It would also remove any ambiguity of whether people can use readonly images, they would have full access or no access to it.

Use Case 3: Developers limit the view permissions of files to certain users outside of the CMS (e.g. "download your financial statement"). Current Status: Broken because CMS users can just publish those files.

Why is this setting being removed when a file is published? this setting should only be relevant to published files and not tied to publishing (it shouldn't do anything to draft files unless it gets published then the permissions kick in). This is a real use case that I would have thought would be more importantly set by Admins but could also apply to developers. It feels like something we should be able to cater to just as we do with pages. The label could be "Who can view the published file on the site?".

If I understand correctly this functionality is too hard to fix?, then I'd be up for removing the UI controls "Who can view this file".

Sounds like opt. 2 or 3 would be the best options, other than actually fixing the functionality (opt 4) which I didn't really understand. Pretty confusing topic so hopefully I got the gist of it.

chillu commented 5 years ago

Can we simplify these two items to be one unified item. For example "Who can view, edit and use this file within the CMS?" [insert nicer label here].

Not if we implement Option 3, in which case "can view" wouldn't be configurable through the UI. The "can edit" case would remain as-is. I'm not sure how "use" is different from "view", do you mean "embed in content" and "attach as relationship"?

Why is this setting being removed when a file is published? this setting should only be relevant to published files and not tied to publishing

The constraint here is technical: We can't check every published file for permissions before serving it, and even determining if a file needs to be checked is complex (webserver-level whitelists). We'd drastically increase the hosting requirements for every SilverStripe site, break every CDN implementation out there, and make it a worse user experience due to slower response times. The use example case of "download your financial statement" would still be possible, but only through custom code - e.g. in your particular project, devs declare that the assets/statements folder receives the same treatment as assets/.protected (draft files)

OK, it sounds like we have three votes for Option 3. @silverstripe/core-team Any vetoes?

sminnee commented 5 years ago

Come late to this - need to read a bit more carefully but it sounds like we’re ditching private, punished file support which is alarming

chillu commented 5 years ago

I think we can find a compromise and retain it for certain folders, which then get re-routed via .htaccess. I don't see how we can make every published file in assets/ a "potentially access protected file" without significant loss of overall website performance (and cachability). The "hottest" path there would be assets/_combinedfiles, but also logos and other "default website assets" uploaded to assets (e.g. for a Watea+ style theme). It would change a fairly fundamental infrastructure assumption, right?

And I wonder if we need to support private published files in core. Can't we leave this to a module, which then instructs users how to alter .htaccess (or hooks into auto generation), and adds a UI for File->canView()? This is a fairly sensitive implementation, and a bit of a fringe use case. How many SilverStripe sites do you know with access controlled published files? Also keep in mind that there's much more solid and scaleable ways to achieve this if you're building some kind of self-service UI on SilverStripe, e.g. AWS S3 Pre-Signed URLs.

sminnee commented 5 years ago

Use Case 3: Developers limit the view permissions of files to certain users outside of the CMS (e.g. "download your financial statement").

I'm not sure that this is a valid use-case; if the file was system generated it would make more sense to generate a DBFile rather than a File object. However, it might happen. But my next comment is more important:

Current Status: Broken because CMS users can just publish those files.

This isn't the case. Only public, published files are put into the unprotected bucket, or at least that was the design (I'd need to check the code). Users would have to go to the permissions section (and have access to change permissions). Protected, published files are treated the same way as draft files.

I don't see how we can make every published file in assets/ a "potentially access protected file" without significant loss of overall website performance (and cachability).

It exists now, so I'm confused as to why you think it's impossible. Does this relate to the need to restore persistent URLs? I can see how that will cause some issues but it will also cause issues for draft pages. To that end, I would recommend that only exposed files get persistent URLs, and draft & private files have the SHAs in their names.

and a bit of a fringe use case. How many SilverStripe sites do you know with access controlled published files?

Any intranet.

This is a fairly sensitive implementation

I can't see which issues will be difficult to solve that we won't have to solve for draft files. Also, we've already done it. It's in production.

sminnee commented 5 years ago

The function publishFile() is badly named, it should be exposeFile() or unprotectFile() or publicFile() or something less confusing.

Agree with this. Given that it confuses much of the core team, I'd make it a priority to rename.

sminnee commented 5 years ago

A few notes:

ProtectedFileController handles access to protected assets, both draft assets and private one
FlysystemAssetStore::getResponseFor() is the AssetStoreRouter that actually does the hard work
FlysystemAssetStore::grant() and FlysystemAssetStore::revoke() are used to control access to protected assets, by putting a list of accessible files in the session
FlysystemAssetStore::isGranted($fileID) checks this

So if assets are being inappropriately exposed, it's probably that grant() is being called when it shouldn't.

This is the case right now, in FileShortcodeProvider: https://github.com/silverstripe/silverstripe-assets/blob/1/src/Shortcodes/FileShortcodeProvider.php#L72

I think that line should only grant access to files that are viewable by the current user. That risks broken images, but not security violations.

sminnee commented 5 years ago

To be honest I can't remember if & how direct access to protected files works – that is, deep-link to a protected File, such as a PDF download.

@tractorcow can you remember where we got to on that>?

chillu commented 5 years ago

Sam is correct in that my assessment of Use Case 3 was wrong. I've updated my comment. If you publish a file with viewing permissions, it stays in assets/.protected. See test case on 4 branch below. I believe we still have an issue with File->canView() extensions (Use Case 4). This should be fixed by a subset of Option 3 (big fat warnings on extension hooks).

Login as Administrator, upload file.pdf
- File in assets/.protected/file.pdf
- Access without login - Not found
- Access as Administrator through link - Not found
- Access as Administrator through link after viewing admin/ - Success
- Access as Content Author through link - Not found
- Access as Content Author through link after viewing admin/ - Success
Limit permissions to Administrators
- File in assets/.protected/file.pdf
- Access without login - Not found
- Access as Administrator through link - Not found
- Access as Administrator through link after viewing admin/ - Success
- Access as Content Author through link - Not found
- Access as Content Author through link after viewing admin/ - Not found
Publish file
- File in assets/.protected/file.pdf
- Access without login - Not found
- Access as Administrator through link - Not found
- Access as Administrator through link after viewing admin/ - Success
- Access as Content Author through link - Not found
- Access as Content Author through link after viewing admin/ - Not found
Remove permission limitations, save
- File in assets/.protected/file.pdf
- Access without login - Success
- Access as Administrator through link - Success
- Access as Administrator through link after viewing admin/ - Success
- Access as Content Author through link - Not found
- Access as Content Author through link after viewing admin/ - Success
Publish file again
- File in assets/file.pdf
- Access without login - Success
- Access as Administrator through link - Success
- Access as Administrator through link after viewing admin/ - Success
- Access as Content Author through link - Success
- Access as Content Author through link after viewing admin/ - Success

sminnee commented 5 years ago

OK I've split out what I see to be an important but quite small bug to fix here: https://github.com/silverstripe/silverstripe-assets/issues/220

I believe we still have an issue with File->canView() extensions (Use Case 4).

As long as the bug I just mentioned is fixed, I think you only have an issue in that, if you add custom File::canView() code after files have been published, the publicly-accessible assets will remain.

Notably, AssetControlExtension::getRecordState() is used to decide whether to make the asset publicly available, and it calls canView() for the Security::getCurrentUser() == null state. So modifications to canView() will be incorporated.

sminnee commented 5 years ago

There's a few situations that need to be documented better:

If you create a File::canView() that non-deterministically returns true for anonymous users (which sounds far-fetched but might happen for example for certain implementations of an expires module), you're going to have a bad time — files might get left exposed in the open. So, implementing expiry as a scheduled unpublish would be much better than a time-dependent can view.
If you introduce new File::canView() logic to an existing site that protects previously publicly accessible files, you're going to have a bad time. Probably some kind of "reset file protection status" build-task would be useful that you could run after doing this, that published or protected each file based on the AssetControlExtension::getRecordState() of each File.

Most of this is documentation, but the creation of a new build task is code.

sminnee commented 5 years ago

Another one:

You can't deep-link to a protected asset, as access to the file won't have been "granted". I think we should document this as a known bug until we get some external drivers to fix it. I'll write up a separate bug ticket for that that we can link to from the docs.

I've listed this as an enhancement issue here https://github.com/silverstripe/silverstripe-assets/issues/221

kinglozzer commented 5 years ago

Just adding another observation to this, currently File::canView() may return false for files that are owned by the current member. It seems to inspect the parent folder’s permissions, I can’t spot anything that inspects the OwnerID column.

Was going to open an issue on the assets repo for this but figured I may as well include it here if we’re discussing amends to File::canView()

chillu commented 5 years ago

I've attempted to clarify the current behaviour in https://github.com/silverstripe/silverstripe-framework/pull/9002, since even core developers get confused by the logic. This is impacting our ability to assess and resolve potential security issues, see https://github.com/silverstripe-security/security-issues/issues/48

chillu commented 5 years ago

Just adding another observation to this, currently File::canView() may return false for files that are owned by the current member. It seems to inspect the parent folder’s permissions, I can’t spot anything that inspects the OwnerID column.

I think that's correct behaviour. The permission model is already complex enough without also creating assumptions around "owner always has access". I generally think of ownership more as a way to manage content over time (e.g. reports in multi-author environments)

kinglozzer commented 5 years ago

I think that's correct behaviour. The permission model is already complex enough without also creating assumptions around "owner always has access". I generally think of ownership more as a way to manage content over time (e.g. reports in multi-author environments)

So OwnerID is more like a record of who uploaded the file, rather than meaning that the file “belongs” to the user?

chillu commented 5 years ago

Yes, that's my understanding of OwnerID. The property is described as @property int $OwnerID ID of Member who owns the file

silverstripe / silverstripe-framework