m2ms / fragalysis-frontend

The React, Redux frontend built by webpack
Other
1 stars 1 forks source link

Authenticate all b/e endpoints #1247

Open phraenquex opened 10 months ago

phraenquex commented 10 months ago

Clean-up task

alanbchristie commented 9 months ago

Changes can be made but with #1264 unresolved if we fix this fault (#1247) the we won't be able to upload any data.

I could introduce a local map of users and proposals but that's a "hack". Essentially #1264 is emerging as a blocking issue.

phraenquex commented 9 months ago

@alanbchristie @Waztom will this lock down https://fragalysis.diamond.ac.uk/api as well? (Do we need it locked down?)

alanbchristie commented 9 months ago

Not sure what you mean, https://fragalysis.diamond.ac.uk/api is a built-in django/REST framework endpoint so we can't (easily) change its behaviour. Seeing the API is open to all. We can lockdown specific endpoints but they must be dealt with on a case-by-case basis?

...and this still has to be done.

phraenquex commented 9 months ago

@alanbchristie: Warren confirms this is a glaring data security breach waiting to happen - the backdoor not just unlocked, but propped wide open. We're only missing the 3-story neon signs.

@Waztom will spec up what's needed here.

Waztom commented 9 months ago

@alanbchristie when I had a look - mistakenly thought I could see some private target data via https://fragalysis.diamond.ac.uk/api. No need to panic, but we should still review, especially after the model changes to V2.

phraenquex commented 9 months ago

@Waztom it's staging stack you should be checking, isn't it?

@alanbchristie what protections are there against (a) looking at the code to work out how to use the API and (b) running queries for protected targets?

I guess if there's no way for a non-authorised person to know either target of target_access_string, then that's a somewhat robust protection for the time being; but it's not medium-term robust.

mwinokan commented 7 months ago

@Waztom or @mwinokan to check if all endpoints are authenticated. Is there a list of endpoints somewhere?

mwinokan commented 4 months ago

Waiting on @Waztom to get a lay of the land in #1414

mwinokan commented 4 months ago

@alanbchristie & @tdudgeon please start securing the endpoints that are easy to fix (regardless of importance), and document difficult endpoints for @phraenquex to review.

alanbchristie commented 4 months ago

An initial estimate of API authentication complexity is attached in this PDF document: -

I think it's important to clarify the goal of this task.

I assume this is not about adding authentication to every endpoint (that would require everyone to login before any endpoint could be used) as that would alter the out-of-the-box behaviour of the app, especially for 'open' proposals. Instead I assume this is about making sure people cannot create or access material relating to 'closed' proposals (i.e. as declared in issue 1414).

phraenquex commented 4 months ago

Thanks @alanbchristie.

Correct: it's about making sure closed data is closed, only accessible with authentication.

(I'm not aware of any risks with keeping the endpoints open - @Waztom please confirm that this is your understanding too. If anybody does raise a risk, we'd have to crack open an entire new workstream - many new tickets, presumably.)

phraenquex commented 4 months ago

@matteoferla @Waztom please poke at those endpoints that Allan labelled as complex, and say if any of them create problems.

I predict https://fragalysis.xchem.diamond.ac.uk/api/siteobservation_tag/ needs to be fixed; I can't judge the others.

alanbchristie commented 4 months ago

Having worked through a number of endpoints I have adjusted the original spreadsheet. This shows API endpoints that have been adjusted (those with comments in the Actions column).

I am especially nervous of wholesale changes like this because: -

Rather than merging this down - it might be worth testing a specific build, simply to verify that it hasn't fundamentally broken the application. Rolling the stuff back wil be more complicated than simply not using it in the first place.

Changes

Here's what I've been doing: -

  1. Produced a new permissions class (IsObjectProposalMember)
  2. Adjusted the security model (new user_is_member_of_any_given_proposals() method)
  3. Adjusted API endpoints (the ones that look simple)
  4. Refactored viewer.views (to improve code style and consistency)

Here's the pattern adopted for point 3: -

Migrating a ModelViewSet

Replace the view's ModelViewSet base class with the following: -

    mixins.UpdateModelMixin,
    mixins.CreateModelMixin,
    mixins.DestroyModelMixin,
    ISPyBSafeQuerySet,

You also need to adjust or set the string value of a filter_permissions view member. This must provide a path to the object's project record. The project in the final object can be of type ForeignKey or ManyToManyField. Example: -

  filter_permissions = "target__project_id"

And then add or replace the permission_classes member of the view and set it to the following value: -

    permission_classes = [IsObjectProposalMember]

Migrating a ReadOnlyModelViewSet

The pattern for change (for a ReadOnlyModelViewSet) is to replace the view's base class with the following: -

    ISPyBSafeQuerySet

You also need to adjust or set the string value of a filter_permissions view member. This must provide a path to the object's project record. The project in the final object can be of type ForeignKey or ManyToManyField. Example: -

  filter_permissions = "target__project_id"
mwinokan commented 4 months ago

@alanbchristie says that some schema changes would simplify adding project-specific permissions to endpoints.

For multiple endpoints e.g. siteobservation_tag where the route to the target is complicated, there may be a performance cost to obtaining the target unless it is added to the table.

@alanbchristie please work with @kaliif to review and make the schema changes before we upload the new targets for the green release #1456

Schema changes should be considered part of the green release, with less drastic changes being part of mint

alanbchristie commented 3 months ago

We have made some progress but overlooked the 'create' path to objects. Consequently there is still work to be done here, specifically on a number of Serializers that can be used to trap un-authrorised creates. An new spreadsheet/pdf is attached that highlights the current security state of the endpoints (the "Secure (#1247)" column).

Where security is still an issue a "note", visible in the original spreadsheet (which is sadly MacOS-specific) explains any potentially blocking issue.

fragalysis-api-security.pdf

Origin of security flaw

Fundamentally this is all attributed to a potential oversight and is largly now considered "technical Debt". Although the ISPyB security module was introduced at the project outset this was only ever used to prevent users seeing restricted content.

Over time, with developers changing some endpoints neglected to implement the same security and where 'write' security was added it was added to the crucial endpoints where new data was uploaded. Sadly an "unscrupulous" user could still access many endpoints and either create material or get it.

This effort should close all the exiting endpoints and introduce a pattern to avoid any occurrences.

Risks

With careful analysis of each endpoint, which we will re-review prior to commit ing the changes to the main branch, the risk of exposing secure material is extremely low.

Cluster Access Obviously anyone with access to the Kubernetes cluster (and therefore the database) will have access to all the material.

Cloud Access Also, we operate backups that are written to an NFS volume in STFC, so secure material could be obtained by an unscrupulous user who may have access to the volume.

phraenquex commented 3 months ago

@Waztom: @alanbchristie needs you for 30 minutes to go through the list.

alanbchristie commented 3 months ago

The test code adds new environment variables to over-ride/force private proposals and membership of those proposals. A comma-separated list of proposals is provided in TEST_RESTRICTED_TAS and a comma-separated list of usernames in TEST_RESTRICTED_USERS.

Any proposals in TEST_RESTRICTED_TAS are removed from any that would normally be considered public.

Any users in TEST_RESTRICTED_USERS will have whatever proposals are returned by ISPyB plus any proposals in TEST_RESTRICTED_TAS.

So, to turn the exiting public proposal lb18145-1 into a private proposal only achristie can see/edit set the following in the Stack's StatefulSet: -

            - name: TEST_RESTRICTED_TAS
              value: lb18145-1
            - name: TEST_RESTRICTED_USERS
              value: achristie
alanbchristie commented 3 months ago

After working with Warren we have made significant progress and are now down to one endpoint (I think) and a number of 'task query' endpoints. Latest spreadsheet Excel/PDF attached.

fragalysis-api-security.xlsx

fragalysis-api-security.pdf

mwinokan commented 3 months ago

@alanbchristie's summary from the meeting:

  1. one low risk endpoint (dicttocsv), no sensitive data can be extracted but there is a risk of the hard drive being polluted by malicious users. Currently used by the frontend but if necessary @boriskovar-m2ms says it would only take a few hours of dev to recreate this functionality on the frontend. Spun out to #1474
  2. four related issues when monitoring tasks, if the UUID4 can be guessed/obtained then the logs may reveal sensitive information. The danger is that compound smiles could be extracted. Celery will cache the task logs for four days (?). Can we associate the task with a specific target and then use the standard target authentication? @kaliif please investigate if the target can be associated from the task (target name may need to be passed to the celery task)
  3. the XCDB/fragspect endpoint is insecure but XCDB as a whole is not used so it can be axed, @alanbchristie to remove the module

This ticket was green because it might cause breaking DB changes, but @kaliif says there will be changes (new tables) but they will not be breaking and completed soon (day or two).

All the endpoints need to be tested again. @alanbchristie will create a new image and shout to @mwinokan and @Waztom for testing

mwinokan commented 3 months ago

2024-07-10 testing on Alan's stack

All the below were attempted (checked means it worked)

alanbchristie commented 3 months ago

We're down to one 'task' endpoint, an endpoint that is extremely difficult to use and only poses a risk to exposing logs if the task ID (a UUID4) is shared or obtained somehow. An automated attack on the endpoint will have little or no effect.

Today's spreadsheet and PDF attached.

fragalysis-api-security.xlsx

fragalysis-api-security.pdf

mwinokan commented 3 months ago

@alanbchristie recommends we do some testing with a faux private target. Alan says he can go into the database and modify the permissions to verify that @mwinokan can't perform unauthorised actions.

alanbchristie commented 3 months ago

By using the aforementioned new environment variables (which must be removed prior to deploying to staging) we have confirmed that access is limited.

mwinokan commented 3 months ago

2024-07-12 testing:

Logged in as tfb64483:

mwinokan commented 3 months ago

I was unable to fully test the latest changes (which fix the 500 errors) as the CAS authentication is not working on the dev clusters. @alanbchristie says that staging is working so we might want to push the changes and test there.

@alanbchristie please merge up to staging so that we can test there

phraenquex commented 3 months ago

@alanbchristie to upload latest status, @phraenquex to update labelling to fully reflect non-risk.

alanbchristie commented 3 months ago

The latest API state, with only /viewer/update_task posing some minor risk.

fragalysis-api-security.pdf

Waztom commented 3 months ago
Waztom commented 3 months ago

@alanbchristie please use this endpoint to test if the RHS data has been successfully uploaded => https://fragalysis-alan-default.xchem-dev.diamond.ac.uk/api/compound-molecules/

alanbchristie commented 3 months ago

The existing (staging) stack does not suffer from the compound-molecules problem because its view has no ISPyB-based filtering, so we are able to see all the ComputedMolecule records.

In the security-fixed stack (Alan's) filtering is being used and it has highlighted a problem with the content of the model records for the upload ... the ComputedMolecule -> Compound does not have its project_id (a ManyToManyField) populated. So filtering blocks all the ComputedMolecule records as there is no Project to find.

mwinokan commented 2 months ago

@alanbchristie will add another column to the endpoint spreadsheet so that we can finalise the R/W permissions spec for both open and closed projects

alanbchristie commented 2 months ago

The following spreadsheet provides the supported Read/Create/Modify/Delete endpoint options for "PUBLIC" Proposals/Sessions. Access to non-public targets is generally restricted to those who are part of the corresponding session.

fragalysis-api-security.xlsx

fragalysis-api-security.pdf

Notes

Public Target Access API Modes (for “Public” Proposals/Sessions)

The following is the key for the Read, Create, Modify and Delete columns

Mode Restrictions
OPEN No authentication is required. The user does not need to be logged in.
AUTH User needs to be logged in but is not required to be a member of the object’s Proposal/Session
SESS User must be logged in and must be a member of the object’s Proposal/Session
X This mode is not supported
ADMIN Only site administrators
OWNER Only object owner (creator) can see the object. Currently limited to deleting uploaded (RHS) compound sets.

For example: The /api/targets endpoint (row 14) allows anyone to see Public targets but only those who are members of the Target’s Proposal can modify them. Additionally this endpoint does not support deleting targets

mwinokan commented 2 months ago

Thanks @alanbchristie for updating the spreadsheet. The new columns show which permissions are required to perform operations on the open projects. Private projects as a general rule require authentication and correct permissions:

Access to non-public targets is generally restricted to those who are part of the corresponding session.

These permissions have already been extensively reviewed. @alanbchristie has open questions about a number of endpoints that do not check permissions/session membership for private projects: e.g. technically it's possible to create poses and snapshots for private projects (even if they can not be seen). @alanbchristie will add another column.

alanbchristie commented 2 months ago

A revises spreadsheet that also includes access permissions for private sessions. Areas of particular interest for private sessions are in RED cells in the Private Target Access Modes columns: -

The JobCallback does not provide access to data

fragalysis-api-security.xlsx

fragalysis-api-security.pdf

mwinokan commented 2 months ago

@alanbchristie please push the changes to staging

alanbchristie commented 1 month ago

The SessionProject endpoint authentication has been relaxed (now in the latest staging build).

fragalysis-api-security.xlsx

fragalysis-api-security.pdf

alanbchristie commented 1 month ago

Further relaxed endpoints include: -

fragalysis-api-security.xlsx

fragalysis-api-security.pdf