Open phraenquex opened 10 months ago
Changes can be made but with #1264 unresolved if we fix this fault (#1247) the we won't be able to upload any data.
I could introduce a local map of users and proposals but that's a "hack". Essentially #1264 is emerging as a blocking issue.
@alanbchristie @Waztom will this lock down https://fragalysis.diamond.ac.uk/api
as well? (Do we need it locked down?)
Not sure what you mean, https://fragalysis.diamond.ac.uk/api
is a built-in django/REST framework endpoint so we can't (easily) change its behaviour. Seeing the API is open to all. We can lockdown specific endpoints but they must be dealt with on a case-by-case basis?
...and this still has to be done.
@alanbchristie: Warren confirms this is a glaring data security breach waiting to happen - the backdoor not just unlocked, but propped wide open. We're only missing the 3-story neon signs.
@Waztom will spec up what's needed here.
@alanbchristie when I had a look - mistakenly thought I could see some private target data via https://fragalysis.diamond.ac.uk/api
. No need to panic, but we should still review, especially after the model changes to V2.
@Waztom it's staging stack you should be checking, isn't it?
@alanbchristie what protections are there against (a) looking at the code to work out how to use the API and (b) running queries for protected targets?
I guess if there's no way for a non-authorised person to know either target of target_access_string, then that's a somewhat robust protection for the time being; but it's not medium-term robust.
@Waztom or @mwinokan to check if all endpoints are authenticated. Is there a list of endpoints somewhere?
Waiting on @Waztom to get a lay of the land in #1414
@alanbchristie & @tdudgeon please start securing the endpoints that are easy to fix (regardless of importance), and document difficult endpoints for @phraenquex to review.
An initial estimate of API authentication complexity is attached in this PDF document: -
I think it's important to clarify the goal of this task.
I assume this is not about adding authentication to every endpoint (that would require everyone to login before any endpoint could be used) as that would alter the out-of-the-box behaviour of the app, especially for 'open' proposals. Instead I assume this is about making sure people cannot create or access material relating to 'closed' proposals (i.e. as declared in issue 1414).
Thanks @alanbchristie.
Correct: it's about making sure closed data is closed, only accessible with authentication.
(I'm not aware of any risks with keeping the endpoints open - @Waztom please confirm that this is your understanding too. If anybody does raise a risk, we'd have to crack open an entire new workstream - many new tickets, presumably.)
@matteoferla @Waztom please poke at those endpoints that Allan labelled as complex, and say if any of them create problems.
I predict https://fragalysis.xchem.diamond.ac.uk/api/siteobservation_tag/
needs to be fixed; I can't judge the others.
Having worked through a number of endpoints I have adjusted the original spreadsheet. This shows API endpoints that have been adjusted (those with comments in the Actions column).
I am especially nervous of wholesale changes like this because: -
Rather than merging this down - it might be worth testing a specific build, simply to verify that it hasn't fundamentally broken the application. Rolling the stuff back wil be more complicated than simply not using it in the first place.
Here's what I've been doing: -
IsObjectProposalMember
)user_is_member_of_any_given_proposals()
method)Here's the pattern adopted for point 3: -
Replace the view's ModelViewSet base class with the following: -
mixins.UpdateModelMixin,
mixins.CreateModelMixin,
mixins.DestroyModelMixin,
ISPyBSafeQuerySet,
You also need to adjust or set the string value of a filter_permissions
view member. This must provide a path to the object's project record. The project in the final object can be of type ForeignKey
or ManyToManyField
. Example: -
filter_permissions = "target__project_id"
And then add or replace the permission_classes
member of the view and set it to the following value: -
permission_classes = [IsObjectProposalMember]
The pattern for change (for a ReadOnlyModelViewSet) is to replace the view's base class with the following: -
ISPyBSafeQuerySet
You also need to adjust or set the string value of a filter_permissions
view member.
This must provide a path to the object's project record. The project in the final object can be of type ForeignKey or ManyToManyField. Example: -
filter_permissions = "target__project_id"
@alanbchristie says that some schema changes would simplify adding project-specific permissions to endpoints.
For multiple endpoints e.g. siteobservation_tag
where the route to the target is complicated, there may be a performance cost to obtaining the target unless it is added to the table.
@alanbchristie please work with @kaliif to review and make the schema changes before we upload the new targets for the green release #1456
Schema changes should be considered part of the green release, with less drastic changes being part of mint
We have made some progress but overlooked the 'create' path to objects. Consequently there is still work to be done here, specifically on a number of Serializers that can be used to trap un-authrorised creates. An new spreadsheet/pdf is attached that highlights the current security state of the endpoints (the "Secure (#1247)" column).
Where security is still an issue a "note", visible in the original spreadsheet (which is sadly MacOS-specific) explains any potentially blocking issue.
Fundamentally this is all attributed to a potential oversight and is largly now considered "technical Debt". Although the ISPyB security module was introduced at the project outset this was only ever used to prevent users seeing restricted content.
Over time, with developers changing some endpoints neglected to implement the same security and where 'write' security was added it was added to the crucial endpoints where new data was uploaded. Sadly an "unscrupulous" user could still access many endpoints and either create material or get it.
This effort should close all the exiting endpoints and introduce a pattern to avoid any occurrences.
With careful analysis of each endpoint, which we will re-review prior to commit ing the changes to the main branch, the risk of exposing secure material is extremely low.
Cluster Access Obviously anyone with access to the Kubernetes cluster (and therefore the database) will have access to all the material.
Cloud Access Also, we operate backups that are written to an NFS volume in STFC, so secure material could be obtained by an unscrupulous user who may have access to the volume.
@Waztom: @alanbchristie needs you for 30 minutes to go through the list.
The test code adds new environment variables to over-ride/force private proposals and membership of those proposals. A comma-separated list of proposals is provided in TEST_RESTRICTED_TAS
and a comma-separated list of usernames in TEST_RESTRICTED_USERS
.
Any proposals in TEST_RESTRICTED_TAS
are removed from any that would normally be considered public.
Any users in TEST_RESTRICTED_USERS
will have whatever proposals are returned by ISPyB plus any proposals in TEST_RESTRICTED_TAS
.
So, to turn the exiting public proposal lb18145-1
into a private proposal only achristie
can see/edit set the following in the Stack's StatefulSet: -
- name: TEST_RESTRICTED_TAS
value: lb18145-1
- name: TEST_RESTRICTED_USERS
value: achristie
After working with Warren we have made significant progress and are now down to one endpoint (I think) and a number of 'task query' endpoints. Latest spreadsheet Excel/PDF attached.
@alanbchristie's summary from the meeting:
This ticket was green because it might cause breaking DB changes, but @kaliif says there will be changes (new tables) but they will not be breaking and completed soon (day or two).
All the endpoints need to be tested again. @alanbchristie will create a new image and shout to @mwinokan and @Waztom for testing
2024-07-10 testing on Alan's stack
All the below were attempted (checked means it worked)
We're down to one 'task' endpoint, an endpoint that is extremely difficult to use and only poses a risk to exposing logs if the task ID (a UUID4) is shared or obtained somehow. An automated attack on the endpoint will have little or no effect.
Today's spreadsheet and PDF attached.
@alanbchristie recommends we do some testing with a faux private target. Alan says he can go into the database and modify the permissions to verify that @mwinokan can't perform unauthorised actions.
By using the aforementioned new environment variables (which must be removed prior to deploying to staging) we have confirmed that access is limited.
2024-07-12 testing:
Logged in as tfb64483
:
set main observation
I was unable to fully test the latest changes (which fix the 500 errors) as the CAS authentication is not working on the dev clusters. @alanbchristie says that staging is working so we might want to push the changes and test there.
@alanbchristie please merge up to staging so that we can test there
@alanbchristie to upload latest status, @phraenquex to update labelling to fully reflect non-risk.
The latest API state, with only /viewer/update_task
posing some minor risk.
img_from_smiles
API endpoint is not being called - this endpoint is called when RHS data is loaded for each computed set. The img_from_smiles
enpoint does work on Alan's stack, but is not being called by the frontend. Download as CSV
downloads file with html. Can see line - <li class=active><a href=/api/dicttocsv/>Dict To Csv List</a></li>
in html. @alanbchristie please use this endpoint to test if the RHS data has been successfully uploaded => https://fragalysis-alan-default.xchem-dev.diamond.ac.uk/api/compound-molecules/
The existing (staging) stack does not suffer from the compound-molecules
problem because its view has no ISPyB-based filtering, so we are able to see all the ComputedMolecule records.
In the security-fixed stack (Alan's) filtering is being used and it has highlighted a problem with the content of the model records for the upload ... the ComputedMolecule -> Compound does not have its project_id
(a ManyToManyField) populated. So filtering blocks all the ComputedMolecule records as there is no Project to find.
@alanbchristie will add another column to the endpoint spreadsheet so that we can finalise the R/W permissions spec for both open and closed projects
The following spreadsheet provides the supported Read/Create/Modify/Delete endpoint options for "PUBLIC" Proposals/Sessions. Access to non-public targets is generally restricted to those who are part of the corresponding session.
Public Target Access API Modes (for “Public” Proposals/Sessions)
The following is the key for the Read, Create, Modify and Delete columns
Mode | Restrictions |
---|---|
OPEN | No authentication is required. The user does not need to be logged in. |
AUTH | User needs to be logged in but is not required to be a member of the object’s Proposal/Session |
SESS | User must be logged in and must be a member of the object’s Proposal/Session |
X | This mode is not supported |
ADMIN | Only site administrators |
OWNER | Only object owner (creator) can see the object. Currently limited to deleting uploaded (RHS) compound sets. |
—
For example: The
/api/targets
endpoint (row 14) allows anyone to see Public targets but only those who are members of the Target’s Proposal can modify them. Additionally this endpoint does not support deleting targets
Thanks @alanbchristie for updating the spreadsheet. The new columns show which permissions are required to perform operations on the open projects. Private projects as a general rule require authentication and correct permissions:
Access to non-public targets is generally restricted to those who are part of the corresponding session.
These permissions have already been extensively reviewed. @alanbchristie has open questions about a number of endpoints that do not check permissions/session membership for private projects: e.g. technically it's possible to create poses and snapshots for private projects (even if they can not be seen). @alanbchristie will add another column.
A revises spreadsheet that also includes access permissions for private sessions. Areas of particular interest for private sessions are in RED cells in the Private Target Access Modes columns: -
The JobCallback does not provide access to data
@alanbchristie please push the changes to staging
The SessionProject endpoint authentication has been relaxed (now in the latest staging build).
Further relaxed endpoints include: -
Clean-up task