Open mwinokan opened 2 months ago
tl;dr - Fragalysis works and will keep working when the unique constraint is dropped from target.title
field but it will break immediately when somebody tries to upload a duplicate
The title
field has until now been the main method to fetch a concrete target object; when the uniqueness is not guaranteed, a different method is needed. Logical substitution would be a target-proposal combination but with the current database setup, this won't work either because of the many-to-many relationship between target and project (proposal) tables. To enable this, target would need a foreign key to proposal (there is actually a ticket about simplifying their relationship, #1488), and then querying with both target name and proposal string will give you a unique object. Code changes required:
security.py
modules, where the m2m relationship is explicitly expected (because of the way it's set up, I expect the majority of the calls to security.py
method to remain unaffected)It will not be possible to upload duplicate targets to different projects by simply relaxing the uniqueness constraint, it will require the corresponding code changes. And this will require changes to the backend schema. However, these changes would be relatively small, if this change were to be introduced later, when the database already contains some targets, the necessary backend changes can be managed by a relatively simple data migration.
Thanks @kaliif that is terrifically useful.
@mwinokan then we can go ahead with green, as agreed. And this work needs to go up very in the mint release.
@phraenquex I found this comment in the code:
# For Diamond, a proposal is always required.
# For other implementations, it may be optional or omitted.
Is this still relevant and something the code should handle? In some deployment settings there are no proposals?
Copying @alanbchristie to confirm the below.
@kaliif: I'm guessing that comment is older than the final spec, else it would have said "target_access_id", rather than "proposal".
So now the comment would need to read: _"a target_accessid" is always required; for Diamond, that will be the proposal/visit string."
Some of the changes made require corresponding changes in the FE. @boriskovar-m2ms or @matej-vavrek who's going to pick it up, but some (not all I think) of the API calls that have an attribute project_id
should now be simply project
. Image here: kaliif/fragalysis-backend:latest
@matej-vavrek to please pick up the f/e work for this. This will require several changes as multiple endpoints will not work with the current frontend
@kaliif, F/E can be found on #1514-project branch or as built docker image zemiacsik/fragalysis-stack:matej.1514.240925.1
/api/targets/
had before project_id
as array, project
seems to be only number (or also array?). What can we expect?
@matej-vavrek yes, it's going to be a foreign key to project now, not many2many
@kaliif is loading the f/e changes to his stack and will check compatibility with the b/e
@matej-vavrek at least one more change is required in the FE - the LHS download needs to fill the target_access_string
attribute (currently active proposal name) in the request.
I don't know of any others at the moment, but if there are more places where the FE sends a query to target using just the title
attribute, they all need to add proposal name as well.
@kaliif has loaded @matej-vavrek changes to his stack (kalev stack) and they need a thorough test
@kaliif says this needs to be tested with multiple targets so that edge cases can hopefully be tested
@mwinokan to test this when he has bandwidth (next week)
EVA712A_011024.tgz
under lb32627-66
OKEVA712A_011024.tgz
under lb32627-71
OKD68EV3C_011024.tgz
under lb32627-66
OKAll three uploads look ok to me on Kalev's stack. But I could use some help testing all the buttons
@kaliif do you think the above is sufficient for testing?
@mwinokan could you also test RHS upload and LHS download and re-upload of metadata. These will tell you if the frontend-backend communication works. The 1535 bug is still active there so the data will get scrambled but it will tell you if the procedure as such works.
@kaliif
I tried to upload small test RHS set A71EV2A_lhs2rhs_20241010.sdf but it is stuck as PENDING.
Then I realised that the endpoint upload_cset
does not include "Target access string" field so there is no way for the backend to know where to register the compounds, I suspect this is preventing the task from completing.
@mwinokan the version in my stack doesn't include the fix to the multiple targets issue, so TAS isn't necessary. Pending state seems more like stack health issue, possibly about connecting Redis. I'll restart the stack.
@kaliif ok so I should now be testing on staging?
@mwinokan it is merged, so yeah, you can test in staging. But I also refreshed my stack if you want to continue there.
EVA712A_011024.tgz
under lb32627-66
EVA712A_011024.tgz
under lb32627-71
D68EV3C_011024.tgz
under lb32627-66
EVA712A_011024.tgz
under lb32627-66
, failed correctly@kaliif I tested metadata and RHS uploads on your stack, all seems ok. See above
@mwinokan, @kaliif, is this also in "xchem/fragalysis-stack:latest" already and should I merge changes for F/E too? Because my latest build on my tack is crashing on undefined target.project_id
.
@kaliif @matej-vavrek what is the status of the changes, can they be merged to staging? For sure some bits are missing because the staging RHS upload endpoint is missing the TAS field.
@mwinokan, I am not sure about B/E, but it seems that this is already part of "xchem/fragalysis-stack:latest" (since F/E crashes with latest version of stack), although F/E modifications related to this task are not part of general/staging branch at the moment (they can be merged if B/E is in this ready state now).
@kaliif is not sure why the upload_cset endpoint is out of date
@boriskovar-m2ms points out that the last few staging deployments have failed
@alanbchristie is redeploying staging
@matej-vavrek please merge the f/e changes associated with this ticket to staging
Approved in staging, more work is needed to allow for separate RHS uploads for each version of the target, see https://github.com/m2ms/fragalysis-frontend/issues/1552
@kaliif says that the target table enforces a unique name for each upload. The relationship between the target and proposal is many-to-many so there is no quick foreign key constraint that can include the name and the proposal.
@alanbchristie suggests removing the unique constraint in the target table but to check in the upload code whether the given target access string already has a target with the name being uploaded.
@phraenquex stresses that this an important feature to share different subsets of target data with different audiences
@kaliif says that the target title is being used to fetch the target, so a uniqueness is required
@mwinokan also points out that the feedback loop to uploading a target with a duplicated name and proposal string is very long as a large tarball must be POSTed and then decompressed before seeing the error. Perhaps we should explore not reading the target name from the config.yaml, but instead add it to the POST payload.
@kaliif please investigate if this is a breaking change, i.e. more than just removing the unique constraint