Open phraenquex opened 3 years ago
Tyler and I had a meeting on Thursday on this. Attached is an analysis with a solution outline and some estimates: https://docs.google.com/document/d/1T5RV4TzzwShdR5wNMXe6Nx2gdS-HaVLE4EXxCufnKzw/edit?usp=sharing
For "deleting" structures, 3 main actions:
The solution design document has been updated with the delete processing: https://docs.google.com/document/d/1T5RV4TzzwShdR5wNMXe6Nx2gdS-HaVLE4EXxCufnKzw/edit?usp=sharing
@duncanpeacock - if you haven't yet, also spec the mechanism for communicating errors back to the uploader.
A specific error: dataset ID not unique. (That's the "X0001" or "P0001" number.)
From #673
Currently the upload process only stores files from the aligned directory in the database. The download process, as currently designed, only picks files from these fields, following the design decision to keep the process as simple as possible.
This will have to be modified to store the files from the crystallographic folder in the database as part of the target upload process. They could then be accessed as part of the download in a similar way to the current aligned files. Unlike the aligned folder, we want to make this flexible so that new file types can be uploaded in the target loader without code changes.
Crystallographic mapping:
Aligned - Crystallographic
Mpro-x0072_0A - Mpro-x0072
Mpro-x0072_1A - Mpro-x0072
Mpro-x0104_0A - Mpro-x0104
So the files in the Crystallographic folder can be accessed using the base crystal name (the stem of the protein code without the _0A, _1A, etc. suffix).
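A minimal sketch of deriving the crystal name from an aligned protein code. The helper name `crystal_name` is hypothetical, and the suffix pattern (underscore, digits, one uppercase letter) is an assumption inferred from the three examples above, not a confirmed convention.

```python
import re

# Assumed suffix convention, inferred from "_0A" / "_1A" in the mapping above:
# an underscore, one or more digits, then a single uppercase letter.
_ALIGNED_SUFFIX = re.compile(r"_\d+[A-Z]$")


def crystal_name(protein_code: str) -> str:
    """Return the base crystal name for an aligned protein code.

    Codes that carry no recognised suffix are returned unchanged.
    """
    return _ALIGNED_SUFFIX.sub("", protein_code)


if __name__ == "__main__":
    for code in ("Mpro-x0072_0A", "Mpro-x0072_1A", "Mpro-x0104_0A"):
        print(code, "-", crystal_name(code))
```

If the real suffix convention differs (for example multi-letter chain identifiers), only the regular expression would need to change.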
We would add a new Crystallographic table with an array of links to an associated files table. Each entry in the files table carries a mapping that gives the file template used to identify the file in the crystallographic folder.
Name of Crystal, Many2Many (id, Target, File (FileField), FileTypeMapping)
File-FileTypeMapping Mpro-x0072, Mpro-x0072.pdb, PDB
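One way the file template idea could work is sketched below, using plain dataclasses rather than real database models. The class and function names, the `{crystal}` placeholder syntax, and the `MTZ` entry are all assumptions for illustration; only the `Mpro-x0072`, `Mpro-x0072.pdb`, `PDB` row comes from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileTypeMapping:
    """Hypothetical mapping from a file type to a filename template."""

    file_type: str  # e.g. "PDB"
    template: str   # e.g. "{crystal}.pdb" (placeholder syntax is an assumption)

    def filename_for(self, crystal: str) -> str:
        return self.template.format(crystal=crystal)


def match_files(crystal, available, mappings):
    """Return {file_type: filename} for each mapping whose file is present."""
    present = set(available)
    found = {}
    for mapping in mappings:
        name = mapping.filename_for(crystal)
        if name in present:
            found[mapping.file_type] = name
    return found


if __name__ == "__main__":
    mappings = [
        FileTypeMapping("PDB", "{crystal}.pdb"),
        FileTypeMapping("MTZ", "{crystal}.mtz"),  # illustrative extra type
    ]
    folder = ["Mpro-x0072.pdb", "notes.txt"]
    print(match_files("Mpro-x0072", folder, mappings))
```

Because the mappings are data rather than code, supporting a new file type would mean adding a row, which is the flexibility asked for above.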
When the target loader runs, it will load the Crystallographic and files tables. If a protein code is marked as changed in the metadata, then both the associated aligned AND crystallographic files are updated.
In the download structures window, when the crystallographic structures are selected, the API will extract the crystal name from the protein codes provided and supply the desired files from the database - adding them to the crystallographic folder in a sub-directory labelled with the source protein (e.g. Mpro-x0072).
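The download step described above could look roughly like this. It is a sketch only: `crystallographic_paths`, the `files_by_crystal` dictionary standing in for the database lookup, the `crystallographic` root folder name, and the suffix pattern are all assumptions.

```python
import re
from pathlib import PurePosixPath

# Assumed aligned-code suffix: underscore, digits, one uppercase letter.
_ALIGNED_SUFFIX = re.compile(r"_\d+[A-Z]$")


def crystallographic_paths(protein_codes, files_by_crystal, root="crystallographic"):
    """Build archive paths for the crystallographic files of the given codes.

    Several protein codes can share one crystal, so each crystal is emitted
    only once, in a sub-directory named after the crystal.
    """
    paths = []
    seen = set()
    for code in protein_codes:
        crystal = _ALIGNED_SUFFIX.sub("", code)
        if crystal in seen:
            continue
        seen.add(crystal)
        for filename in files_by_crystal.get(crystal, []):
            paths.append(str(PurePosixPath(root) / crystal / filename))
    return paths


if __name__ == "__main__":
    stored = {"Mpro-x0072": ["Mpro-x0072.pdb"], "Mpro-x0104": ["Mpro-x0104.pdb"]}
    requested = ["Mpro-x0072_0A", "Mpro-x0072_1A", "Mpro-x0104_0A"]
    for path in crystallographic_paths(requested, stored):
        print(path)
```

De-duplicating on the crystal name keeps the download from containing the same crystallographic file twice when both `_0A` and `_1A` are selected.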
The new fields will be:
And one other point:
If so I can add this to the design document.
This epic should include versioning, and that's part of the schema update/redesign.
For versioning - brainstorm by @phraenquex, @tdudgeon, Daren
target_access_string
) can upload or release
Initial high level spec for the new loaders: https://docs.google.com/document/d/1osK1mbaO5TrNRY8-0P5piiYodaEA_z_sgU_7SmjfzHA/edit#
Work remaining for epic: 1. Complete XChemAlign #999 (May 20?)
Things @tdudgeon has worried about:
Place-holders are already in the database; it should hold and serve the files, but needn't digest them. Files should be in the media dir, not in the database (that would be a long-term risk).
Two options:
The parser of the Align output should assess whether the new alignments are meaningfully different from the old ones. If not, toss the new ones. It's Tim's side of the code that must do this.
We won't try and have them co-exist; we'll have to re-upload existing data.
We'll need to think of a mechanism to transfer tags so they stay attached to the same compounds.
Do align them onto each site, whether or not they have ligands bound.
@tdudgeon and Daren to settle on the convention.
Might be available in SoakDB already - @phraenquex had previously discussed with Daren adding an extra column.
No - FE/BE API will use upload datestamps to allow FE to present it properly.
Ticket for frontend work: #540
This ticket currently covers mainly backend work (@duncanpeacock), including: Things to fix