Open phraenquex opened 10 months ago
I could use some input on this; I simply don't know enough about these file types. This is what the updated description says now:
### aligned_files directory

The aligned directory contains a subdirectory for each ligand that was selected for downloading.

#### Contents of aligned_files subdirectory

Depending on your selection of options when downloading the data, the following file suffixes may be present:

- [site observation code].pdb - protein model without ligand bound
- [site observation code]\_apo.pdb - protein model with ligand bound
- [site observation code]\_event.ccp4 - event electron density, cut to around 12 Angstrom around the ligand. This has a higher signal-to-noise ratio, which will amplify the evidence of ligand occupancy.
- [site observation code]\_sigmaa.ccp4 - estimate of the true electron density from diffraction data and atomic model. Cut to around 12 Angstrom around the ligand.
- [site observation code]\_diff.ccp4 - difference electron density map. Negative density typically represents where no electron density is found but exists in the atom model. Positive density represents electron density without a mapped atom model. Cut to around 12 Angstrom around the ligand.

### crystallographic_files directory

The crystallographic folder contains the unprocessed versions of all data found in the aligned folder. As one crystal can have multiple ligands, we provide the input crystallographic files once to avoid redundancy and keep download sizes to a minimum.

#### Contents of crystallographic_files subdirectory

Depending on your selection of options when downloading the data, the following file suffixes may be present:

- [site observation code].cif
- [site observation code].mtz - reflection data corresponding to pdb file.
- [site observation code].mtz - event background-corrected reflection data corresponding to pdb file.
- [site observation code]\_[chain/ligand].ccp4 - estimate of the true electron density from diffraction data and atomic model.
In the `aligned_files` section, there used to be a pdb with a `_bound.pdb` suffix. The field is still called `bound_file` in the database, but it is now populated with a pdb file without the `_bound` suffix.
I'm also confused about the handling of `.sdf` files. In v1, there were two options: if the Molecule model had a reference to an sdf file, the file was added under `aligned_files`; if not, and the sdf contents were stored in the database field as text, it went to the `missing_sdfs` directory. If you try to download now, you'll see that all the site observations' sdfs are going to `missing_sdfs`; that's because a reference to the sdf is no longer stored and there's only a text field with the file contents. Since this is how it was set up in v1 and has not been changed, I can only conclude this is the desired behaviour?
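To make the v1 rule described above concrete, here is a minimal sketch of the decision as I understand it. The function name and both parameter names are hypothetical and do not come from the backend code; only the two directory names are taken from this thread.

```python
from typing import Optional


def sdf_destination(sdf_file_ref: Optional[str], sdf_text: Optional[str]) -> Optional[str]:
    """Return the download directory an sdf would land in under the v1 rule.

    sdf_file_ref: hypothetical stand-in for the stored reference to an sdf file.
    sdf_text: hypothetical stand-in for the database text field with sdf contents.
    """
    if sdf_file_ref:
        # A file reference exists, so the sdf goes alongside the aligned files.
        return "aligned_files"
    if sdf_text:
        # Only the text contents are stored, so the sdf is written to missing_sdfs.
        return "missing_sdfs"
    # Neither is available, so there is nothing to write.
    return None
```

If the reference is never populated any more, every call takes the second branch, which would match what the downloads currently show.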
@kaliif how can we edit the text of the PDF?
@tdudgeon @ConorFWild need to pin down the precise content of _apo and _desolv etc.
Documentation template: https://github.com/xchem/fragalysis-backend/blob/staging/README.md
Include snapshot link as per #1175
@mwinokan to take a look with Jenke
@mwinokan spoke to Jenke, he is still most concerned with the SEQRES headers (#1149) and linking ASAP ID (possibly #1262)
ASAP IDs will be implemented in #1262 and SEQRES in #1149
Add a note about which map files to load into Coot (the "crystallographic" ones)
@kaliif please point us to the relevant file & repo to edit.
Things to fix:

- `yaml` files
- `extra_files` (esp. `metadata.csv`)
Mostly check that the PDF actually documents the downloaded zip file. General sanity check.
Also, see #785 for additional spec.
Also, see #786 for frontend ticket.
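For the sanity check of the PDF against the downloaded zip, a small script along these lines might help. It is only a sketch: it assumes the directories named in the description sit at the top level of the archive, and `summarise_zip` is a hypothetical helper, not something in the repo.

```python
import zipfile
from collections import Counter, defaultdict
from pathlib import PurePosixPath


def summarise_zip(source):
    """Count file extensions per top-level directory in a zip archive.

    `source` is a path or file-like object accepted by zipfile.ZipFile.
    """
    summary = defaultdict(Counter)
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip explicit directory entries
            path = PurePosixPath(info.filename)
            # Files at the archive root are grouped under "."
            top = path.parts[0] if len(path.parts) > 1 else "."
            summary[top][path.suffix or "(none)"] += 1
    return {top: dict(counts) for top, counts in summary.items()}
```

Running `summarise_zip("download.zip")` (file name is a placeholder) gives a per-directory count of extensions that can be compared by eye against the suffix lists in the PDF.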