Open phraenquex opened 1 week ago
Replacing 'x' is as easy as it sounds, but in downloads I'm currently relying on it to get the numbers out of the shortcode and fetch the file from storage. If I have certainty about the name format (i.e. is it always {str prefix}{four numbers}{str suffix}?), then I could use this rule. I may also have to store the prefix in the experiment model, which would mean adding a field to the db.
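For illustration, here is a minimal sketch of how downloads could pull the number out of a shortcode if the {str prefix}{four numbers}{str suffix} format asked about above turns out to hold. The function name `parse_shortcode` and the restriction to ASCII letters are assumptions for this example, not existing code.

```python
import re

# Assumed format (still an open question in this thread):
# {letters prefix}{exactly four digits}{letters suffix}
SHORTCODE_RE = re.compile(r"^([A-Za-z]*)(\d{4})([A-Za-z]*)$")


def parse_shortcode(shortcode: str) -> tuple[str, str, str]:
    """Split a shortcode into (prefix, number, suffix).

    Hypothetical helper: raises ValueError if the shortcode does not
    follow the assumed format, so callers fail loudly rather than
    fetching the wrong file.
    """
    match = SHORTCODE_RE.match(shortcode)
    if match is None:
        raise ValueError(f"unexpected shortcode format: {shortcode!r}")
    prefix, number, suffix = match.groups()
    return prefix, number, suffix


print(parse_shortcode("A0235a"))  # ('A', '0235', 'a')
```

The number is kept as a string so that leading zeros survive.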
@phraenquex suggests
CHIKV_MacB-x0281:
  type: model_building
  last_updated: '2024-07-01 08:04:00'
  refinement_outcome: 6 - Deposited
  compound_code: Z1041785508
  code_prefix: c
  crystallographic_files:
    xtal_pdb: {file: upload_1/crystallographic_files/CHIKV_MacB-x0281/CHIKV_MacB-x0281.pdb,
      sha256: f75ef02d98c108693e85439c82d45d382bfacbb6aaea56496d9e01d85343b81b}
    xtal_mtz: {file: upload_1/crystallographic_files/CHIKV_MacB-x0281/CHIKV_MacB-x0281.mtz,
      sha256: 3a02d732372977ff1e4e688227c23b7aff2f0116f652235d0d61d9b837eb6003}
    ligand_cif: {file: upload_1/crystallographic_files/CHIKV_MacB-x0281/CHIKV_MacB-x0281.cif,
      sha256: 0ea8b733f7c19a354063368a3bb09858ac354f0493b822ba7b37f4604f8f39b5,
      smiles: 'c1csc(-c2nnc[nH]2)n1', ligand_name: LIG}
    ligand_binding_events:
    - {file: upload_1/crystallographic_files/CHIKV_MacB-x0281/CHIKV_MacB-x0281_1_C_304.ccp4,
      sha256: 935eb4e2c845441bf0b3ed3f670607031e339f4c8163083157b7544fa2e7d46a,
      model: '1', chain: C, res: 304, index: 1, bdc: 0.09}
  status: new
  assigned_xtalform: P31-4fold-NCS
  aligned_files:
    C:
      '304':
        '1':
          CHIKV_MacB-x0300+A+401+1:
The above (from meta_aligner.yaml) should produce (note the 'v' for version):
CHIKV_MacB-x0281_C_304_v1
and not CHIKV_MacB-x0281_C_304_1_CHIKV_MacB-x0300+A+401+1
_v1 format. The _v1 thing might be a bigger change, because the long code is currently used as a key internally in the loader.
Kalef to investigate if this can be a smallish change (e.g. change the string only at the very end of loader) - if not, leave without the _v.
_v1 turned out not to be a problem, implemented along with other code format updates.
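As a rough sketch of the code format discussed above, the long code can be assembled from the crystal name plus the chain, residue and version keys under aligned_files, ignoring the reference-alignment key at the innermost level. The function name `build_long_codes` and the dict literal are illustrative assumptions. This is not the loader's actual implementation, and the innermost value is left as `None` because it is not shown in the excerpt.

```python
def build_long_codes(crystal_name: str, aligned_files: dict) -> list[str]:
    """Build '<crystal>_<chain>_<residue>_v<version>' codes.

    Hypothetical helper: walks the chain -> residue -> version nesting
    of an aligned_files mapping and deliberately drops the innermost
    reference key (e.g. 'CHIKV_MacB-x0300+A+401+1') from the code.
    """
    codes = []
    for chain, residues in aligned_files.items():
        for residue, versions in residues.items():
            for version in versions:
                codes.append(f"{crystal_name}_{chain}_{residue}_v{version}")
    return codes


# Mirrors the aligned_files excerpt above; the innermost value is a placeholder.
aligned_files = {"C": {"304": {"1": {"CHIKV_MacB-x0300+A+401+1": None}}}}

print(build_long_codes("CHIKV_MacB-x0281", aligned_files))
# ['CHIKV_MacB-x0281_C_304_v1']
```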
I just (finally) realised that the shortcode is using the prefix wrong.
Currently, it prepends the prefix, i.e. with prefix=A, SOMETHING-x0235 becomes Ax0235a. What it should do is replace the x: it should be A0235a.
@kaliif is this as simple a fix as it appears to me?
At this point, it is appropriate to assume the Diamond naming convention, that the string will always end in (dash)(char)(num)(num)(num)...
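A minimal sketch of the requested behaviour, assuming the Diamond convention just described: the character after the final dash is replaced by the prefix instead of having the prefix prepended to it. The name `make_shortcode` and the separate `suffix` argument are assumptions for this example, not the existing loader code.

```python
import re

# Assumed Diamond convention: name ends in -<one letter><one or more digits>
DIAMOND_TAIL_RE = re.compile(r"-([A-Za-z])(\d+)$")


def make_shortcode(crystal_name: str, code_prefix: str, suffix: str = "") -> str:
    """Return <code_prefix><digits><suffix>, dropping the original letter.

    Hypothetical helper: raises ValueError for names that do not follow
    the assumed convention (e.g. non-Diamond data).
    """
    match = DIAMOND_TAIL_RE.search(crystal_name)
    if match is None:
        raise ValueError(f"name does not end in -<char><digits>: {crystal_name!r}")
    digits = match.group(2)
    # Replace the letter (the 'x') with the prefix rather than prepending to it.
    return f"{code_prefix}{digits}{suffix}"


print(make_shortcode("SOMETHING-x0235", "A", "a"))  # A0235a
print(make_shortcode("CHIKV_MacB-x0281", "c"))      # c0281
```

Raising on non-matching names keeps non-Diamond data from silently getting a wrong shortcode until a more general mechanism exists.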
(Still out of scope for this conversation is that, for non-Diamond data, we need a more general mechanism for specifying which part of the longcode is used as the numeric equivalent.)
cc @mwinokan for info.
Labeling as green release, because it's part of the headline goal of transforming the arcane crystallographic artefacts into a common reference frame and naming convention.