m2ms / fragalysis-frontend

The React, Redux frontend built by webpack

Fixes to shortcode and longcode naming. #1499

Open phraenquex opened 1 week ago

phraenquex commented 1 week ago

I just (finally) realised that the shortcode is using the prefix wrong.

Currently, it prepends the prefix, i.e. with prefix=A, SOMETHING-x0235 becomes Ax0235a.

What it should do is replace the x: it should be A0235a.

@kaliif is this as simple a fix as it appears to me?

At this point, it is appropriate to assume the Diamond naming convention, that the string will always end in (dash)(char)(num)(num)(num)...
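
For illustration, a minimal Python sketch of the intended rule, assuming the Diamond convention above. The function name `make_shortcode` and the `suffix` argument are hypothetical and not taken from the actual loader; where the trailing letter (the "a" in A0235a) comes from is not covered here, so it is simply passed in.

```python
import re

# Assumed Diamond convention: the longcode ends in -<char><digits>,
# e.g. "SOMETHING-x0235".
_DIAMOND_TAIL = re.compile(r"-([A-Za-z])(\d+)$")


def make_shortcode(longcode: str, prefix: str, suffix: str = "") -> str:
    """Build a shortcode in which the prefix replaces the leading char.

    Hypothetical helper: the real loader may derive prefix and suffix
    differently.
    """
    match = _DIAMOND_TAIL.search(longcode)
    if match is None:
        raise ValueError(f"not a Diamond-style code: {longcode!r}")
    digits = match.group(2)
    # Desired behaviour: drop the 'x' and keep only the digits.
    # (Prepending instead would give prefix + 'x' + digits.)
    return f"{prefix}{digits}{suffix}"


print(make_shortcode("SOMETHING-x0235", "A", "a"))  # A0235a
```

Raising on a non-matching string keeps non-Diamond data from silently getting a malformed shortcode.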

(Still out of scope for this conversation: for non-Diamond data, we need a more general mechanism for specifying which part of the longcode is used as the numeric equivalent.)

cc @mwinokan for info.

Labeling as green release, because it's part of the headline goal of transforming the arcane crystallographic artefacts into a common reference frame and naming convention.

kaliif commented 1 week ago

Replacing 'x' is as easy as it sounds, but in downloads, I'm currently relying on it to get the numbers out of the shortcode and fetch the file from storage. If I have certainty about the name format (i.e. is it always {str prefix}{four numbers}{str suffix}?), then I could use this rule. I also may have to store the prefix in the experiment model which would mean adding a field to db.
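
A rough sketch of what such a rule could look like in Python, assuming the format really is always {str prefix}{four numbers}{str suffix} with letters only in the prefix and suffix. The name `split_shortcode` is hypothetical and this is not the download code itself.

```python
import re

# Assumed format: letters, exactly four digits, letters.
_SHORTCODE = re.compile(r"([A-Za-z]*)(\d{4})([A-Za-z]*)")


def split_shortcode(shortcode: str) -> tuple[str, str, str]:
    """Split a shortcode into (prefix, digits, suffix).

    Hypothetical helper; raises ValueError if the assumed format
    does not hold.
    """
    match = _SHORTCODE.fullmatch(shortcode)
    if match is None:
        raise ValueError(f"unexpected shortcode format: {shortcode!r}")
    prefix, digits, suffix = match.groups()
    return prefix, digits, suffix


prefix, digits, suffix = split_shortcode("A0235a")
print(digits)  # 0235
```

If a prefix could itself contain digits, this split becomes ambiguous, which is one reason storing the prefix separately may still be needed.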

mwinokan commented 1 week ago

@phraenquex suggests

CHIKV_MacB-x0281:
    type: model_building
    last_updated: '2024-07-01 08:04:00'
    refinement_outcome: 6 - Deposited
    compound_code: Z1041785508
    code_prefix: c
    crystallographic_files:
      xtal_pdb: {file: upload_1/crystallographic_files/CHIKV_MacB-x0281/CHIKV_MacB-x0281.pdb,
        sha256: f75ef02d98c108693e85439c82d45d382bfacbb6aaea56496d9e01d85343b81b}
      xtal_mtz: {file: upload_1/crystallographic_files/CHIKV_MacB-x0281/CHIKV_MacB-x0281.mtz,
        sha256: 3a02d732372977ff1e4e688227c23b7aff2f0116f652235d0d61d9b837eb6003}
      ligand_cif: {file: upload_1/crystallographic_files/CHIKV_MacB-x0281/CHIKV_MacB-x0281.cif,
        sha256: 0ea8b733f7c19a354063368a3bb09858ac354f0493b822ba7b37f4604f8f39b5,
        smiles: 'c1csc(-c2nnc[nH]2)n1', ligand_name: LIG}
      ligand_binding_events:
      - {file: upload_1/crystallographic_files/CHIKV_MacB-x0281/CHIKV_MacB-x0281_1_C_304.ccp4,
        sha256: 935eb4e2c845441bf0b3ed3f670607031e339f4c8163083157b7544fa2e7d46a,
        model: '1', chain: C, res: 304, index: 1, bdc: 0.09}
    status: new
    assigned_xtalform: P31-4fold-NCS
    aligned_files:
      C:
        '304':
          '1':
            CHIKV_MacB-x0300+A+401+1:

the above (from meta_aligner.yaml) should produce (note the 'v' for version):

CHIKV_MacB-x0281_C_304_v1 and not CHIKV_MacB-x0281_C_304_1_CHIKV_MacB-x0300+A+401+1
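
As a sketch of that mapping in Python, reading the nesting under aligned_files in the example as chain, then residue, then version. The function name `longcodes` is hypothetical, and the innermost entry is left empty because only the three key levels matter here.

```python
def longcodes(crystal: str, aligned_files: dict) -> list[str]:
    """Build longcodes of the form <crystal>_<chain>_<residue>_v<version>.

    Hypothetical helper: assumes aligned_files is nested as
    chain -> residue -> version, as in the example above.
    """
    return [
        f"{crystal}_{chain}_{residue}_v{version}"
        for chain, residues in aligned_files.items()
        for residue, versions in residues.items()
        for version in versions
    ]


# Keys mirror the example: chain C, residue 304, version 1.
example = {"C": {"304": {"1": {}}}}
print(longcodes("CHIKV_MacB-x0281", example))
# ['CHIKV_MacB-x0281_C_304_v1']
```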

phraenquex commented 1 week ago

The _v1 thing might be a bigger change, because the long code is currently used as a key internally in the loader.

Kalef to investigate if this can be a smallish change (e.g. change the string only at the very end of loader) - if not, leave without the _v.

kaliif commented 1 week ago

_v1 turned out not to be a problem, implemented along with other code format updates.

mwinokan commented 1 week ago

Shortcode and longcode changes look good in staging