Open mwinokan opened 2 months ago
On closer look, this isn't as straightforward as I thought: canon sites and site observations are not connected one-to-one; many observations map to one canon site. Using the example above, canon sites:
frag=> select id, name, canon_site_num, superseded, version from viewer_canonsite where name ilike 'A71EV2A-x0395+A+148';
 id |        name         | canon_site_num | superseded | version
----+---------------------+----------------+------------+---------
 56 | A71EV2A-x0395+A+148 |              3 | f          |       1
(1 row)
Site observations for the same canon site:
frag=> select id, code, longcode from viewer_siteobservation where canon_site_conf_id in (select id from viewer_canonsiteconf where canon_site_id in (select id from viewer_canonsite where name ilike 'A71EV2A-x0395+A+148'));
id | code | longcode
-----+---------+---------------------------------------------
600 | Ax0497a | A71EV2A-x0497_A_147_1_A71EV2A-x0395+A+148+1
606 | Ax0515b | A71EV2A-x0515_A_147_1_A71EV2A-x0395+A+148+1
620 | Ax0269a | A71EV2A-x0269_A_147_1_A71EV2A-x0395+A+148+1
628 | Ax0152a | A71EV2A-x0152_A_201_1_A71EV2A-x0395+A+148+1
631 | Ax0194a | A71EV2A-x0194_A_147_1_A71EV2A-x0395+A+148+1
632 | Ax0202a | A71EV2A-x0202_A_147_1_A71EV2A-x0395+A+148+1
645 | Ax0341a | A71EV2A-x0341_A_147_1_A71EV2A-x0395+A+148+1
654 | Ax0375c | A71EV2A-x0375_A_147_1_A71EV2A-x0395+A+148+1
657 | Ax0395a | A71EV2A-x0395_A_147_1_A71EV2A-x0395+A+148+1
658 | Ax0395b | A71EV2A-x0395_A_148_1_A71EV2A-x0395+A+148+1
673 | Ax0831a | A71EV2A-x0831_A_147_1_A71EV2A-x0395+A+148+1
679 | Ax0875b | A71EV2A-x0875_A_147_1_A71EV2A-x0395+A+148+1
690 | Ax1105a | A71EV2A-x1105_A_147_1_A71EV2A-x0395+A+148+1
691 | Ax1105b | A71EV2A-x1105_A_148_1_A71EV2A-x0395+A+148+1
692 | Ax1109a | A71EV2A-x1109_A_147_1_A71EV2A-x0395+A+148+1
695 | Ax1145a | A71EV2A-x1145_A_201_1_A71EV2A-x0395+A+148+1
699 | Ax1148b | A71EV2A-x1148_A_147_1_A71EV2A-x0395+A+148+1
707 | Ax1293a | A71EV2A-x1293_A_151_1_A71EV2A-x0395+A+148+1
(18 rows)
frag=>
@phraenquex is there a way to select which shortcode to use?
Thanks @kaliif, I'm not sure there will be a reliable way to select an observation. My alternative suggestion:
Can we shorten the Canonical Site name as follows:
A71EV2A-x0395+A+148
to Ax0395+A+148
essentially replacing {target_name}-
with {short_code_prefix}
(the same prefix that's used to generate observation shortcodes)?
I think that's in line with what @phraenquex wanted, i.e. remove redundant (target) information from the default tag names
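A minimal sketch of the renaming proposed above, assuming the canon site name always starts with `{target_name}-`. The function name and its parameters are hypothetical and not taken from the loader:

```python
def shorten_canon_site_name(name: str, target_name: str, short_code_prefix: str) -> str:
    """Replace a leading '{target_name}-' with '{short_code_prefix}'."""
    head = f"{target_name}-"
    if name.startswith(head):
        return short_code_prefix + name[len(head):]
    # Names that do not start with the target name are left untouched.
    return name


print(shorten_canon_site_name("A71EV2A-x0395+A+148", "A71EV2A", "A"))
# Ax0395+A+148
```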
@mwinokan the code prefix is defined separately for each experiment and while it is usually the same (I don't think there's an example yet where they're not), it's not guaranteed to be. What's the protocol for when I have multiple prefixes across a single canon site?
@kaliif: @ConorFWild says: each canonical site has an associated reference observation, which is a unique association. You use the shortcode of THAT observation for this tag.
Conor says it is (should be) in canonical_sites.yaml.
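A rough sketch of that rule: pick the shortcode of the one observation marked as the canon site's reference. The classes and the `reference_longcode` field are hypothetical stand-ins, not the actual models or the canonical_sites.yaml layout, and the reference chosen in the example is for illustration only:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    code: str      # shortcode, e.g. "Ax0395a"
    longcode: str  # full observation name


@dataclass
class CanonSite:
    name: str
    observations: List[Observation]
    # Hypothetical field: longcode of the reference observation,
    # assumed to have been read from canonical_sites.yaml.
    reference_longcode: str


def reference_shortcode(site: CanonSite) -> str:
    """Return the shortcode of the site's single reference observation."""
    matches = [o.code for o in site.observations
               if o.longcode == site.reference_longcode]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one reference observation for {site.name}, "
            f"found {len(matches)}"
        )
    return matches[0]


site = CanonSite(
    name="A71EV2A-x0395+A+148",
    observations=[
        Observation("Ax0395a", "A71EV2A-x0395_A_147_1_A71EV2A-x0395+A+148+1"),
        Observation("Ax0395b", "A71EV2A-x0395_A_148_1_A71EV2A-x0395+A+148+1"),
    ],
    # Illustrative choice only; the real reference comes from the XCA output.
    reference_longcode="A71EV2A-x0395_A_147_1_A71EV2A-x0395+A+148+1",
)
print(reference_shortcode(site))
```

Raising when there is not exactly one match keeps the "unique association" assumption explicit rather than silently picking the first observation.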
Thanks @mwinokan. That takes care of canon sites and conformed sites. That leaves crystalform sites: they also use the target name in the tag, but they don't have a reference site (db schema). What are your thoughts on which prefix to use?
@kaliif CrystalformSites look ok (no target name)
@mwinokan interesting, I'm seeing them like this:
frag=# select id, upload_name from viewer_siteobservationtag where category_id = 3;
id | upload_name
----+--------------------------------
29 | F1a - CHIKV_MacB-x0270/A/304/1
30 | F1b - CHIKV_MacB-x0692/D/304/1
31 | F1c - CHIKV_MacB-x1123/C/401/1
32 | F1d - CHIKV_MacB-x0756/C/401/1
33 | F1e - CHIKV_MacB-x1123/B/310/1
34 | F1f - CHIKV_MacB-x1132/D/307/1
35 | F1g - CHIKV_MacB-x1134/B/308/1
36 | F1h - CHIKV_MacB-x0294/A/501/1
37 | F1i - CHIKV_MacB-x1125/A/402/1
38 | F1j - CHIKV_MacB-x1125/C/401/1
39 | F1k - CHIKV_MacB-x1134/A/304/1
40 | F1l - CHIKV_MacB-x1134/B/309/1
41 | F1m - CHIKV_MacB-x1316/C/304/1
But if you say they're fine, I'm happy to merge this to staging
Please merge it @kaliif and then I can verify
@mwinokan to verify with a new upload on Kalev's stack (together with #1492)
@kaliif the xtalform_sites.yaml can be used to map CrystalFormSites to observations:
The keys in xtalform_sites.yaml are in the form e.g. CHIKV_MacB-x0294/A/501/1
where the slashes need to be replaced with pluses to map to the main observation associated with the CrystalFormSite
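A one-line sketch of that key mapping (the function name is hypothetical):

```python
def xtalform_key_to_observation_key(key: str) -> str:
    """Turn an xtalform_sites.yaml key into the matching observation key."""
    return key.replace("/", "+")


print(xtalform_key_to_observation_key("CHIKV_MacB-x0294/A/501/1"))
# CHIKV_MacB-x0294+A+501+1
```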
@kaliif please preserve three versions of the XCA generated tags in the backend:
@ConorFWild @mwinokan what do I do in case the crystalform site key (slashes replaced) is not found in any observation key? For example in CHIKV version 1 upload these would be
CHIKV_MacB-x0270/A/304/1
CHIKV_MacB-x0756/C/401/1
CHIKV_MacB-x1123/C/401/1
I see that all crystalform sites have a foreign key to canon site; would it be OK to follow this relation instead, i.e. crystalform site -> canon site -> ref site observation?
@kaliif yes that seems sensible
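A sketch of the lookup with that fallback, assuming the observation keys are available as a set and the canon site's reference observation has already been resolved. All names and the placeholder value are hypothetical:

```python
from typing import Set


def resolve_xtalform_observation(
    xtalform_key: str,
    observation_keys: Set[str],
    canon_site_ref_observation: str,
) -> str:
    """Prefer the direct key match; otherwise fall back via the canon site."""
    candidate = xtalform_key.replace("/", "+")
    if candidate in observation_keys:
        return candidate
    # crystalform site -> canon site -> ref site observation
    return canon_site_ref_observation


known = {"CHIKV_MacB-x0294+A+501+1"}
fallback = "<ref observation of the canon site>"  # placeholder, not real data

print(resolve_xtalform_observation("CHIKV_MacB-x0294/A/501/1", known, fallback))
print(resolve_xtalform_observation("CHIKV_MacB-x0270/A/304/1", known, fallback))
```

The first call finds a direct match; the second uses the fallback because the converted key is not among the known observation keys.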
The shortened tags are now in the db, but they're not being pulled into metadata.csv yet. I made significant changes to metadata handling in the bulk tag issue and I'd prefer to wait until that is merged, otherwise I'm going to have a conflict
E.g. the loader names the CanonSite by default as:
A71EV2A-x0395+A+148+1
But this could be simplified by using the observation shortcode:
Ax0395a+148+1
Please @phraenquex double check that this is the correct new syntax
@kaliif says it is a simple fix