Open turbomam opened 2 years ago
Some string serializations are really just lists and could be re-implemented as enumerations
slot | string_serialization |
---|---|
aero_struc | [plane|glider] |
built_struc_set | [urban|rural] |
ceil_struc | [wood frame|concrete] |
contam_screen_input | [reads| contigs] |
detec_type | [independent sequence (UViG)|provirus (UpViG)] |
fireplace_type | [gas burning|wood burning] |
heat_sys_deliv_meth | [conductive|radiant] |
host_dependence | [facultative|obligate] |
seq_quality_check | [none|manually edited] |
shading_device_loc | [exterior|interior] |
space_typ_state | [typically occupied|typically unoccupied] |
sym_life_cycle_type | [complex life cycle | simple life cycle] |
urine_collect_meth | [clean catch|catheter] |
wga_amp_appr | [pcr based|mda based] |
window_status | [closed|open] |
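As a rough sketch of the conversion proposed here, the following Python splits a bracketed, pipe-delimited serialization into candidate permissible values for an enumeration. The function name `serialization_to_values` is hypothetical. Stripping whitespace around each value is an assumption, based on rows such as `[reads| contigs]` where the spacing looks incidental.

```python
import re

# Matches a whole serialization of the form "[a|b|c]".
# Strings containing curly-brace tokens or nested brackets are rejected.
LIST_LIKE = re.compile(r"^\[([^\[\]{}]+)\]$")


def serialization_to_values(serialization):
    """Return the literal alternatives, or None if the string is not list-like."""
    match = LIST_LIKE.match(serialization.strip())
    if match is None:
        return None
    values = [value.strip() for value in match.group(1).split("|")]
    # Require at least two non-empty alternatives to call it an enumeration.
    if len(values) < 2 or any(not value for value in values):
        return None
    return values
```

For example, `serialization_to_values("[plane|glider]")` yields the two values `plane` and `glider`, while a tokenized serialization such as `{float} {unit}` returns `None` and would be left alone.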
string_ser | counts |
---|---|
{text} | 239 |
{float} | 90 |
{unit} | 86 |
{[termID]} | 75 |
{termLabel} | 74 |
{URL} | 35 |
{PMID} | 34 |
{DOI} | 34 |
{integer} | 30 |
{Rn/start_time/end_time/duration} | 26 |
{boolean} | 15 |
{version} | 13 |
{software} | 11 |
{duration} | 11 |
{parameters} | 9 |
{term} | 8 |
{timestamp} | 7 |
{dna} | 7 |
{PMID|DOI|URL} | 3 |
{period} | 2 |
{term label} | 2 |
{NCBI taxid} | 2 |
{rank name} | 2 |
{database} | 2 |
{clustering method} | 1 |
{AF cutoff} | 1 |
{ANI cutoff} | 1 |
{PID} | 1 |
{{text} | 1 |
{day} | 1 |
{term ID} | 1 |
{measurement value} | 1 |
{percentage} | 1 |
{reference} | 1 |
{interval} | 1 |
{has numeric value} | 1 |
{has unit} | 1 |
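A tally like the one above could be produced by pulling every `{...}` token out of each serialization and counting them. This is only a sketch: the regular expression and the function name `count_tokens` are assumptions, not the method actually used to build the table. In particular, this pattern would report a malformed `{{text}` as a plain `{text}`.

```python
import re
from collections import Counter

# A token is an opening brace, any run of non-brace characters, and a closing brace.
TOKEN = re.compile(r"\{[^{}]*\}")


def count_tokens(serializations):
    """Count how often each {token} appears across all serializations."""
    counts = Counter()
    for serialization in serializations:
        counts.update(TOKEN.findall(serialization))
    return counts
```

Calling `count_tokens(...).most_common()` on the full list of serializations would give rows ordered by frequency, as in the table.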
Non-alphanumeric characters found in the tokens above, also not including whitespace
count | char | notes |
---|---|---|
2 | _ | separates words in a token's name |
1 | [ | literal used with term IDs, like mountain [ENVO:12345678] |
1 | ] | literal used with term IDs, like mountain [ENVO:12345678] |
38 | { | wraps token. Also, see sieving below |
37 | } | wraps token |
3 | / | delimits sub-tokens in {Rn/start_time/end_time/duration} |
2 | | | delimits alternative tokens for literature references |
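The character counts above appear consistent with tallying every non-alphanumeric, non-whitespace character across the distinct token strings; that reading is an assumption. A minimal sketch of such a tally, with the hypothetical function name `count_special_chars`:

```python
from collections import Counter


def count_special_chars(tokens):
    """Count characters that are neither alphanumeric nor whitespace."""
    counts = Counter()
    for token in tokens:
        for char in token:
            if not char.isalnum() and not char.isspace():
                counts[char] += 1
    return counts
```

Comparing the counts for `{` and `}` is a quick way to spot an unbalanced token such as `{{text}`.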
{{text}|{float} {unit}};{float} {unit}
{PMID|DOI|URL}
plant_part_maturity
geo_loc_name, pos_cont_type, host_of_host_pheno
microb_start, plant_part_maturity
Is this all standardized now, or is there outstanding work vis-à-vis MIxS or LinkML?
Good question, @ddooley. @turbomam, could you provide an update?
A string serialization of '{float} {unit}' implies that there are float and unit classes.
See also LinkML issue https://github.com/linkml/linkml/issues/674
Switch to LinkML structured patterns
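To illustrate the general idea behind a structured pattern, here is a minimal Python sketch that expands a serialization such as `{float} {unit}` into a regular expression by substituting a named sub-pattern for each token. It is not LinkML's implementation. The `SETTINGS` dictionary, both sub-patterns, and the function name `compile_structured_pattern` are hypothetical, and the text outside the braces is assumed to already be valid regex.

```python
import re

# Hypothetical named sub-patterns; a real schema would define its own.
SETTINGS = {
    "float": r"[-+]?\d+(?:\.\d+)?",
    "unit": r"\S+",
}


def compile_structured_pattern(syntax, settings):
    """Replace each {name} in syntax with its sub-pattern and compile the result."""
    def expand(match):
        # Raises KeyError if the syntax names a token with no definition.
        return "(?:" + settings[match.group(1)] + ")"

    return re.compile(re.sub(r"\{(\w+)\}", expand, syntax))


float_unit = compile_structured_pattern("{float} {unit}", SETTINGS)
```

With these assumed sub-patterns, `float_unit.fullmatch("12.5 cm")` succeeds, while a value with no unit or with the parts reversed does not match.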
See also