pombase / canto

The PomBase community curation tool
https://curation.pombase.org

proposed field name change "short name" #2071

Open ValWood opened 5 years ago

ValWood commented 5 years ago

Can we change the text in the genotype (alternative) "Name" field from "short name" to "alternative name"?

"Short name" is very restrictive.

mah11 commented 5 years ago

Strictly speaking, this is a duplicate of #1321, but we could close the older one because the discussion there went on for ages and round in circles.

Anyway, the field in question is not named "short name". That's just the help text inside the box. The simplest thing would be simply to remove the word "short" from that help text. The field label is "Name:".

I do not like adding "alternative" to the field label at all, because there is nothing alternative about it. It's optional; if it's blank a genotype has no name, and that's fine. I don't think we have any place for synonyms for genotypes (we do for genes or alleles).

ValWood commented 5 years ago

I agree that we should remove "short", and close the other ticket if it is not useful.

But I'm still confused about what this field is for.

If people have used a "specific ~genotype~ allele name", then I would use this name to describe the ~genotype~ allele when I create it. I still don't understand when I would use this additional field?

ValWood commented 5 years ago

I just re-read the ticket and I still don't get what this field was created for. We should talk about this on the next call. It won't be Thursday as I am teaching. Midori is away the following week, so we can discuss this the week after...

kimrutherford commented 5 years ago

> If people have used a "specific genotype name", then I would use this name to describe the genotype when I create it. I still don't understand when I would use this additional field?

I'm not following. Genotypes have a name, a background and a comment field. Which additional field are you talking about?

ValWood commented 5 years ago

The "Name" field, which has "short genotype name" in it.

I do not understand why we would use this. I can see that you might want to add a name to a collection of alleles in a multi-allele genotype, but why would you want to add an (additional) name to a single-allele genotype? I just can't think of a use case, I guess. If we have abundant use cases that's fine, but otherwise I find it really confusing what I would put in here...

ValWood commented 5 years ago

These were the results of the query.

Could you rerun it? Now that the diploids are fixed, we can remove some of these as they are unnecessary (this was a stop-gap; we were subverting the field).

My point is that we probably don't need this field, but we do need the ability to add an allele synonym. Most of these names look like allele or genotype synonyms.

Some are probably "genotype names" and should override the names assigned automagically. Some look like background info.

Some seem fine like https://www.pombase.org/genotype/aap1delta__fma2delta__isp6delta__oma1delta__ppp16delta__psp3delta__sxa2delta

name | value | uniquename
-----+-------+-----------
10GalTdelta | 0e85b84df580612d | PMID:22988247
10GalTdelta och1delta | 0e85b84df580612d | PMID:22988247
1B3B | 74dade3ace393390 | PMID:26687354
3TLSdelta | 2874cbc5bcdadd84 | PMID:17277362
3TLSdelta B | 2874cbc5bcdadd84 | PMID:17277362
5DUB | 7881e95072db6f1d | PMID:26412298
7GalTdelta | 0e85b84df580612d | PMID:22988247
7GalTdelta | a9159ec6909bd6c3 | PMID:21098516
7GalTdelta och1delta | 0e85b84df580612d | PMID:22988247
A2 | 72fdbcb425faf443 | PMID:16802154
A3 | 72fdbcb425faf443 | PMID:16802154
A4-1 | 72fdbcb425faf443 | PMID:16802154
A4-2 | 72fdbcb425faf443 | PMID:16802154
A4-3 | 72fdbcb425faf443 | PMID:16802154
A5 | 72fdbcb425faf443 | PMID:16802154
A6 | 72fdbcb425faf443 | PMID:16802154
A7-1 | 72fdbcb425faf443 | PMID:16802154
A7-2 | 72fdbcb425faf443 | PMID:16802154
A7-3 | 72fdbcb425faf443 | PMID:16802154
A8 | 433b4930c599dc93 | PMID:19669754
A8 | 71662226413c7ee0 | PMID:21153812
A8-vps10 | 433b4930c599dc93 | PMID:19669754
atb2-996/atb2+ heterozygous diploid | c4669f01934c07d6 | PMID:9658169
ccq1∆ | 72c1277f6497035f | PMID:29422503
cdc2-130/cdc2-130 homozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2-1w/cdc2-1w homozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2-56/cdc2-56 homozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-130 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-1w heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-33 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-56 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-L7 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-M26 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-M35 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-M55 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-M63 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-M72 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc2+/cdc2-M76 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
cdc3-1241cdc3-124sop2-1/sop2+ diploid | 7a6602af54fdea54 | PMID:8978670
cdc3-6 | 2598fac4f5efa38a | PMID:958201
cdc3-6 | 4a8e1e2dd8c6d041 | PMID:6490749
cdc3-6 | 5c6727a02c994591 | PMID:22891259
CK1-so | ed7f95ec599f51aa | PMID:25579976
CK1-so, sgo1-delta | ed7f95ec599f51aa | PMID:25579976
cs-cs | 6e175411383ea731 | PMID:3040264
csn1delta htAQ | d81e6d582d5b3b8f | PMID:25795664
ctp1delta rad3delta H2A-AQ | 2ef9911b1699da5e | PMID:21098122
Cut2N73ddm | e01b2f3ba979a8d0 | PMID:9312055
Cut2N73dm1 | e01b2f3ba979a8d0 | PMID:9312055
Cut2N73dm2 | e01b2f3ba979a8d0 | PMID:9312055
deltaBqt1/2-site | cafc26135a141916 | PMID:23133674
deltaCCPC | 9f25f26d17f2ec78 | PMID:25891897
deltaCCPR | 9f25f26d17f2ec78 | PMID:25891897
deltaCCPRC | 9f25f26d17f2ec78 | PMID:25891897
deltaPoz1-site | cafc26135a141916 | PMID:23133674
dis3-54 (P509L) | c757e60bdb79c5cb | PMID:26670050
DKO | 38a7316338b92681 | PMID:12963726
est1delta::kanMX6 | 72c1277f6497035f | PMID:29422503
F15 tsc1-delta | 2697c460cee0ce6a | PMID:16115814
F15 tsc1-delta pas1-OP | 2697c460cee0ce6a | PMID:16115814
F15 tsc2-delta | 2697c460cee0ce6a | PMID:16115814
F15 tsc2-delta pas1-OP | 2697c460cee0ce6a | PMID:16115814
fas1-AT | 1bc92c0f892db627 | PMID:26869222
fas1-AV | 1bc92c0f892db627 | PMID:26869222
fhl1 delta | 4847e0de3cb01075 | PMID:27165118
GA2 | 67026591584779dd | PMID:27392239
git2-7 git2-61 heterozygous diploid | beb013202ef91cd2 | PMID:2157626
H2A-AQ | 2ef9911b1699da5e | PMID:21098122
h2a-R18A | 7d2c05ef7277f909 | PMID:21633354
h2a-R18A,h2a.z-so | 7d2c05ef7277f909 | PMID:21633354
h2a.z-so | 7d2c05ef7277f909 | PMID:21633354
H4.2 K8A K16G | 2c03fd218529dd82 | PMID:24478943
heterozygous diploid | 5964257cca6cfa69 | PMID:7898433
heterozygous diploid cdc13delta/cdc13+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid cdc25delta/cdc25+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid cdc2delta/cdc21+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid cdr1delta/cdr1+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid cpc2delta/cpc2+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid dea2delta/dea2+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid nsp1delta/nsp1+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid nup184delta/nup184+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid nup186delta/nup186+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid nup189delta/nup189+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid nup45delta/nup45+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid nup97delta/nup97+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid pom1delta/pom1+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid ppa2delta/ppa2+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid sal3delta/sal3+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid suc1delta/suc1+ | c7dfdbd8467f6d60 | PMID:27736299
heterozygous diploid wee1delta/wee1+ | c7dfdbd8467f6d60 | PMID:27736299
hta1-S129A,hta2-S128A | d2288b20e67eb62d | PMID:15226425
htAQ | d81e6d582d5b3b8f | PMID:25795664
LA1 | ec78aec4d193db1f | PMID:27837315
mdb1(105-624) | d471b5535e0570fe | PMID:26160178
mmi1delta | c757e60bdb79c5cb | PMID:26670050
nmt1-bag101 | 1912f002dd642ee1 | PMID:27966061
nmt1-bag101-BAG | 1912f002dd642ee1 | PMID:27966061
nmt1-bag101 hsp104delta | 1912f002dd642ee1 | PMID:27966061
nmt1-bag101 nmt1-pdr13 | 1912f002dd642ee1 | PMID:27966061
nmt1-bag101 nmt1-sks2 | 1912f002dd642ee1 | PMID:27966061
nmt1-bag101 Purg1-hsf1 | 1912f002dd642ee1 | PMID:27966061
nmt1-bag101 sks2delta | 1912f002dd642ee1 | PMID:27966061
nmt1-bag101 ssa1delta | 1912f002dd642ee1 | PMID:27966061
nmt1-bag101-UBL | 1912f002dd642ee1 | PMID:27966061
nmt1-bag102 | 1912f002dd642ee1 | PMID:27966061
nmt1-bag102-BAG | 1912f002dd642ee1 | PMID:27966061
nmt1-bag102-UBL | 1912f002dd642ee1 | PMID:27966061
nmt41-cdc13 | 30da765280e4ffe0 | PMID:12419251
otg1delta otg2delta otg3delta | 0e85b84df580612d | PMID:22988247
pap1.C278A | 6e348c63b7404153 | PMID:23525001
pap1.C285A | 6e348c63b7404153 | PMID:23525001
pap1.C532T | 6e348c63b7404153 | PMID:23525001
pCut3-5A,cut3-477 | 77f717dc87015fe5 | PMID:25520186
pCut3-5E-phosphomimetic,cut3-477 | 77f717dc87015fe5 | PMID:25520186
pof8-∆[289-402] | 72c1277f6497035f | PMID:29422503
pof8-∆[390-402] | 72c1277f6497035f | PMID:29422503
pof8-R343A | 72c1277f6497035f | PMID:29422503
pof8∆ rif1∆ | 72c1277f6497035f | PMID:29422503
pof8∆ with TER1 overexpression plasmid | 72c1277f6497035f | PMID:29422503
pof8-Y330A | 72c1277f6497035f | PMID:29422503
pof8Δ::kanMX6 | 72c1277f6497035f | PMID:29422503
poz1∆::natMX6 | 72c1277f6497035f | PMID:29422503
poz1::natMX6 pof8Δ::kanMX6 | 72c1277f6497035f | PMID:29422503
quaduple deletion | d91d901cb0a83596 | PMID:27151298
rad52Δ-D2::LEU2 pof8Δ::kanMX6 | 72c1277f6497035f | PMID:29422503
rap1/taz1 double deletion | 1f8a9a6747848375 | PMID:11676924
rap1::ura4+ | 72c1277f6497035f | PMID:29422503
rap1::ura4+ pof8Δ::kanMX6 | 72c1277f6497035f | PMID:29422503
rec11-5A | ed7f95ec599f51aa | PMID:25579976
rec11-5D | ed7f95ec599f51aa | PMID:25579976
Red1 H637I | f17c3d43d258f9f6 | PMID:21317872
sfr1N | f63b345319dc4c0b | PMID:22405003
smc6-74 3TLSdelta | 2874cbc5bcdadd84 | PMID:17277362
smc6-74 3TLSdelta B | 2874cbc5bcdadd84 | PMID:17277362
sop2-1/ sop2+ diploid | 7a6602af54fdea54 | PMID:8978670
SP1182 | 9442751303a5f327 | PMID:8497322
SP1183 | 9442751303a5f327 | PMID:8497322
SP1184 | 9442751303a5f327 | PMID:8497322
SP1185 | 9442751303a5f327 | PMID:8497322
SP1187 | 9442751303a5f327 | PMID:8497322
(Sst2) | 7881e95072db6f1d | PMID:26412298
sucl-D3 | 24b35cab14100485 | PMID:16453733
sup35-F592S | 466c4197f9cf80c6 | PMID:25519804
switch II | edfddf218bf755ae | PMID:12894167
T167Ecdc2-M63 | 7940bef1ad8a05ee | PMID:9790601
taz1-2::ura4+ | 72c1277f6497035f | PMID:29422503
taz1-2::ura4+ pof8Δ::kanMX6 | 72c1277f6497035f | PMID:29422503
tea2delta homozygous diploid | b19dfb2b27f82b53 | PMID:11018050
ter1∆ with TER1 overexpression plasmid | 72c1277f6497035f | PMID:29422503
TKO | a60531b182edb2d1 | PMID:18653539
trt1-2::his3+ pof8Δ::kanMX6 | 72c1277f6497035f | PMID:29422503
ts-cs | 6e175411383ea731 | PMID:3040264
(U15) | 7881e95072db6f1d | PMID:26412298
(U4) | 7881e95072db6f1d | PMID:26412298
(U5) | 7881e95072db6f1d | PMID:26412298
(U9) | 7881e95072db6f1d | PMID:26412298
wee1-50/wee1-112 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
wee1-50/wee1-1 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
wee1-50/wee1-3 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
wee1-50/wee1-50 homozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
wee1-50/wee1-6 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
wee1+/wee1-112 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
wee1+/wee1-1 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
wee1+/wee1-3 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
wee1+/wee1-50 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
wee1+/wee1-6 heterozygous diploid | 91da8e4c4e7b6d48 | PMID:7262540
xlf1 T180A,S192A | c23b5043b024b0a5 | PMID:25533340
(167 rows)

kimrutherford commented 5 years ago

> but why would you want to add an (additional) name to a single allele genotype

Ah, I missed that you were talking about single allele genotypes. I understand now.

> Could you rerun,

What query is this? Is it genotype names in each session?

ValWood commented 5 years ago

I get it now: this is for a collection of alleles. That's fine, but what is the use case for a single-allele genotype? Here we just repeat the allele name:

ValWood commented 5 years ago

example

So if you run the query again, could you separate single- and multi-allele genotypes? We can probably remove the single-allele ones, and then we do not need to display the field in this case (it should be identical to the name?).
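A rough sketch of that split, assuming each query row carries the genotype name together with the names of the alleles in the genotype. The `GenotypeName` record, its fields and the example rows are hypothetical placeholders, not Canto's schema or real query output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenotypeName:
    """Hypothetical row: a genotype name plus the alleles in that genotype."""
    name: str
    alleles: tuple


def split_by_allele_count(rows):
    """Return (single_allele_rows, multi_allele_rows)."""
    single = [r for r in rows if len(r.alleles) == 1]
    multi = [r for r in rows if len(r.alleles) > 1]
    return single, multi


def repeats_allele_name(row):
    """True when a single-allele genotype name just repeats the allele name."""
    return len(row.alleles) == 1 and row.name.strip() == row.alleles[0]


# Placeholder rows for illustration only (yfg = "your favourite gene").
rows = [
    GenotypeName("yfg1-1", ("yfg1-1",)),
    GenotypeName("strain-X", ("yfg1-1",)),
    GenotypeName("double", ("yfg1delta", "yfg2delta")),
]
single, multi = split_by_allele_count(rows)
redundant = [r.name for r in single if repeats_allele_name(r)]
```

Names that end up in `redundant` would be the obvious candidates for removal; the remaining single-allele names would still need a manual look.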

ValWood commented 5 years ago

Query is Dropbox/pombase/Chado/queries/all_genotype_names-2018-03-01.txt from https://github.com/pombase/canto/issues/1321

kimrutherford commented 5 years ago

> what is the use case for single allele genotype

Would there ever be a name in a paper for a single allele+expression change genotype? Or for a single allele + background change genotype?

ValWood commented 5 years ago

I don't think we would want to give them a specific name (i.e. nobody names them).

Well, they probably have available genotype names which include the background, but we don't want to record these anyway, because we record the background (occasionally) separately.

kimrutherford commented 5 years ago

> so if you run the query again could you separate single and multi allele

The query results are here:
Dropbox/pombase/curation_tool/queries/single_allele_genotype_names-2019-10-16.txt
Dropbox/pombase/curation_tool/queries/multi_allele_genotype_names-2019-10-16.txt

> I don't think we would want to give them a specific name.

OK, sounds like genotype names for single alleles should go. Would it help to hide the name field in that case unless you're an admin? We could do that straight away. That would stop community curators adding them.
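To make the proposal concrete, the visibility rule could look something like the sketch below. The function name and its arguments are hypothetical and this is not Canto's actual code; it only states the rule being discussed.

```python
def show_genotype_name_field(allele_count: int, is_admin: bool) -> bool:
    """Hypothetical rule: always show the name field for multi-allele
    genotypes, but show it for single-allele genotypes only to admins."""
    if allele_count > 1:
        return True
    return is_admin
```

Under this rule a community curator editing a single-allele genotype would not see the field at all, while existing names would stay visible to admins for clean-up.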

mah11 commented 5 years ago

@ValWood

> If people have used a "specific genotype name", then I would use this name to describe the genotype when I create it.

This perfectly describes what the genotype name field is for.

> ... why would you want to add an (additional) name to a single allele genotype?

Then don't use it for single-allele genotypes - it is optional!

@kimrutherford

> Would there ever be a name in a paper for a single allele+expression change genotype? Or for a single allele + background change genotype?

I can't be 100% sure there are none, but I can live without capturing them if everyone else thinks it's not useful to have names for any single-allele genotypes.

ValWood commented 5 years ago

I think I favour blocking this field for single allele genotypes because alternative single allele names without expression should just override the default name (and I am betting my bottom $ that none of the current uses of this field include expression information).

My main issue with this field is that its true intention is not obvious to users or curators (other than Midori).

So I would like to block its use for single allele genotypes (I think) but first I would like to see the list. Unfortunately, I can't access the Dropbox file (my Dropbox is not updating).

Could you paste the single allele list here? I don't think it's very long. If they look like alternative allele names we can fix them in the session. Ones which refer to diploid can be deleted if the diploid has been captured properly.

Cheers

kimrutherford commented 5 years ago

> Could you paste the single allele list here?

I've attached it:

single_allele_genotype_names-2019-10-16.txt

mah11 commented 5 years ago

> I think I favour blocking this field for single allele genotypes because alternative single allele names without expression should just override the default name

This doesn't make sense. Why (and how) would an allele name ever override a genotype name?

> (and I am betting my bottom $ that none of the current uses of this field include expression information)

I'm sure that's not what Kim meant in https://github.com/pombase/canto/issues/2071#issuecomment-542461419 - it's not that anyone would necessarily put expression details in the genotype name. It means that you could assign a name to a genotype that consists of one allele plus its expression level, or plus a background, and give a different name when the allele comes with different expression and/or background:

"Fred" = yfg1-1 knockdown
"Ginger" = yfg1-1 overexpression
"Ralph" = yfg1-1 overexpression, cdc25-22 background

I still don't mind whether we allow or disallow genotype names for single-allele genotypes. I do want to try to clear up any confusion about what we're talking about before we decide.
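The Fred/Ginger/Ralph example can be written down as a small sketch, assuming a genotype is identified by allele, expression and background together. The `SingleAlleleGenotype` type and `names_for_allele` helper are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional


class SingleAlleleGenotype(NamedTuple):
    """Hypothetical key: one allele plus its expression and background."""
    allele: str
    expression: str
    background: Optional[str] = None


# The three names from the example above, keyed by the whole genotype.
names = {
    SingleAlleleGenotype("yfg1-1", "knockdown"): "Fred",
    SingleAlleleGenotype("yfg1-1", "overexpression"): "Ginger",
    SingleAlleleGenotype("yfg1-1", "overexpression", "cdc25-22"): "Ralph",
}


def names_for_allele(allele):
    """All genotype names whose genotype contains the given allele."""
    return sorted(n for g, n in names.items() if g.allele == allele)
```

Here `names_for_allele("yfg1-1")` returns all three names, which is the point of the example: a genotype name belongs to the allele plus its context, so it cannot simply be treated as a synonym of the allele.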

ValWood commented 5 years ago

My point is that the field has mainly been used for alternative allele names and NOT for genotype names. Look at the list: most of these should be the primary name for the allele (i.e. override the default name, or be an allele synonym):

a463dbfc1c41ab45 PMID:22298427 gef2Δ
6e348c63b7404153 PMID:23525001 pap1.C278A
6e348c63b7404153 PMID:23525001 pap1.C285A
6e348c63b7404153 PMID:23525001 pap1.C532T
cafc26135a141916 PMID:23133674 deltaPoz1-site
cafc26135a141916 PMID:23133674 deltaBqt1/2-site
7b30a1cc7d87ace2 PMID:24478458 Nes1*
7d2c05ef7277f909 PMID:21633354 h2a.z-so
24b35cab14100485 PMID:16453733 sucl-D3
729fd40714dad360 PMID:25771684 scp1-M5
d471b5535e0570fe PMID:26160178 mdb1(105-624)
466c4197f9cf80c6 PMID:25519804 sup35-F592S
c23b5043b024b0a5 PMID:25533340 xlf1 T180A,S192A
ed7f95ec599f51aa PMID:25579976 Puhp1-HA-hhp1
ed7f95ec599f51aa PMID:25579976 rec11-5A
ed7f95ec599f51aa PMID:25579976 rec11-5D

etc...

I will fix the ones in the list when I get a chance, and see how many real examples of genotype names there are in here. If there are only a few, I'm not sure it is worth having the field available for single allele genotypes. It's just confusing. See above...

ValWood commented 4 years ago

Actions

  1. change the text in the field from "short name" to "preferred genotype name"

  2. remove this option for single gene genotypes

  3. provide a list of where it has been used for single allele genotypes, and where appropriate record instead as allele synonyms.

Does that work for everyone?

mah11 commented 4 years ago

Well, I don't love item 2 (per https://github.com/pombase/canto/issues/2071#issuecomment-542461419 and https://github.com/pombase/canto/issues/2071#issuecomment-543079886), but I am choosing not to die on that hill.

(It also has consequences for item 3: a name for single allele + its expression isn't necessarily a synonym for the allele alone.)

ValWood commented 4 years ago

As far as I can see single allele + its expression hasn't been used (but we can check that once we have the list). The main point is to move those which are allele synonyms to the allele synonym field and see what is left.

Let's do this part first. It might make actions 1 and 2 clearer...

ValWood commented 4 years ago

I can't remember what we decided here, so we might need to run through it again. It is also causing issues in PHI-base (used for allele synonyms, not genotype, which causes more problems than it solves if the genotype has multiple versions with different expression and background).

jseager7 commented 4 years ago

> I can't remember what we decided here

There were some action items proposed in a comment here – https://github.com/pombase/canto/issues/2071#issuecomment-557046062 – but I don't think anything has been implemented in Canto yet.

> It is also causing issues in PHI-base (used for allele synonyms, not genotype, which causes more problems than it solves if the genotype has multiple versions with different expression and background)

So the 'Name' field of a single-allele genotype is being used to list synonyms of the name of the single allele contained within it? That definitely doesn't sound like a good idea. If it's really necessary to include allele synonyms then we should have a separate field for capturing the synonyms (probably on the allele itself, and not the genotype).

ValWood commented 4 years ago

> If it's really necessary to include allele synonyms then we should have a separate field for capturing the synonyms (probably on the allele itself, and not the genotype).

We do have this but it is a bit hidden

kimrutherford commented 3 years ago

> 3. provide a list of where it has been used for single allele genotypes, and where appropriate record instead as allele synonyms.

Is that needed for pombe or for PHI-Canto? For pombe, I attached the list in a previous comment: https://github.com/pombase/canto/issues/2071#issuecomment-542934799