Closed kimrutherford closed 3 years ago
How do they look right now?
How do they look right now?
Like normal multi-allele genotypes. The diploids are kind-of flattened out.
We should discuss this. Do we currently display diploids at all? (I can't remember). We have quite a few now I think? I was just curating some now, which made me wonder.
@kimrutherford Could you let us know how many are curated. This will give us an idea about how we should prioritise.
Do we currently display diploids at all?
I haven't worked out a good way to store the diploids in Chado: pombase/pombase-chado#734
I need to get back to that.
Could you let us know how many are curated. This will give us an idea about how we should prioritise.
There are 554 diploids spread over 104 sessions.
I haven't worked out a good way to store the diploids in Chado: pombase/pombase-chado#734
Sorry I've double checked and I had this wrong. I thought we weren't displaying diploids at all. Actually we're showing them but in a naff way.
We're showing the two alleles of the diploid but in a way that makes it look like it's a genotype with two loci. Here's an example: https://curation.pombase.org/pombe/curs/1cfea3361eac50c/ro/ https://www.pombase.org/reference/PMID:2792737
I was right that I haven't thought of a good way to store the diploids in Chado.
Once Chado and the web code can cope with diploids how should the genotype page look when it's a diploid genotype? Currently it looks like: https://www.pombase.org/genotype/map1-mutant-unknown-unknown-expression-not_assayed__map1-mutant-unknown-unknown-expression-not_assayed which isn't great.
FlyBase store genotypes and phenotype in a very different way to us that would make other things more difficult.
Can it be something very simple like a flag? So we would just have a diploid flag on a single or multi allele to mark them to be displayed in a separate section. Or is it much more complicated?
Can it be something very simple like a flag?
I think that's how we'll have to do it. Maybe a property on each allele that is part of a diploid so we know it's part of a diploid. I'll sleep on it and see what I can do tomorrow.
It's not a huge hurry, since they are displayed people can see them and will know what they mean even without their own section.
One simple improvement would be to make this slightly more obvious at first glance by using the standard diploid notation
map1-mutant (unknown) / map1-mutant (unknown) instead of
map1-mutant (unknown) map1-mutant (unknown) (which could be interpreted as 2 copies of a mutant allele in a haploid, especially for heterozygotes)
One simple improvement would be to make this slightly more obvious at first glance by using the standard diploid notation
Unfortunately that's not possible yet. The website code can't tell which alleles are part of diploids because that information isn't stored in Chado. The web code just knows which alleles are part of which genotype. We need to store the diploid details in Chado first.
Maybe a property on each allele that is part of a diploid so we know it's part of a diploid.
Now I've looked at this properly I remember the fiddly bit. Each allele can be part of multiple genotypes and could be a diploid in some case but not others. That can be represented in Chado but as I say it's a bit fiddly.
So it isn't possible to just add a diploid flag to diploid genotypes? or is that over-simplistic?
So it isn't possible to just add a diploid flag to diploid genotypes?
That would work if all the diploid genotypes had a single locus. If there are 2 or more loci we need to record in Chado which allele is part of which locus. I think that's possible. I'm having a think about how to implement it in the least hacky way.
Yes - sometimes they are multi allele diploids.
Should we rename the "Single allele phenotype" section of the gene pages to "Single locus phenotype"?
Or should all diploid genotypes go into the "Multi-allele phenotype" section?
Should we rename the "Single allele phenotype" section of the gene pages to "Single locus phenotype"?
I think this makes most sense. At least, I would prefer to see single locus diploids in the same table as single alleles.
@mah11 ?
We need to decide how to format the allele table at the top of genotype pages when the genotype is diploid.
Here's an easy example from: https://curation.pombase.org/pombe/curs/1534156512e84afb/ro/
ecl1Δ / ecl1Δ
ecl2Δ / ecl2Δ
ecl3Δ / ecl3Δ
Currently displayed as: https://www.pombase.org/genotype/ecl1delta__ecl1delta__ecl2delta__ecl2delta__ecl3delta__ecl3delta but there's an obvious we to format it nicely.
This is a trickier one from: https://curation.pombase.org/pombe/curs/1ebe4a1548de4aff/feature/genotype/view/14/ro
This one could be:
ecl1Δ /ecl1Δ, ecl2Δ/ecl2Δ ,ecl3Δ /ecl3Δ
Gene | Allele name | Description | Type | Expression |
---|---|---|---|---|
ecl1 | ecl1Δ | deletion | Null | |
ecl2 | ecl2Δ | deletion | Null | |
ecl3 | ecl3Δ | deletion | Null |
Perhaps - no need to repeat the dupicated alleles in the list?
For the second example:
dot2-439(Q33->stop)/dot2-439(Q33->stop), nmt81-pcp1(nmt81-pcp1+)[Knockdown]/nmt81-pcp1(nmt81-pcp1+)[Knockdown]
(so if there is line wrapping, split on the diploid allelle pair )
Do we need to display the type ("nonsense_mutation"/"other")?
on annotation table names:
Should we rename the "Single allele phenotype" section of the gene pages to "Single locus phenotype"?
OK
Or should all diploid genotypes go into the "Multi-allele phenotype" section?
No; I'd prefer to rename the "multi" table to "Multi-locus phenotypes".
I think it's sensible to split the display into two tables (single- & multi-locus, each with haploids and diploids included). It would not make sense to have a table just for single-locus haploids unless it was one of four tables (a separate one for each combinatorial possibility) and I don't see any enthusiasm for that.
genotype page displays:
ecl1Δ/ecl1Δ, ecl2Δ/ecl2Δ, ecl3Δ/ecl3Δ dot2-439(Q33->stop)/dot2-439(Q33->stop), nmt81-pcp1(nmt81-pcp1+)[Knockdown]/nmt81-pcp1(nmt81-pcp1+)[Knockdown]
Yes, the header should include the /
between alleles of each locus (without spaces).
Perhaps - no need to repeat the duplicated alleles in the list?
I agree, it looks a bit silly, especially since the table doesn't group the two alleles of any locus visually. For heterozygous diploids there'll still be two entries because the alleles will be different. As long as the table is just meant to be a list of the alleles present in the genotype that'll do. (I'm thinking about whether it's worth explicitly saying "homozygous" or "heterozygous" somewhere but I don't think it's hugely important.)
Do we need to display the type ("nonsense_mutation"/"other")?
In the header, no. In the table, yes -- we should show it for info, and because we include it on the pages for haploid genotypes. Consistency, ya know.
Could one of you make a mock-up of how the table should look in the diploid case? Thanks.
The change to the genotype display names is now live. eg. https://www.pombase.org/genotype/nmt81-pcp1-nmt81-pcp1%2B-other-expression-knockdown_nmt81-pcp1-nmt81-pcp1%2B-other-expression-knockdown__dot2-439-Q33-%3Estop-nonsense_mutation-expression-not_assayed_dot2-439-Q33-%3Estop-nonsense_mutation-expression-not_assayed
I need to tweak the URLs for the genotype pages so if you get a 404 error from that link try from the publication page: https://www.pombase.org/reference/PMID:15992541
For now all the diploid genotype go into the "Multi-allele phenotype" sections.
Here are some more examples: https://www.pombase.org/reference/PMID:19911044
Could one of you make a mock-up of how the table should look in the diploid case?
Eurgh, here's a go, using an example based on this one, with a change to include a heterozygous locus in the mock-up...
One other idea: maybe leave some details out of the header? e.g.
mek1Δ/mek1Δ mre11-D65N/mre11-D65N rad50S/rad50+
or, with everything (as they currently look), and including a table:
mek1Δ/mek1Δ mre11-D65N(aa)/mre11-D65N(aa) rad50S(K81I aa)/rad50+
Diploid genotype
Locus | Allele name | Description | Type | Expression |
---|---|---|---|---|
mek1 | mek1Δ | deletion | Null | |
mek1Δ | deletion | Null | ||
mre11 | mre11-D65N | D65N aa | amino_acid_mutation | Not assayed |
mre11-D65N | D65N aa | amino_acid_mutation | Not assayed | |
rad50 | rad50S | K81I aa | amino_acid_mutation | Not assayed |
rad50+ | wild_type | Wild type product level |
another take on the table, done in a word processor & saved as .png:
to group the alleles of one locus more clearly than I could do in markdown
It seems less bad to show two copies of the same allele if it's clear that they're included as the two copies of one locus.
Yes, the shading helps.
Thanks Midori. I'll have a go at that.
Just to be clear, should we keep the table the same as now in the haploid case?
should we keep the table the same as now in the haploid case?
I suppose so ... it does the trick and nobody complains about it.
another take on the table, done in a word processor & saved as .png:
I've done that. The change will be in the main site in the morning. I haven't changed the header yet.
The change will be in the main site in the morning.
I haven't updated the main site because I want to hear your opinion about: https://github.com/pombase/pombase-chado/issues/824#issuecomment-810759099
This looks good to me.
I notice on the gene page that single diploids are included in multi allele table. Do we plan to move these into and rename as 'single locus table'? Also, dot1 might be a good example to use for the announcement when it happens, but we should review the existing genotypes to check whether some should be diploids first. I just swapped over https://www.pombase.org/reference/PMID:11901107 to heterozygous diploids.
I notice on the gene page that single diploids are included in multi allele table. Do we plan to move these into and rename as 'single locus table'?
We can do that. I think we're all agreed: https://github.com/pombase/website/issues/1328#issuecomment-808125933
Should I go ahead?
We can do that. I think we're all agreed: #1328 (comment) Should I go ahead?
I thought we had agreed. I just couldn't remember which parts applied to the website and which to the curation tool ;)
I don't see why not...
another take on the table, done in a word processor & saved as .png:
I've done that.
That looks nice!
I thought we had agreed. I just couldn't remember which parts applied to the website and which to the curation tool ;)
Website agreed in this very ticket, e.g. this comment; Canto here.
I don't see why not...
It will appear as "Single locus phenotype" / "Multi-locus phenotype" everywhere, including the left hand menu and in the query builder. We'll need to update the documentation to match.
Is that all OK?
Yeah, I'll cope. I'll put the documentation on my post-break to-do list.
I'll put the documentation on my post-break to-do list.
Great. Thanks.
I've made all the necessary changes locally. I'll hold off deploying things until you have a chance to change the documentation. There's no rush at all because next week I need to get back to working on Canto.
After this is done, does it make sense to add a query builder option to constrain phenotype searches by diploid/haploid/either? We can already constrain by single/mutli allele (although that's about to change to single/multi locus).
I guess it could be useful. At that point we could announce the ability to curate diploids properly. That we display then, and that they can be queried. Also ask people to let us know if they spot any diploid alleles curated as haploid..
I guess it could be useful.
OK, I'll add that at the same time as changing all the sections to single/multi locus.
Let me know when it's done so I can get screenshots for the documentation :)
This should be ready for testing on Monday. I'll release the changes in the dev site and delay updating of the main site until we're happy with the changes and the documentation is OK.
If the "Diploid" box isn't ticked should we label the check boxes as "Single allele"/"Multi-allele"? Or is it more consistent to stick to "Single locus"/"Multi-locus"?
Also should we change the "Expression level:" label to "Expression level for single allele haploid:" to be more explicit?
The dev site is updated to have the single/multi locus spilt. Let me know if you spot any problems. I won't update the main site with these changes until you're happy with things and until the documentation is ready.
Search stuff ...
Big picture: looks good, and seems to be working.
I don't love that the hyphen is missing from "multi-locus" in the query description (in the history; it's been fixed elsewhere), but I guess that's me being finicky and pedantic.
If the "Diploid" box isn't ticked should we label the check boxes as "Single allele"/"Multi-allele"? Or is it more consistent to stick to "Single locus"/"Multi-locus"?
I don't have strong feelings, but I'm a bit inclined towards leaving it as is for consistency.
Also should we change the "Expression level:" label to "Expression level for single allele haploid:" to be more explicit?
Can't hurt.
Page display stuff ...
The main thing I've spotted is that haploid and diploid genotypes are interspersed in the phenotype annotation tables. Is that intentional? I thought we envisioned having all of the haploids and then all of the diploids in each table, and it'd be good to do that before release if it's not a huge amount of work.
The diploid genotype pages look good. :)
I don't love that the hyphen is missing from "multi-locus" in the query description
That's fixed now.
I thought we envisioned having all of the haploids and then all of the diploids in each table,
I don't remember talking about that. I'll have a go tomorrow.
I thought we envisioned having all of the haploids and then all of the diploids in each table,
I don't remember talking about that. I'll have a go tomorrow.
It's entirely possible that we only discussed in it the context of Canto, or even that I've been thinking it without saying or typing it until now. If the latter, oops, sorry. It'd still be good to be able to spot diploids easily, so if grouping them within the tables turns out to be a pain I'll open a ticket for a filter (hmm, might do that anyway).
... once we have some in Chado (pombase/pombase-chado#734).
I think this needs a short discussion.
This might be related: #1327