pombase / website

PomBase website v2
MIT License

what to do with counts #168

Closed ValWood closed 7 years ago

ValWood commented 7 years ago

Meant to open this one yesterday...

We decided:

Phenotypes are more problematic. We definitely need access to "number and list of genes annotated to this term". I think the consensus was that it might be best to do this from the term page, where there would be a number of detailed options.....

This also gets around the rather horrid thing we have at the moment, where the count and the term link to the same page...

Is this the decision?

mah11 commented 7 years ago

eh, it sounds like a reasonable set of things to try

Antonialock commented 7 years ago

I asked Jurg's group what they thought about the count page... it seems like they would prefer counting alleles? I don't agree for my own purposes :-P

Currently the count column gives you the number of genes annotated to a phenotype term, but we are talking about changing this to showing the number of alleles that are annotated to this term.

  1. do you think count is useful (in its current form)?
  2. do you think it is more useful to get the number of genes or the number of alleles?
  3. would it bother you if the count information was "1 click away" (i.e. either on a separate page or in the "full view" as opposed to the "summary view")?

Shajahan: I used that feature to look for associated genes. Using the gene itself was helpful when I knew a few genes in a pathway I am interested in and wanted to know the GO term that might pull up the associated gene list. However, using the allele is more meaningful, since one would expect to see the phenotype associated with a similar allele of the other genes in the list.

John: I agree with Shajahan. I’ve often had cases where I want to find other genes with similar phenotypes to one of my KO mutants. But when I use the count feature, it lists genes for which any allele of that gene gives rise to the phenotype, which can be complete nonsense. Or even worse, often the KO has an opposite phenotype to the overexpression mutant, so I get genes which are the complete opposite to what I am looking for.

Antonia: Aha ok. I assumed that wasn't an issue because I'd think that you want to catch all genes that are involved in a process

So, for example, say you have an activator geneA, and an inhibitor geneB.

deletion of geneA leads to abnormal meiosis
over-expression of geneB leads to abnormal meiosis

Wouldn't you want to pull out both genes when looking at genes annotated to "abnormal meiosis"?

Sorry if I'm being dim but what is the reason for wanting to filter on specific allele types?

John: I think it depends on what it is that you're looking for. I completely agree that often it is useful to find all genes associated with a particular phenotype. But there are also times when this is not relevant: for instance, if I wanted to find a deletion mutant with a particular phenotype, then getting a list of overexpression mutants with that phenotype is just a nuisance. So it's tricky, because I think both are useful under different circumstances.

ValWood commented 7 years ago

I think this is conflating a number of problems.

They seem to be talking about finding all alleles which give a specific phenotype for a particular gene, and the ability to filter on allele type (overexpression, etc.) for a particular phenotype...

However, proposed changes (moving counts to term page and allowing access to all lists, plus filter options) should make everyone happy....
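A minimal sketch of what "access to all lists, plus filter options" could mean in practice, assuming annotations are available as simple records. The `Annotation` shape, the field names and the sample data are hypothetical and are not taken from the PomBase schema; the gene names follow Antonia's geneA/geneB example.

```python
from collections import namedtuple

# Hypothetical record shape; the field names are illustrative only.
Annotation = namedtuple("Annotation", ["term", "gene", "allele", "allele_type"])

# Invented sample data mirroring the activator/inhibitor example in the thread.
ANNOTATIONS = [
    Annotation("abnormal meiosis", "geneA", "geneA-del", "deletion"),
    Annotation("abnormal meiosis", "geneB", "geneB-oe1", "overexpression"),
    Annotation("abnormal meiosis", "geneB", "geneB-oe2", "overexpression"),
]


def genes_for_term(annotations, term, allele_type=None):
    """Sorted genes annotated to `term`, optionally restricted to one allele type."""
    return sorted({
        a.gene
        for a in annotations
        if a.term == term and (allele_type is None or a.allele_type == allele_type)
    })


def allele_count(annotations, term):
    """Number of distinct alleles annotated to `term`."""
    return len({a.allele for a in annotations if a.term == term})
```

With this shape, the "all genes involved" use case and the "deletion mutants only" use case are the same query with a different `allele_type` argument, and the gene count and allele count can be shown side by side.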

ValWood commented 7 years ago

both have use cases....

kimrutherford commented 7 years ago

I've done the first step. I've removed the counts column from the summary view and hidden it for the phenotypes full view.

How should the term details page look?

ValWood commented 7 years ago

Let's come back to this on tomorrow's call.

ValWood commented 7 years ago
  1. add link to the basic redundant gene list (either genes annotated, single gene allele phenotypes (genes), multi allele phenotypes (genes) (maybe later if required))

    • [x] 2. Kim was correct, putting the gene in the first column in the summary view looks odd suggest instead: phosphoprotein phosphatase activity clp1, dcr2, ibp,

for summary view....

  1. Full view

GO:0004721 phosphoprotein phosphatase activity 34
clp1 has substrate cdc10 IDA Chen JS et al. (2013)
clp1 has substrate ase1 part of positive regulation of mitotic spindle elongation during mitotic anaphase B

Sorry!

kimrutherford commented 7 years ago

add link to the basic redundant gene list (either genes annotated,

I've added back the link to the gene list.

single gene allele phenotypes (genes), multi allele phenotypes (genes) (maybe later if required)

I haven't done this bit yet. The link is there on the phenotype term pages but the list includes all genes for all genotypes annotated with that term. I'm working on fixing that.

The link text is currently "View genes annotated with this term ..." That makes sense for non-phenotype terms. What should the link text be for the list of genes for single allele phenotypes on the phenotype term pages?

ValWood commented 7 years ago

I think we are getting there, this is good.

I have no problem with "View genes annotated with this term ..."

but maybe

"View genes associated with this phenotype term..."
"View genes associated with this GO term ..."

kimrutherford commented 7 years ago

"View genes associated with this phenotype term..."

The reason I asked is that we were thinking of (initially) only including the single allele phenotypes so we won't be including all genes associated with a phenotype. So that means saying "genes associated with this phenotype term" isn't going to be accurate.

ValWood commented 7 years ago

Got you.

"View genes associated with this term via single-allele phenotypes" and later "View genes associated with this term via multi-allele phenotypes"

We have room to be explicit, but maybe Midori can make it more succinct...

mah11 commented 7 years ago

The most accurately descriptive text I've thought of so far is "genes from single-allele genotypes associated with this term" (& swap "multi-allele" for "single-allele" if we include the second link).
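As a sketch of what "genes from single-allele genotypes associated with this term" selects, assuming each genotype is a list of (gene, allele) pairs and each term maps to the genotypes annotated with it. The identifiers and data below are invented for illustration and do not reflect the real data model.

```python
# Hypothetical genotypes: genotype id -> list of (gene, allele) pairs.
GENOTYPES = {
    "genotype1": [("geneA", "geneA-1")],
    "genotype2": [("geneA", "geneA-1"), ("geneB", "geneB-1")],
    "genotype3": [("geneC", "geneC-del")],
}

# Hypothetical phenotype annotations: term id -> annotated genotype ids.
TERM_GENOTYPES = {"TERM:1": ["genotype1", "genotype2", "genotype3"]}


def genes_from_genotypes(term_id, multi_allele=False):
    """Sorted genes from the single-allele (default) or multi-allele
    genotypes annotated with `term_id`."""
    genes = set()
    for genotype_id in TERM_GENOTYPES.get(term_id, []):
        alleles = GENOTYPES[genotype_id]
        if (len(alleles) > 1) == multi_allele:
            genes.update(gene for gene, _allele in alleles)
    return sorted(genes)
```

Here `genes_from_genotypes("TERM:1")` would back the first link, and `genes_from_genotypes("TERM:1", multi_allele=True)` the possible second one. Note that geneA appears in both lists, which is one reason the link text needs to say which kind of genotype the list comes from.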

kimrutherford commented 7 years ago

I've added the link on the phenotype terms pages to the list of genes from the single allele genotypes.

Here's an example: http://pombase2.bioinformatics.nz/term/FYPO%3A0001931

How does that look?

kimrutherford commented 7 years ago
  6. annotations (list)
  7. annotations showing indirectly annotated terms

We have 7. at the moment. Would you like 6 (no indirect annotations) as an option?

kimrutherford commented 7 years ago

putting the gene in the first column in the summary view looks odd suggest instead: phosphoprotein phosphatase activity clp1, dcr2, ibp,

Do we need a single allele phenotype table and a separate multi allele table on the term details page? With the list of genes in the summary for the single allele annotations and a list of genotypes in the multi-allele case?

ValWood commented 7 years ago

We have 7. at the moment. Would you like 6 (no indirect annotations) as an option?

I think we are covered. I don't think there is a use case for "no indirect annotations". It's arbitrary; the inferred descendants really could be annotated directly to the term too....
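To make the direct/indirect distinction concrete, here is a rough sketch assuming a child-to-parents map for the ontology and a map of direct annotations. The term names, gene names and both data structures are hypothetical.

```python
# Hypothetical ontology fragment: term -> list of direct parent terms.
PARENTS = {
    "child_term": ["parent_term"],
    "grandchild_term": ["child_term"],
}

# Hypothetical direct annotations: term -> genes annotated directly to it.
DIRECT = {
    "parent_term": {"geneA"},
    "child_term": {"geneB"},
    "grandchild_term": {"geneC"},
}


def ancestors(term):
    """All terms reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in PARENTS.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def annotated_genes(term, include_indirect=True):
    """Sorted genes annotated to `term`, optionally including genes
    annotated to any of its descendants (indirect annotations)."""
    genes = set(DIRECT.get(term, set()))
    if include_indirect:
        for other, other_genes in DIRECT.items():
            if term in ancestors(other):
                genes |= other_genes
    return sorted(genes)
```

Option 7 corresponds to `include_indirect=True`; option 6 would be `include_indirect=False`, which drops geneB and geneC from the parent term's list even though they are annotated to more specific versions of the same term.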

ValWood commented 7 years ago

I still don't think it's useful to mix single and multi allele lists on the term pages. Maybe when gene is changed to genotype as planned, it will be clearer what to do next?

ValWood commented 7 years ago

Do we need a single allele phenotype table and a separate multi allele table on the term details page? With the list of genes in the summary for the single allele annotations and a list of genotypes in the multi-allele case?

I think we do need separate tables...

ValWood commented 7 years ago

How does that look?

yes, good...

kimrutherford commented 7 years ago

putting the gene in the first column in the summary view looks odd suggest instead: phosphoprotein phosphatase activity clp1, dcr2, ibp,

I've done this now. It looks a bit odd. Was there a plan to hide the extensions in the summary view on the term pages? Was that just for phenotypes? I can't find where we talked about that.

ValWood commented 7 years ago

The ones I looked at for MF look ok http://pombase2.aska.gen.nz/term/GO%3A0004693 http://pombase2.aska.gen.nz/term/GO%3A0004674

but isn't working for processes http://pombase2.aska.gen.nz/term/GO%3A0010389

phenotypes OK too http://pombase2.aska.gen.nz/term/FYPO%3A0000080 (although it isn't clear which gene the "penetrance" applies to here).

in fact, for some phenotypes, like altered kinase activity http://pombase2.aska.gen.nz/term/FYPO%3A0001382 the extension is absolutely required for interpretation.

I think I prefer to see extensions in this view. I don't think it looks bad (it's very similar to what we do in the summary view on the gene pages).

(for hiding extensions, we only plan to hide penetrance and expressivity in the summary view, because you need the genotype context to make sense of that info, and we don't have that in the summary view)

Would be useful if others could look and see if they think it looks odd though!

ValWood commented 7 years ago

Actually it is a bit odd, because at the moment you don't know which gene the extensions apply to?

It needs to be

protein serine/threonine kinase activity
clp1, plo1 etc
cdc2, has substrate orc2 involved in negative regulation of mitotic DNA replication initiation during mitotic G2 phase
cdc2 has substrate cut7 , srw1 , sds23 , crb2
cdc2 has substrate clp1 involved in negative regulation of protein serine/threonine cdc2 se A

The gene needs to be connected to the substrate, if there are no substrates it can be listed at the top.

(maybe in this view we should include only substrates but exclude "part of" links?)

kimrutherford commented 7 years ago

Actually it is a bit odd, because at the moment you don't know which gene the extensions apply to?

Yep, that's why I asked if there was a plan to hide the extensions.

I'll change it to work as you describe.

ValWood commented 7 years ago

maybe extension could be hidden in summary and only visible in full view

@mah11 @Antonialock thoughts?

mah11 commented 7 years ago

No overwhelming preference, but I could live with extensions hidden in the summary view.

ValWood commented 7 years ago

We could wait to see what it looks like when "genotypes" are used, and the stuff related to this comment

Actually it is a bit odd, because at the moment you don't know which gene the extensions apply to?

It needs to be

protein serine/threonine kinase activity
cdc2, clp, plo1 etc
cdc2, has substrate orc2 involved in negative regulation of mitotic DNA replication initiation during mitotic G2 phase
cdc2 has substrate cut7

and then see if we need to hide extensions...

kimrutherford commented 7 years ago

cdc2 has substrate cut7

If there are multiple genes with the same extension should it be?:

cdc2, abc1, any1 has substrate cut7

That doesn't make it clear that it's not just any1 that has the extension.

Would this be clearer?:

protein serine/threonine kinase activity
    cdc2, abc1, any1
        has substrate cut7
    any2, wee1, ...
        (other extension ...)

How should it be formatted for genotype/phenotypes? Some of the genotypes are really long.
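The grouped layout above could be produced roughly like this. It is only a sketch, assuming the annotations for one term arrive as (gene, extension) pairs with `None` for "no extension"; the function name and data shape are made up for illustration, and the gene names are the placeholders used in the example.

```python
from collections import defaultdict

# Hypothetical rows for one term: (gene, extension text or None).
ROWS = [
    ("cdc2", "has substrate cut7"),
    ("abc1", "has substrate cut7"),
    ("any1", "has substrate cut7"),
    ("any2", None),
    ("wee1", None),
]


def grouped_summary(term_name, rows):
    """Render genes grouped under the extension they share, one group per block."""
    groups = defaultdict(list)  # keeps groups in first-seen order
    for gene, extension in rows:
        groups[extension].append(gene)

    lines = [term_name]
    for extension, genes in groups.items():
        lines.append("    " + ", ".join(genes))
        if extension is not None:
            lines.append("        " + extension)
    return "\n".join(lines)


print(grouped_summary("protein serine/threonine kinase activity", ROWS))
```

Grouping on the full extension text keeps the summary short when many genes share an extension, at the cost of an extra indent level. Long genotype names would still need wrapping or truncation, which this sketch does not attempt.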

ValWood commented 7 years ago

Hmm, good point, not sure.... I think I prefer totally unambiguous:

cdc2 has substrate cut7
abc1 has substrate cut7
any1 has substrate cut7

(I don't think this will inflate too much)

interested what A&M think though.....

kimrutherford commented 7 years ago

interested what A&M think though.....

I'll hold off changing things too much for now.

kimrutherford commented 7 years ago

Let's move the discussion about how to show genes and genotypes to #184 and use this ticket to talk about how to show the counts.

ValWood commented 7 years ago

I'm going to summarize what is left in this ticket:

I'm going to close this one and open 2 new tickets for the outstanding items...

ValWood commented 7 years ago

Rationale was this ticket got too long and I didn't want to read it all again ;)

If I missed anything please open a new ticket...