geneontology / noctua-form-legacy

Simple annoton editor workbench for Noctua.
BSD 3-Clause "New" or "Revised" License

curator friendly view of annotations that have been entered #43

Closed krchristie closed 6 years ago

krchristie commented 6 years ago

Kimberly and I are finding the view of annotations in the SAE to be unintuitive and cumbersome to use.

There are some features that are really great, like highlighting where there are errors and providing a message about what is wrong.

However, the current view, which highlights the molecular function, rather than the gene, makes it much harder to identify the annotation you want to check. It is also impossible to see everything you have entered since you have to click VIEW to see all the details and can only see one box at a time. Thus, it is really hard to quickly confirm that qualifiers such as NOT or any extension data have been entered correctly.

For example, in this model of Kimberly's, there is no indication in the "Molecular Activities in the Model" section that "protein serine/threonine kinase activity" has a NOT qualifier and you can not see which ones have extensions entered until you click VIEW for each box. But it is really cumbersome to have to click through these boxes one at a time to see everything.

http://noctua-dev.berkeleybop.org/workbench/simple-annoton-editor/?model_id=gomodel:5a98684700000159

We are also completely mystified by what the "Components in the Model" section is supposed to indicate. It is clearly not showing just Cellular component terms as the terms listed here in Kimberly's model are from all three aspects. So, we are both rather confused by the intent and utility of this section.

We would really like something that shows everything, including extension data and qualifiers such as NOT when present, without having to click through things one at a time to see all the details.

We think the annoton view of entered annotations in the older SAE is a good start, but it would need some tweaks. Here is a sample of the old SAE view that illustrates some of what we would like. This model isn't built correctly: the SAE thinks I have entered a BP and a CC term instead of two CC terms, so it is now showing both CC terms. That lets me illustrate something I want, i.e., to see all the details of every annotation linked to the gene it is associated with.

oldsaeview

Additional features desired:

1. Label each row for aspect individually - Since it is possible to have annotons missing either BP or CC, and we are also discussing allowing multiple levels of nesting, at least for CC (https://github.com/geneontology/simple-annoton-editor/issues/42), I think it would be more useful to curators if each row was labelled with the letter for its aspect, rather than labelling the whole annoton with 'FPC' or 'FC', etc. Thus, for the image I have included above, the 'molecular_function' row would be labelled with 'F', while both of the following two rows would be labelled with 'C'.

2. Inclusion of qualifiers and extensions in this tabular view - For example, P2GO shows everything you have entered as part of the annotation, including qualifiers (none in this screenshot, though you can see the empty blanks in the Qualifier column) and extensions (the three yellow highlighted annotations each have two extensions). Ideally, I'd like to see all of the extensions shown in a more easily readable view, translated to term names, and all extensions visible in the annoton.

p2goview-normalwidth

@vanaukenk - I think this represents what we discussed a few days ago, but please add anything I missed.

krchristie commented 6 years ago

I want to add a little more info about the paper (PMID:26909801) I was trying to curate that I used for the screenshots included above (both the SAE and the P2GO screenshots). I started trying to actually curate this paper in Noctua, generating this model, where the SAE shows BOTH of the CC terms I want since it thought the first one was a BP term. http://noctua.berkeleybop.org/editor/graph/gomodel:5a7e68a100000324

When I create the model "correctly", the SAE no longer shows the second CC term for "9+2 motile cilium". http://noctua.berkeleybop.org/editor/graph/gomodel:5a7e68a100000502

What curators would really want in the SAE is to see ALL of the annotations in the model in a simple, easy to see everything format, something like this where the entire annoton shows with all of the data, including a NOT qualifier for one, and all the extensions for the other.

tabularannotonview-wanot

I wanted to put the NOT annotation from Kimberly's model into this, but the link no longer loads anything for me, so I put in a made up NOT annotation, just to have an indication that we need to see this easily. I realize that some discussion is probably necessary about how to handle the old qualifiers (NOT, colocalized_with, and contributes_to) versus all of the new ones. It is really essential to see a NOT qualifier. Also, if people have imported old annotations into the model that have old qualifiers, I think they need to see them.

thomaspd commented 6 years ago

suggest changing text "COMPONENTS IN THE MODEL" to "CC ONLY ANNOTATIONS"

tmushayahama commented 6 years ago

@krchristie @vanaukenk @thomaspd the curator friendly view of annotations is now the table representation (picture below). It includes all of the above suggestions, as a combination of P2GO and the old SAE. Let me know what you think and if additional columns are needed. The demo is on the http://68.181.125.145:8910/ server

image

This table is like a tree which is expanded by default (let me know what you prefer for default). A nested extension is represented by the indentation of the +/- icon on the first column.

For a macromolecular complex, the macromolecular complex GO term is displayed in the gene product column, and the has_part gene products are nested beneath it.

This table is configurable, you can

The table is still a work in progress. Todo list

The heading for the third column is blank, any suggestions? Labeling it "extension" like P2GO might be confusing. Should there be a column called extension?

krchristie commented 6 years ago

Thanks for starting work on this so quickly! It's a good start, but I have some comments on how to incorporate the features of P2GO and the existing tabular form in the SAE for the "live" Noctua that MGI gets some annotations from.

  1. Show everything without needing to click or scroll side to side - I want a view where I can see everything without having to scroll side to side or click to open things one at a time. Thus, I find several features of the P2GO table interface much more curator friendly than this table. This is true even when I put the P2GO table on my 15 inch laptop screen, instead of on my big display monitor. (If you do not have access to P2GO, I would be happy to set up a time to talk and do a screen share to show the features that make P2GO really curator friendly). I'm fine with having a minimum width such that if you make the window really small, there is eventually a scroll bar, but once you are above that minimum size, you can make the window wider and the table will expand out to that size so that columns like the GO term name get wider and don't need to wrap as much. To summarize:

    • I would like for the table to NOT be constrained in width as it is frustrating to have to scroll side to side to see everything, instead of being able to see everything all at once. I would like the table width to scale to the size of the window.
    • I would like for things like GO term names to wrap within their cell instead of being truncated so that find in page will show all locations of a search term (right now, it finds but cannot highlight text that is truncated from view).
  2. Shading of annotons versus rows - I strongly prefer the shading of the table of made annotations in the SAE for the current in-use version of Noctua, where the shading corresponds to annotons, rather than to individual pieces. This is much more useful for seeing the things that I have entered together, e.g. this model: http://noctua.berkeleybop.org/workbench/simple-annoton-editor/?model_id=gomodel:5a7e68a100000324

  3. Column contents and order - I cannot suggest a name for the currently unnamed column because it is mixing things (GO term aspect, relationships, and "Gene product") that I don't want to see in a single column, and it is also duplicating the aspect info for GO terms that is already present in a dedicated column. I also don't understand the purpose of showing me that I have annotated each gene product to itself as a gene product, since this is redundant with the Gene Product column. I would prefer separating this info into different columns for different types of information, so that a curator can easily see which types of information are present in defined locations for each type. These are the columns and ordering I'd like to see:

    • Entity annotated (ID) - Consider renaming the "Gene Product" column to "Annotated Entity" to reflect the fact that this data can include PRO IDs for splice isoforms or modified forms, as well as gene product IDs
    • Relationship (of MF to GO term) - Make the current untitled column to show ONLY relationships between the MF in the annoton and other GO terms, e.g. part of.
    • Aspect - Move the Aspect column to the left of the Term column
    • Qualifier - Also move this column to the left of the GO Term column since it is a modifier of the term
    • GO Term (ID) - limit this column to only be for GO terms
    • Relationship (of GO term to extension) - have a second set of columns for the relationship to the extension...
    • Extension - and the extension itself
    • Evidence
    • With/From
    • Reference
    • Assigned By
    • Date - This might not matter in a model being entered from one paper, but will be useful if older annotations are imported into the model
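
The column order proposed in the list above could be written down as a small data structure. This is only a sketch: the column labels are shortened from the bullets, while `render_row` and the sample values are hypothetical and are not taken from the SAE code.

```python
# Column order as proposed in the list above (labels shortened from the bullets).
COLUMNS = [
    "Annotated Entity",
    "Relationship",
    "Aspect",
    "Qualifier",
    "GO Term",
    "Relationship (ext)",
    "Extension",
    "Evidence",
    "With/From",
    "Reference",
    "Assigned By",
    "Date",
]


def render_row(row: dict) -> list:
    """Return the cell values of one table row in the proposed column order.

    Columns that are absent from `row` become empty strings, so every row
    has the same width (e.g. an empty Qualifier cell when there is no NOT).
    """
    unknown = set(row) - set(COLUMNS)
    if unknown:
        raise KeyError(f"unexpected columns: {sorted(unknown)}")
    return [row.get(column, "") for column in COLUMNS]


# Hypothetical sample row with made-up values, for illustration only.
example = render_row({
    "Annotated Entity": "Dnah11 Mmus (MGI:MGI:1100864)",
    "Aspect": "F",
    "Qualifier": "NOT",
})
```

Keeping the order in a single list means the Qualifier cell always sits in the same place, which is what makes a NOT easy to spot when scanning down the table.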

Here is a screenshot of an Excel file where I've gone through some of the annotons from three different models, plus a made up NOT annotation, to check the table format I am proposing. I think it works for all of the kinds of extensions that are available in the SAE. Note that for annotons where all the evidence and reference was identical for all rows, I have only shown it once per annoton. For the third model, the second annoton for Ptk2b has complicated evidence, so I have included the different evidence lines for each row in the annoton.

tablewithsamplemodelannotons

krchristie commented 6 years ago

This model:

http://68.181.125.145:8910/workbench/simple-annoton-editor/?model_id=gomodel:5ab581e800000496

highlights another issue of the Table View of the annotations entered in the SAE.

It is only showing the annotations for Ift88 that I entered via the Default form (despite the fact that the lack of evidence on the MF term means that these annotations are missing from the Annotation Preview and won't export to the GPAD either). However, it is NOT showing the CC only annotations for Ift20.

I want a Table of Annotations, NOT a table of just "Molecular Activities in the Model", so I want this table to include ALL my annotations, including CC only annotations. I don't even mind if you have to put some sort of "molecular_function" placeholder into the table for this to work. If it's useful for curators, these placeholder unknown MFs could be shown in gray, or strikethrough, or some way of indicating this is a placeholder rather than a direct annotation.
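
One way the placeholder idea could work is sketched below. It assumes a hypothetical `Row` record and a `rows_for_annotation` helper (neither is from the SAE code) and uses the GO root term molecular_function (GO:0003674) as the placeholder. The Ift20 CC term in the usage line is invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# GO:0003674 is the root "molecular_function" term in GO.
ROOT_MF = "molecular_function (GO:0003674)"


@dataclass
class Row:
    entity: str
    aspect: str                # "F", "P" or "C"
    term: str
    placeholder: bool = False  # True when the row was not entered directly


def rows_for_annotation(entity: str, mf: Optional[str], cc_terms: List[str]) -> List[Row]:
    """Build table rows for one annotation, including CC only annotations.

    When no MF was entered, a root-MF row is added and flagged as a
    placeholder so the UI can render it in gray or strikethrough.
    """
    if mf is None:
        rows = [Row(entity, "F", ROOT_MF, placeholder=True)]
    else:
        rows = [Row(entity, "F", mf)]
    rows.extend(Row(entity, "C", cc) for cc in cc_terms)
    return rows


# Hypothetical CC only annotation; the CC term is made up for this example.
ift20_rows = rows_for_annotation("Ift20", None, ["cilium (GO:0005929)"])
```

The `placeholder` flag keeps the display decision (gray, strikethrough, or hidden) separate from the data, so a CC only annotation still appears in the table without looking like a direct MF annotation.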

vanaukenk commented 6 years ago

@krchristie Looking at the table view again, I am wondering if having a view like the Annotation Preview would work here. The version on the table editor wouldn't contain all of the inferred annotations, but it would show everything that's been entered via the table in roughly the same format.

krchristie commented 6 years ago

@vanaukenk - I think that although the Annotation Preview is a great way to see what the GPAD is going to look like, it isn't what I want here. I really liked the Annoton view Seth has in the http://noctua.berkeleybop.org/ version of Noctua, except that it leaves out the extensions. I think it is very useful to see the things you've entered together in a group, and the Annotation Preview doesn't do that.

tmushayahama commented 6 years ago

@krchristie The SAE (now the Activity Creator Form) table has been improved. Let me know if it is now curator friendly.

image

Now you can see

Working on the assigned by column

krchristie commented 6 years ago

@tmushayahama - This is really looking good :) Sorry a little slow to get to this. I had a 48 hour deadline to correct proofs for a publication earlier this week.

  1. I was originally thinking that all of the GO terms should be in columns 2-4. In a model I am looking at now, I am seeing that when there are two BP or CC terms, the second BP or CC shows up in columns 4-5 [Relationship(ext) and Extension]. I also notice that for the second CC term (look at the '9+2 motile cilium' CC term in either the Dnah2 or Dnah7c annotons), in the table in the SAE, it is showing the relationship 'occurs in', which I think is correct if we are showing the relationship between the MF term and this second CC term. However, in the canvas, what is showing is a part_of relationship between the first and 2nd CC terms. I think having this be different might be confusing. Thus, I think it might make more sense if we left the 2nd CC term in the Extension column, but in the Relationship(ext) column show its relationship as an extension, and possibly also showed its relationship to the MF term in the 2nd column (Relationship). @vanaukenk - What do you think? Maybe we should discuss this in a small group call.

See this model: http://68.181.125.145:8910/workbench/simple-annoton-editor/?model_id=gomodel:5a7e68a100000324

  2. I am wondering if having the 'enabled_by' relationship in the same column with the other relationships is confusing because it is easy to read the line from left to right and end up reading it as [gene product] enabled_by [molecular function]. Perhaps enabled_by should be moved to be in the cell with the gene product (when there is a MF), and the Relationship (col 2) would be blank for the MF line. Also, maybe, similarly to the name of the Relationship(ext) column, this one should be called Relationship(MF).

  3. I realize that much of this model was made on the canvas BEFORE I learned that I was connecting something incorrectly for what I wanted to say. However, it points out something that we might want to fix. For Dnah11, the table is indicating that "9+0 motile cilium (GO:0097728)" is a P(rocess) term, but it is really a Component term. It seems that it would be better to look up the namespace for the term to assign its aspect, and then you might be able to also flag that this model isn't built correctly. Then, it is also putting a 'P' in the Aspect column for the row that has the anatomy term as an extension.

  4. Why is there anything in the With field for the P term for the Dnah2 and Dnah7c annotons? I didn't enter anything into the with field for either of these, both of which were entered using the DEFAULT form.

tmushayahama commented 6 years ago

@krchristie in the SAE, an extension is anything that is not the MF, the first BP, or the first CC. Any term nested under one of these is considered an extension. This came from https://github.com/geneontology/simple-annoton-editor/issues/12

For enabled_by, I can put 'enables' instead.

Any comment on this @thomaspd @vanaukenk

vanaukenk commented 6 years ago

@krchristie @tmushayahama @thomaspd

Yes, we had debated about exactly what relations to show in the summary view. Right now, the relations are the same as what is shown in the graph view, i.e. the MF is enabled_by an entity, that MF is part_of a BP, and occurs_in a CC. I think this does get confusing to curators, but if we switch the relations to use the gene product-to-term relations we would then have: enables, involved_in, and part_of for MF, BP, and CC, respectively. This essentially becomes more like a GPAD output, though, than like the graph view. I'm a little concerned about switching back and forth between the different representations and how best to make sure curators really understand the different statements that are being made in the form vs the summary table vs the graph editor vs the GPAD. I think a small group call perhaps also looping in Suzi A. might be best. @krchristie - does tomorrow at 10am PST work for you?
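
The two sets of relations described in this comment can be put side by side as lookup tables. This is an illustrative sketch only: the view names `"graph"` and `"gpad"` and the `relation_for` helper are assumptions, not part of Noctua.

```python
# Relations as shown in the graph view: the MF is the subject of each statement.
GRAPH_RELATION = {"F": "enabled_by", "P": "part_of", "C": "occurs_in"}

# Gene-product-to-term relations, closer to what a GPAD-style output would use.
GP_RELATION = {"F": "enables", "P": "involved_in", "C": "part_of"}

_TABLES = {"graph": GRAPH_RELATION, "gpad": GP_RELATION}


def relation_for(aspect: str, view: str) -> str:
    """Return the relation label for a term of the given aspect in a view."""
    if view not in _TABLES:
        raise ValueError(f"unknown view: {view!r}")
    return _TABLES[view][aspect]
```

Note that the label part_of appears in both tables with a different meaning: in the graph view it links the MF to a BP, while in the gene product view it links the gene product to a CC. That overlap is one concrete source of the confusion discussed here.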

krchristie commented 6 years ago

@vanaukenk - Sorry, I am not available from 9:45-11:15 am Pacific time on Tuesday. Other than that time slot, I am generally available on Tuesday.

vanaukenk commented 6 years ago

Okay, let's see if we can schedule it for later in the day. @tmushayahama @lpalbou @thomaspd Would 11:30am PDT tomorrow work? I'll email Suzi A.

tmushayahama commented 6 years ago

@vanaukenk we have a group meeting at 1PM (PST), so we can have it at 11:15AM (PST)

krchristie commented 6 years ago

As discussed in our call today (@vanaukenk @tmushayahama), we would like to add a column in the SAE Table View for the annotation date.

krchristie commented 6 years ago

Here's the summary of the things we agreed to do based on Karen's comments. This was our model to discuss these issues: http://68.181.125.145:8910/workbench/simple-annoton-editor/?model_id=gomodel:5a7e68a100000324

  1. When there is a second CC term (look at the '9+2 motile cilium' CC term in either the Dnah2 or Dnah7c annotons), in the table in the SAE, the relationship in the Relationship(Ext) column should be 'part of' to indicate the relationship between the first and 2nd CC terms, not 'occurs in'.

  2. We decided to move the 'enabled_by' relationship to be in front of the gene product in the Annotated Entity column, e.g. enabled by Dnah11 Mmus (MGI:MGI:1100864)

  3. When there are errors, the Table view of the SAE needs to display an indication similar to the indication present in the Grid view. See the Dnah11 annoton for one with errors.

  4. We are not sure that it is appropriate to put this type of model "with" info into the traditional "With/from" field. What is here is not very human readable. @vanaukenk - I'm not sure if we came to a decision on this one, so please add that if there is an action item on this one.

krchristie commented 6 years ago

With respect to #2 above, we also discussed modifying the column header slightly to say "Relationship (to MF)". I think this would help to educate/remind curators what the relationships in this column are, but I am not sure if we agreed to do this.

vanaukenk commented 6 years ago

@krchristie - for the column header, do you mean the second column currently titled "Relationship"? If so, I think we may need to stick to the more generic "Relationship" so that annotation relations from the CC only form also make sense.

krchristie commented 6 years ago

@vanaukenk - No, I think that the second relationship column, which is titled "Relationship(ext)" is good as is.

I was wondering if changing the title of the first relationship column, the one between Annotated Entity and Aspect from "Relationship" to "Relationship (to MF)" would help remind and educate curators that this is the relationship to the MF term, not to the GP.

krchristie commented 6 years ago

As I mentioned in https://github.com/geneontology/simple-annoton-editor/issues/58, I think that it is misleading for the Table view to assume the aspect of the term based on the relationship used. I think it would be more obvious what is going on if the Table view showed the actual aspect of the term that is present in the row.
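
The suggestion could be sketched roughly as follows: take the aspect from the term's own namespace and only use the relationship to flag a mismatch. `TERM_NAMESPACE` is a hypothetical stand-in for a real ontology lookup, and `aspect_for_row` is an invented helper name. The relation-to-aspect table follows the graph-view relations described earlier in this thread.

```python
# The three GO namespaces and their one-letter aspect codes.
NAMESPACE_TO_ASPECT = {
    "molecular_function": "F",
    "biological_process": "P",
    "cellular_component": "C",
}

# Aspect that each graph-view relation would normally imply.
RELATION_TO_EXPECTED_ASPECT = {"enabled_by": "F", "part_of": "P", "occurs_in": "C"}

# Hypothetical stand-in for an ontology lookup service.
TERM_NAMESPACE = {"GO:0097728": "cellular_component"}  # 9+0 motile cilium


def aspect_for_row(term_id: str, relation: str):
    """Return (aspect, warning) for one table row.

    The aspect comes from the term's namespace, never from the relation.
    A warning string is returned when the relation suggests another aspect.
    """
    namespace = TERM_NAMESPACE.get(term_id)
    if namespace is None:
        return "?", f"namespace unknown for {term_id}"
    aspect = NAMESPACE_TO_ASPECT[namespace]
    expected = RELATION_TO_EXPECTED_ASPECT.get(relation)
    if expected is not None and expected != aspect:
        return aspect, f"relation '{relation}' suggests {expected}, but {term_id} is {aspect}"
    return aspect, None
```

With this approach the Dnah11 row would show 'C' for GO:0097728 and also carry a warning, which could feed the error indication in the Table view.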

krchristie commented 6 years ago

It would be really helpful to curators if the With column in the Table view translated the entities in the with column into "term name (ID)" similarly to how it is done in the Annotated Entity column. I realize that the With column has additional types of entities than the Annotated Entity column, so there might be some IDs that we might not currently have the info to translate (e.g. MGI allele IDs), and that's OK, at least for now. However, to the extent possible, it would be really helpful if the With field was human readable about what gene is in the field.
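
A minimal sketch of that fallback behaviour, assuming a hypothetical `labels` mapping from ID to display name (where such a mapping would come from is not specified here). The `EXAMPLE:0001` ID is made up to show the untranslated case.

```python
from typing import Dict, List


def format_with(ids: List[str], labels: Dict[str, str]) -> List[str]:
    """Render With entries as "name (ID)" when a label is known.

    IDs without a known label (e.g. allele IDs that cannot be resolved yet)
    are passed through unchanged rather than dropped.
    """
    rendered = []
    for entity_id in ids:
        label = labels.get(entity_id)
        rendered.append(f"{label} ({entity_id})" if label else entity_id)
    return rendered


# Known label taken from the Annotated Entity example in this thread;
# "EXAMPLE:0001" is a made-up ID with no label.
known = {"MGI:MGI:1100864": "Dnah11 Mmus"}
cells = format_with(["MGI:MGI:1100864", "EXAMPLE:0001"], known)
```

Passing unknown IDs through unchanged matches the "that's OK, at least for now" case: the cell is never less informative than it is today.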

tmushayahama commented 6 years ago

Here is the improved table according to https://github.com/geneontology/simple-annoton-editor/issues/43#issuecomment-382124338 + "Relationship (to MF)" column

image

'enabled by' in Annotated Entity column should have a distinct look so that gp stands out. So I just faded it a lil bit, but let me know. Just another thought (less important), should all the relationships text be like this (faded small size) so that at a table glance you can just visualize it?

The above table shows activities separated with the grey line between them. If you like this, I can remove the alternating colors. Eventually SAE wants to use the table cell colors for other things like errors, grouped evidence, CC Only etc.

However, 'enabled by' looks like it is dangling, so I have been experimenting with the table. Do any of the options below have potential?

Putting MF above GP

image

MF on left and bold Annotated Entity text

image

MF on left, bold Annotated Entity text and knock down 'Relationship to MF' column and combine it with MF

image

Let me know @krchristie @thomaspd @vanaukenk @suzialeksander

krchristie commented 6 years ago

1. On visualizing annotons - I really like the gray bars between annotons and think that might be sufficient to distinguish annotons without using the alternating shading.

2. On the position of the "enabled by" relationship - I also really like the way you have currently implemented the position of the "enabled by" relationship within the Annotated Entity cell. Reducing the size of "enabled by" compared to the gene symbol looks great and really helps draw the curator's eye to the gene symbol as the main thing.

I definitely prefer the way we have currently implemented the position of the "enabled by" relationships in the cell with the Annotated Entity over any of the options you mocked up above. I think they all suffer from some of the things we discussed yesterday, separating the evidence from its connection between gene product and MF term. I am really not keen on the views that put the MF in the first column.

vanaukenk commented 6 years ago

I agree with @krchristie. I like the gray bars that separate the annotons and think that is sufficient. I'm also happy with the current implementation of the table wrt the position of the gene/gene product and the view of the relations. Overall, I think it's easier to read than the other options, and the information we wanted to convey, e.g. the relationship to MF, is clear. Thanks, Tremayne!