Closed by andrewhercules 5 years ago
One of the use cases for providing the orthologues to human targets is the possibility of using this information in pre-clinical stages of drug discovery (safety/toxicity experiments in mouse, rat, dogs, Rhesus macaque).
Do we have information from users that they would like to find what the different types of orthology are? e.g. 1:1, 1:many, many:many. How is the information going to be used by them?
Wouldn't a binary option suffice, e.g. "yes, the gene has orthologues" versus "no, the gene has no orthologue in other vertebrates"?
Ensembl provides the orthologues for as many as 146 species (including non-vertebrates). Which ones are we going to pick?
As for paralogues, I'm struggling to see the use case in drug discovery. Is there one? I can think of these scenarios, but I'm not sure they are real:
Can my target be intractable while its paralogue (a gene that arose from duplication in the human lineage) is tractable?
If the paralogues have a similar function, how might my drug modulate one, both, or all of the paralogues potentially out there?
We currently display different types of orthologies based on a call to the Ensembl /homology endpoint and we pass through codes for different species as parameters - see here for a list of species included in the call.
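As a rough illustration of passing species codes as parameters, here is a minimal sketch that only builds the request URL. It assumes the endpoint has the form `/homology/id/<gene id>` on `rest.ensembl.org` and accepts a repeated `target_species` query parameter; the exact path and parameters may differ between Ensembl REST versions. The species names are placeholders, not the list used in the Platform.

```python
from urllib.parse import urlencode

# Assumption: public Ensembl REST server and /homology/id/<gene id> path.
ENSEMBL_REST = "https://rest.ensembl.org"


def homology_url(gene_id, target_species):
    """Build a /homology request URL for one gene and several species."""
    params = [("content-type", "application/json")]
    # One target_species entry per species code (assumed parameter name).
    params += [("target_species", species) for species in target_species]
    return f"{ENSEMBL_REST}/homology/id/{gene_id}?{urlencode(params)}"


# Placeholder species codes, for illustration only.
url = homology_url("ENSG00000134460", ["mus_musculus", "rattus_norvegicus"])
```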
As for paralogues, we also already include that data in the data table view - although the mapping may need to be updated because we only map three homology type values but Ensembl actually has a more extensive list of types that they return in that endpoint:
ortholog_one2one
ortholog_one2many
ortholog_many2many
within_species_paralog
other_paralog
gene_split
between_species_paralog
alt_allele
homoeolog_one2one
homoeolog_one2many
homoeolog_many2many
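To make the mapping gap concrete, here is a minimal sketch that groups homology type values under display labels and reports any value with no mapping. Which three values the Platform currently maps is not stated here, so the three keys and the labels in `WIDGET_LABELS` are assumptions for illustration.

```python
# Hypothetical mapping: the three mapped values and their labels are assumed.
WIDGET_LABELS = {
    "ortholog_one2one": "1 to 1 orthologues",
    "ortholog_one2many": "1 to many orthologues",
    "within_species_paralog": "human paralogues",
}


def summarise_types(homology_types):
    """Count mapped types per label and collect the types with no mapping."""
    mapped = {}
    unmapped = set()
    for homology_type in homology_types:
        label = WIDGET_LABELS.get(homology_type)
        if label is None:
            unmapped.add(homology_type)
        else:
            mapped[label] = mapped.get(label, 0) + 1
    return mapped, sorted(unmapped)


mapped, unmapped = summarise_types(
    ["ortholog_one2one", "ortholog_one2one", "other_paralog", "within_species_paralog"]
)
```

Returning the unmapped values separately makes it easy to spot types such as `other_paralog` that would otherwise be dropped silently.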
Given the discussions on ticket #538 about the range of data returned by Ensembl's /homology API endpoint, I have updated the design spec and made the following changes:
Orthologues and paralogues for GENE-SYMBOL that have been identified across a selection of 12 different species
1 to 1 orthologues, 1 to many orthologues, and human paralogues
@andrewhercules, we could show a boolean table of homology type vs species here. It would be less wordy and we could use the species icons?
Also, is Homology a clearer name than Gene tree?
@peatroot, I had considered using the species icons, but I would like to keep this checkbox boolean pattern consistent with the Chemical Probes and Protein Information summary widgets.
David was the one who recommended the wording to make it clear that it is a selection of species out of the X number of species in Ensembl.
As for the title, I would like to keep it as Gene tree - that is what we currently use in the Platform and I want to make it easy for users to find the data using titles they are already familiar with. And given the conversation in #538, we are not using all of the homology data and so I am hesitant to relabel it Homology, unless we change the underlying data.
@andrewhercules, I have just had a chat with @d0choa and it sounds like knowing the number of homologues per species (ordered linearly by species similarity to human) is probably more useful than knowing whether there are 1-to-1, 1-to-many orthologues or human paralogues. If we do show the latter as well, it would be more helpful to know the number of homologues for each type (across the 12 species).
I'll try both designs out and we can perhaps discuss at a front-end meeting.
As @deniseOme pointed out, having information that could be used in preclinical stages is probably one of the main purposes of this widget.
I would anticipate that users might care about (by order of importance):
Hi @d0choa @peatroot @andrewhercules. We do have people at GSK using this for safety studies. Whatever we change, it may be worth bringing these users on board sooner rather than later so that we continue to address their needs.
TBH, I'd not have this changed at all. Perhaps adding some links out. Nothing else. E.g. linking out to the orthologues or paralogues in Ensembl from the pop up box below:
In the past, we used to provide the ENS gene IDs for each of those orthologues/paralogues. I'd have this back in place, and hyperlinked, so that those users who want to explore more can do so in the original source, Ensembl.
Why don't we run some usability or UX design on this? Provide a sheet of exercises to understand what they want (without actually asking them directly what they want).
According to @iandunham, "the use case for paralogues in safety is knowing whether there are potentially other targets that might provide the function of the target you are trying to drug. If there is 1 copy in human but 2 in rat safety testing for instance might be misleading if your drug is specific only for the one copy of the two paralogues." So the 1-to-many relationships (etc) are indeed important.
I'd also link out to the Ensembl gene tree page from my target profile page e.g. IL2RA would be hyperlinked to take me to either http://www.ensembl.org/Homo_sapiens/Gene/Compara_Tree?db=core;g=ENSG00000134460;r=10:6010689-6062370 or http://www.ensembl.org/Multi/GeneTree/Image?gt=ENSGT00390000018872.
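A minimal sketch of how such a link could be generated from the target's Ensembl gene ID, following the pattern of the first URL above. It assumes the `r=` region parameter can be omitted and that the gene ID alone identifies the page; the function name is hypothetical.

```python
def ensembl_gene_tree_url(ensembl_gene_id):
    """Link to the Ensembl gene tree page for a human gene (r= omitted, assumed optional)."""
    return (
        "http://www.ensembl.org/Homo_sapiens/Gene/Compara_Tree"
        f"?db=core;g={ensembl_gene_id}"
    )


# IL2RA, using the gene ID from the example URL above.
link = ensembl_gene_tree_url("ENSG00000134460")
```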
As far as I understand, the discussion is about the information that should be contained in the summary widget. @peatroot correct me if I'm wrong, but the expanded view of the gene tree will be maintained as it is.
I appreciate the points raised by @peatroot and @d0choa and can see the reasons for changing the design of the summary widget to include more data. I know the design spec is not perfect nor 100% ideal. It has shortcomings as it reflects the realities of the data source and our internal project timeline and resource allocations.
That being said, I would strongly recommend that we stick with the design spec that was previously reviewed and agreed with the team for the following reasons:
The summary widgets are to provide a summary of the data and to answer the use case, "Is there X data for my target of interest?". The widget does not need to - and should not - reproduce the underlying data that users can find in the orthology table tab in the detail view (#538). @d0choa, the points you raise are all valid research questions that were explored by my predecessor and can still be answered with the data in the detail view.
The summary widget was deliberately designed to utilise the same boolean checkbox design pattern used in other widgets. This is to ensure consistency with other widgets, that we reuse code that we have already written to speed up this phase of the project, and that we meet the agreed deadline for completion as per @ElaineMcA's project plan.
While we use icons in the Known Drugs summary widget, the context of their use in that widget is different from how we would use them in this widget. In the Known Drugs summary widget, we use labels and numbers to help users identify the icon and the number of drugs with a given modality. It is a single dimension of the data and it is easy to convey with a coloured icon, label, and number. However, in the design proposed by @peatroot, we are showing multi-dimensional data (species type, species similarity, homology type, counts). We should show the homology type as users in target safety wanted the widget to convey what types of orthologues and paralogues are available. As such, it is more complex data to summarise and it goes beyond providing a quick-read summary based on the key use case for the dashboard identified in point 1 above - in fact, it becomes a widget version of the detail view. In theory, I would be okay with this as we could use this table with icons in the detail view (#538). However, and apologies for sounding like a broken record here, we are focussing on a "like for like rewrite" as agreed with Ian. We need to minimise the amount of extra work we are putting into this page as rewriting the remainder of the Platform looms on the horizon and there is much more complexity with the other pages.
From my understanding, ordering species by similarity will be different depending on the target and that would result in different ordering and different widget designs. However, as mentioned in the meeting where I shared my research findings, users would like widgets that are consistent across all targets and that are found in a consistent spot on the dashboard. This will enable users to quickly scan the data on the dashboard and identify detail views that they would like to explore.
@deniseOme, in terms of the detail view (#538), we will keep the same features that are currently found in the Platform as we are focussed on a "like for like rewrite".
Yes, that's right. There's a separate ticket for the detail/expanded view (https://github.com/opentargets/platform/issues/538).
A few conclusions after the chat we had today between @deniseOme, @peatroot, @andrewhercules and me. (@mirandaio might also be interested.)
In general terms, the summary widgets go beyond the like-for-like rewrite, as they represent content that was not there in the first place. The amount of information they contain must be succinct and easily interpretable from a user perspective. The widgets shouldn't include too much information, but also they need to be informative, to help to have an overview of the available target information.
In the context of the agreed timeline, there is a hurry to complete widgets at a good pace. However, @peatroot spotted that the information contained in the "gene tree" widget as proposed in this thread might be limited. I agree that the current booleans for the type of homology might not be distinctive of the target. Probably half of the genes would have the same booleans marked.
An alternative version of the widget could have the subset of the most relevant model organisms (including human) and a number representing the number of homologs on each of them. We still don't know if this will work better than the current widget. @peatroot will make a quick implementation to try to resolve this question.
We all agreed we cannot do this widget by widget, but we will try to identify if there is any other widget where a minor change would significantly improve the result. We need to get up to speed on implementing them, so it would need to be a clear improvement.
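The alternative widget described above (one homologue count per species, in a fixed order) could be derived along these lines. This is a sketch only: the flat record shape with a `species` key is hypothetical, and `SPECIES_ORDER` is a three-species placeholder, not the agreed list.

```python
from collections import Counter

# Placeholder order for illustration; not the list agreed in this thread.
SPECIES_ORDER = ["homo_sapiens", "mus_musculus", "danio_rerio"]


def homologue_counts(homologies, species_order=SPECIES_ORDER):
    """Return (species, count) pairs in a fixed display order.

    homologies: iterable of dicts with a 'species' key (hypothetical shape).
    Species with no homologue are kept with a count of 0 so the widget
    layout stays the same for every target.
    """
    counts = Counter(record["species"] for record in homologies)
    return [(species, counts.get(species, 0)) for species in species_order]


example = homologue_counts(
    [
        {"species": "mus_musculus"},
        {"species": "mus_musculus"},
        {"species": "danio_rerio"},
    ]
)
```

Keeping zero-count species in the output is what lets the widget grey them out rather than drop them.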
Draft with species icons:
The set of species can be easily reduced, via the API, if it's considered too many.
Thanks @peatroot, it looks nice and neat. One question: since we will always be coming from the human gene (our drug target), I'd suspect "human (0) will always be greyed out". If that is true, do we really need to show human?
I tried a version with human separated as <icon> Human paralogues (<count>), above the subtitle you see currently, which was renamed Orthologues by species, but following discussion with @d0choa, changed to the above, as it is more concise.
Here's the other version:
Homologues includes both paralogues and orthologues. For the summary widget it might be enough information. If somebody is interested to know about the type of homology/orthology they can click in the widget.
There might be cases with several hits in human and none in the rest, several in human and several in others, or none in human and some in others (such as the example above). Also, the numbers will change from family to family. The image represents a textbook example where there is only 1 copy of the gene in human and only one copy is conserved across a set of organisms. That's not so common for human genes, where multiple speciation and duplication events might have happened in the last 1000 million years.
For our users, it will be meaningful because it will contain whether there are other human paralogs (off-target effects) and what model organisms they could potentially use for preclinical studies.
The fixed order based on the species tree should be the following (based on distances to human in "million years ago" from timetree.org):
Looks good @peatroot!
@d0choa, are there any other species that should be included in the list?
I think this is the complete list that you can find inside the widget (12 organisms)
If the question is if we want to expand the list of 12, probably not. It's already a comprehensive list of model organisms. If somebody asks, we could do it, but it's not a priority.
The alternative question is if we want to narrow down the summary widget to only a few organisms (let's say human + 3) and aggregate the rest as "Other". That would depend on how small we want the widget to be. We can probably take this decision once we have the overall view of all widgets.
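If the narrower layout were chosen, the aggregation could look like the sketch below. The input is assumed to be a list of `(species, count)` pairs already in display order, and the species names are placeholders.

```python
def collapse_to_other(counts, keep):
    """Keep the chosen species and sum the remaining counts under 'Other'.

    counts: list of (species, count) pairs in display order (assumed shape).
    keep: set of species names to show individually.
    """
    shown = [(species, n) for species, n in counts if species in keep]
    other_total = sum(n for species, n in counts if species not in keep)
    return shown + [("Other", other_total)]


collapsed = collapse_to_other(
    [("human", 2), ("mouse", 1), ("rat", 1), ("zebrafish", 3), ("fly", 0)],
    keep={"human", "mouse", "rat"},
)
```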
It looks really nice!
Thanks @peatroot for sharing the initial version, which I'd have voted for as it seems concise enough to me and has an important extra piece of information available upfront (and missing in the selected version) i.e. the distinction between orthologues and paralogues in the same box/widget.
I wonder if our users are as aware as we are that our homologues = orthologues + paralogues. I'd have thought that they are not aware of the distinction as they do not tend to be evolutionary biologists or evolutionary geneticists.
We can address this by means of help documentation and monitor if users at workshops or via support complain/or ask what the difference is.
I can foresee people asking why we greyed out the widget in human, for those rare examples where there is only one copy of that gene in the human genome. With the orthologue/paralogue distinction up front, this (possible) question would not be asked. They would know that, by being greyed out, there is no paralogue.
p.s. For example, at a workshop at CRUK Therapeutics in London this year I was asked "why do we always show (1) in a page like this, as the drug table there is always for 1 target only?"
Use Cases
Summary Views
Full-size version
Design and Interaction Notes
Orthologues and paralogues for GENE-SYMBOL that have been identified across a selection of 12 different species
Please replace GENE-SYMBOL with the HGNC gene symbol value.
Also, please colour the widget container box outline in Open Targets Grey - #5a5f5f.
The options in the table are:
1 to 1 orthologues
1 to many orthologues
human paralogues
If there is data about a specific type of orthologue or paralogue available, please colour the box containing the FontAwesome checkmark icon in Open Targets Purple - #7b196a and please colour the checkmark icon white.
If data about a specific type of orthologue or paralogue is not available, please replace the checkmark with the FontAwesome times icon in Open Targets Light Grey - #e2dfdf. Also, please colour the text for that orthologue or paralogue type in Open Targets Light Grey - #e2dfdf.
For targets where there is no data about orthologues or paralogues (e.g. HOTAIR or AL138921.2), please display all text (including the FontAwesome times icon) and the widget container box outline in Open Targets Light Grey - #e2dfdf.
When a user hovers over the summary widget and the target has either a small molecule and/or an antibody tractability assessment, please show the pointer icon and change the box outline to Open Targets Purple - #7b196a. This will provide users with a visual cue that the summary widget is clickable. For more information, please see issue #429.
Design Assets
ticket updated on 4 April 2019
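The colour rules in the Design and Interaction Notes above can be sketched as a small function. This is an illustration under assumptions: the function name, the dictionary layout, and the icon names `check` and `times` are hypothetical, and the text colour for rows that do have data is left unset because the notes do not specify it.

```python
# Colours taken from the design notes above.
OT_PURPLE = "#7b196a"
OT_GREY = "#5a5f5f"
OT_LIGHT_GREY = "#e2dfdf"


def widget_style(available):
    """Derive outline and per-row colours from data availability.

    available: dict mapping an option label to True/False (hypothetical shape).
    """
    rows = {}
    for label, has_data in available.items():
        if has_data:
            # White checkmark inside a purple box.
            rows[label] = {"icon": "check", "icon_colour": "white", "box": OT_PURPLE}
        else:
            # Light grey times icon and light grey text.
            rows[label] = {
                "icon": "times",
                "icon_colour": OT_LIGHT_GREY,
                "text": OT_LIGHT_GREY,
            }
    # With no data at all, the outline is light grey as well.
    outline = OT_GREY if any(available.values()) else OT_LIGHT_GREY
    return {"outline": outline, "rows": rows}


style = widget_style(
    {"1 to 1 orthologues": True, "1 to many orthologues": False, "human paralogues": False}
)
empty = widget_style(
    {"1 to 1 orthologues": False, "1 to many orthologues": False, "human paralogues": False}
)
```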