PHI-base / curation

‘Suggest a new child term’ link is misleading in some cases #122

Closed CuzickA closed 11 months ago

CuzickA commented 1 year ago

In cases where there is no obvious parent term under which to suggest a new child term, such as the chemistry resistance terms, the suggested new term would really be a sibling term.

E.g. I want to request a new term 'resistance to chemical X'. I search for 'resistance to' and find no obvious parent term, so I select any 'resistance to' term and request the new term as a sibling.

I know, but most curators will not, that the parent term is a 'do not annotate' term which is 'Increased resistance to chemical'

Does this matter? Should we add info to the FAQ or help documentation? If we do add supporting text do we state that new term requests can also be made from sibling terms (which should cover most cases) or do we highlight what the correct parent term is for Chemistry curation new term requests (this may still leave the same issue with other branches of terms).

CuzickA commented 1 year ago

@MPiovesana do you have any comments on this?

MPiovesana commented 1 year ago

@CuzickA From my understanding, I think it would be best to add this information to the FAQ, as it will certainly be something users will encounter when curating chemistry resistance papers. I think the idea of a sibling term makes sense, and it will be easier for users to understand. In addition to adding text to the FAQ, I think it might also be helpful to change the text of the hyperlink in PHI-Canto to say "Suggest a new child or sibling term for PHIPO...". What are your thoughts?

MPiovesana commented 1 year ago

@CuzickA Alternatively, should we suggest to users in the FAQ that, when they cannot locate an appropriate PHIPO term while curating a chemical resistance paper, they should search for "increased/decreased resistance to chemical" and then suggest a child term? Could we perhaps offer both as ways to suggest new terms, or do you think it would be best to pick one?

CuzickA commented 1 year ago

Thanks @MPiovesana

So the options are:

1) Change the text of the hyperlink in PHI-Canto to say "Suggest a new child or sibling term for PHIPO..." @jseager7, @ValWood is this possible?

2) Add information to the main User help text that this could be a child or sibling term (hyperlink text would not need to be changed).

3) If we choose option 2, we could have additional support in the FAQs with examples. We can use the chemistry example in the Chemistry curation tips/FAQ section but will probably also need to add text to the general curation tips/FAQs as I expect this issue will not be limited to chemistry PHIPO terms.

MPiovesana commented 1 year ago

@CuzickA Thank you very much for the clarification. If it is not possible to change the hyperlink text in PHI-Canto, then I agree it would be great to add this info to the User help text and general curation tips, since, as you mentioned, this issue will not be limited to curators of chemistry papers. I'd be happy to start drafting text and adding screenshot examples.

jseager7 commented 1 year ago

I know, but most curators will not, that the parent term is a 'do not annotate' term which is 'Increased resistance to chemical'

@CuzickA Why does it matter that the parent term is marked as 'do not annotate'? Do these terms not show up in the term autocomplete?

Change the text of the hyperlink in PHI-Canto to say "Suggest a new child or sibling term for PHIPO..."

This is possible, but we might need different settings for PHI-base and PomBase, since PomBase (and other databases using Canto) might be fine with the current wording, or they may only want child terms to be added.

I think stating 'child or sibling term' isn't strictly necessary, since a term can't be anything other than those two options. Maybe 'Suggest a new term related to PHIPO:0000001', but that doesn't make the purpose very obvious.

One option is to remove any distinction regarding term type and even the term ID from the text, so it would simply read: "Suggest a new term for PHIPO". I'm not convinced that all curators will understand the distinction between 'child' and 'sibling' – in terms of being descendant from, or adjacent to, the current term – but they will know what term needs to be added. Then it's up to the ontology maintainers to decide where the term should go.

Another option is to add two links, one for each case, for example:

The downsides of this are a) curators might not understand the difference, and b) Canto currently doesn't have any way to indicate which of the options were picked (i.e. whether the suggestion is for a child or sibling), and enabling that will involve more work updating the database, export formats, etc.

To avoid changes to the database, we could cheat by automatically inserting the child or sibling classifier as a label at the start or end of the text definition (e.g. [CHILD] or [SIBLING]) after the request is submitted.
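As a rough illustration of the labelling idea above, the following Python sketch prepends or appends a [CHILD] or [SIBLING] label to a definition string. The function name `tag_definition` and its parameters are hypothetical and are not part of Canto; this only shows the string handling, not how Canto stores term requests.

```python
from typing import Literal

# The two classifiers mentioned in the thread.
Relation = Literal["CHILD", "SIBLING"]


def tag_definition(definition: str, relation: Relation, at_start: bool = True) -> str:
    """Return the definition with a [CHILD] or [SIBLING] label attached.

    Hypothetical helper: the name and signature are illustrative only.
    """
    label = f"[{relation}]"
    text = definition.strip()
    # Put the label at the start by default, or at the end if requested.
    return f"{label} {text}" if at_start else f"{text} {label}"


print(tag_definition("resistance to chemical X", "SIBLING"))
```

Because the label lives inside the existing text field, this approach would need no new database column, which is the point of the workaround.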

MPiovesana commented 1 year ago

@jseager7 Thank you very much for your comments and suggestions. I personally like the idea of having a "Suggest a new term for PHIPO" option, considering that, as you said, not all users might understand what a child or sibling term means.

Would it be possible to have this as an option without the user first having to select another term first? I say this as, from my perspective as a new curator, I did not find it intuitive that in order to suggest a term, one must first pick an existing term to then suggest a child or sibling. If selecting a term is necessary before being able to click on "Suggest a new term", however, I think it is important to explain this clearly in the user help text.

CuzickA commented 1 year ago

I know, but most curators will not, that the parent term is a 'do not annotate' term which is 'Increased resistance to chemical'

@CuzickA Why does it matter that the parent term is marked as 'do not annotate'? Do these terms not show up in the term autocomplete?

Actually maybe the 'do not annotate' is not the issue, it is more that it is not clear that the parent term should be 'increased resistance to chemical' and then a child term suggested. I think most curators would search for 'resistance to X' and then the NTR would be for a sibling term.

CuzickA commented 1 year ago

One option is to remove any distinction regarding term type and even the term ID from the text, so it would simply read: "Suggest a new term for PHIPO". I'm not convinced that all curators will understand the distinction between 'child' and 'sibling' – in terms of being descendant from, or adjacent to, the current term – but they will know what term needs to be added. Then it's up to the ontology maintainers to decide where the term should go.

This could work.

Maybe we could add in the user text that [CHILD] or [SIBLING] could be optionally added for those that were familiar with 'child' and 'sibling'.

jseager7 commented 1 year ago

Would it be possible to have this as an option without the user first having to select another term first?

That's a good suggestion. I'll raise this idea with the main developers of Canto.

I did not find it intuitive that in order to suggest a term, one must first pick an existing term to then suggest a child or sibling.

Yes, that's something I hadn't considered. It's indeed possible that some new curators will hesitate to pick a term that's wrong, and never see the term suggestion option. They might just stop making the annotation at that point.

CuzickA commented 1 year ago

I agree.

We could generate some help text/FAQ with detail about ideally selecting the closest/most similar term to the NTR and if this is not possible maybe point them to the very high level PHI phenotype or Single species phenotype level terms.

jseager7 commented 1 year ago

Maybe we could add in the user text that [CHILD] or [SIBLING] could be optionally added for those that were familiar with 'child' and 'sibling'.

@CuzickA By 'user text' do you mean the user documentation / FAQ or directly in the help text on the page?

We don't have to insist on a particular format for the child / sibling tags unless we want to make the term requests machine-readable.
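If we did want the requests to be machine-readable, a fixed tag format could be parsed along these lines. This is only a sketch under the assumption that the tag is written as [CHILD] or [SIBLING] at the very start or very end of the definition; the function name `split_relation_tag` is hypothetical and nothing like it exists in Canto as far as this thread shows.

```python
import re
from typing import Optional, Tuple

# Matches an optional [CHILD] or [SIBLING] tag at the start or end of the text.
_TAG = re.compile(
    r"^\s*\[(CHILD|SIBLING)\]\s*|\s*\[(CHILD|SIBLING)\]\s*$",
    re.IGNORECASE,
)


def split_relation_tag(definition: str) -> Tuple[Optional[str], str]:
    """Split a definition into (relation, remaining text).

    Returns (None, text) when no tag is present, so untagged requests
    are still accepted.
    """
    match = _TAG.search(definition)
    if match is None:
        return None, definition.strip()
    relation = (match.group(1) or match.group(2)).upper()
    rest = (definition[: match.start()] + definition[match.end():]).strip()
    return relation, rest


print(split_relation_tag("[child] resistance to chemical X"))
```

Accepting untagged text keeps the tag optional for curators who are not familiar with the child and sibling distinction.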

CuzickA commented 1 year ago

Maybe we could add in the user text that [CHILD] or [SIBLING] could be optionally added for those that were familiar with 'child' and 'sibling'.

@CuzickA By 'user text' do you mean the user documentation / FAQ or directly in the help text on the page?

We don't have to insist on a particular format for the child / sibling tags unless we want to make the term requests machine-readable.

Probably a brief mention in the main user documentation and then we could add a more detailed explanation in the FAQs. I hadn't thought about adding it to the help text on the NTR page - that would probably be the easiest place for curators to see it.

ValWood commented 1 year ago

I think the text could change to "suggest a new term" in PomBase too @kimrutherford

kimrutherford commented 1 year ago

Would it be possible to have this as an option without the user first having to select another term first?

That's a good suggestion. I'll raise this idea with the main developers of Canto.

The reason for the current order is that we want the community curators to try to find a term before they suggest one. If they can get close to the correct term in the right part of the ontology, it helps with checking the sessions.

The bigger problem is that Canto expects an existing term to be selected to use in the annotation, as a place-holder. An annotation without a term would break Canto in quite a few places. So we'd need to put a term in, like "pathogen host interaction phenotype" / PHIPO:0000001.

I'll talk to Val soon about this and get back to you.

ValWood commented 1 year ago

I think we could probably just remove the word 'child' from here, but keep everything else the same:

Screenshot 2023-06-19 at 14 33 10

Quite often the terms suggested are not "child" terms they are siblings, but 'child' is a bit jargony. Usually the curator will get as deep as they can before suggesting.

kimrutherford commented 1 year ago

I think we could probably just remove the word 'child' from here, but keep everything else the same:

That's an easy change.

How do you feel about allowing users to suggest a new term before they select an existing term?

ValWood commented 1 year ago

They should always need to be on a term page to suggest a term? But that term could be a child or a sibling of the term where the suggestion is added. We always say "child" and we encourage drill down. Most users get to the most specific appropriate term, but sometimes it is not easy for a user to find the most appropriate parent.

Screenshot 2023-06-20 at 07 09 56

So, often they suggest a term and the parent isn't actually the correct parent, but usually it is close and it is easy to figure out where to place it in the ontology.

Maybe I am not understanding the request?

ValWood commented 1 year ago

What I am trying to say is that I think what is being suggested is exactly what happens at the moment. Users can and do suggest terms anywhere once they know that they can. There are places, like the one Alayne points out, where it is difficult to locate the parent, so in these cases users usually add the term request to a sibling.

ValWood commented 1 year ago

So, in this case, users don't think to type "sensitive to chemical"

Screenshot 2023-06-20 at 07 18 21

so, if they don't find their chemical with a search, they usually add the term on a sibling

Screenshot 2023-06-20 at 07 18 34

We don't want to discourage "drill down", but we want people to be able to suggest a new term if they are stuck. It doesn't matter so much if they suggest a term which exists, because we will find it when we come to add to the ontology. This is really only used in the cases where a user can't get into the right place in the ontology with a keyword search for some reason, usually because it is difficult to know the name of the very general parent, or because we don't have appropriate synonyms.

jseager7 commented 1 year ago

@ValWood The problem is that the term suggestion can't be selected on the first page of the annotation workflow, before choosing a term:

image

It's a problem specifically for chemical resistance / sensitivity terms: if a curator searches for these terms using the chemical name, and we don't have a term for that chemical, the curator gets no suggestions at all:

image

Maybe the curator then follows the suggestion to "begin with a broad term", for example "resistance". Now they only get terms that are too specific:

image

If the curator selects one of these terms, they can't see the term above it (resistance to chemical, which is the one they really want) because Canto only stores the child terms for each term, not the parent terms.
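To make the limitation concrete, here is a small Python sketch of a child-only mapping. This is not Canto's actual data model, and the child term names are hypothetical; it only illustrates that when each term records its children and nothing else, a selected child has no direct link back to its parent, so the parent can only be found by scanning every term.

```python
from typing import Dict, List

# Hypothetical child-only store: each term lists its children, never its parents.
CHILDREN_OF: Dict[str, List[str]] = {
    "resistance to chemical": [
        "resistance to chemical A",
        "resistance to chemical B",
    ],
    "resistance to chemical A": [],
    "resistance to chemical B": [],
}


def children(term: str) -> List[str]:
    """Direct lookup: cheap, but only goes downwards."""
    return CHILDREN_OF.get(term, [])


def find_parents(term: str) -> List[str]:
    """Going upwards means scanning every entry for the term."""
    return [parent for parent, kids in CHILDREN_OF.items() if term in kids]


print(children("resistance to chemical A"))
print(find_parents("resistance to chemical A"))
```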

The curator would basically have to either guess to use "chemical" as the broad term, or they'd have to look at the PHIPO ontology to find the parent term.

ValWood commented 1 year ago

Yep. It isn't a huge issue because curators can make term suggestions on any term. Although this is a bit non-obvious, it's what most of our community do if they can't find the term they need. It is a bit of a problem the first time curators encounter it, though. The chemistry ones are probably the biggest issue. To improve things a little we could add "resistance to chemical" as the first example broad term. Maybe there is also a way to weight a parent term in the Lucene search so that it appears higher up the results list in cases where there are lots of matches? This would help quite a lot.

kimrutherford commented 1 year ago

Maybe there is also a way to weight a parent term in the Lucene search so that it appears higher up the results list in cases where there are lots of matches?

That's possible in the config file.

James, maybe try this?:

load:
  ontology:
    term_boosts:
      "PHIPO:0000022": 2.0

This should make it twice as likely for PHIPO:0000022 to be suggested. Hopefully, if they type "resistance to ...", it will appear as a suggestion. If that doesn't work, try a bigger number.

Unfortunately this setting requires an ontology reload for each change.
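As a simplified picture of what a boost does, the Python sketch below multiplies each term's match score by its boost factor and re-sorts. This is an assumption made for illustration: Lucene's real scoring is more involved, and the scores and the two EXAMPLE term IDs here are invented. Only PHIPO:0000022 and the 2.0 factor come from the config above.

```python
from typing import Dict, List


def rank(scores: Dict[str, float], boosts: Dict[str, float]) -> List[str]:
    """Order term IDs by score multiplied by boost (default boost is 1.0)."""
    boosted = {term: score * boosts.get(term, 1.0) for term, score in scores.items()}
    return sorted(boosted, key=lambda term: boosted[term], reverse=True)


# Invented scores: the broad parent term matches the query less closely
# than the more specific child terms.
scores = {"PHIPO:0000022": 0.6, "EXAMPLE:A": 1.0, "EXAMPLE:B": 0.9}

print(rank(scores, {}))                      # parent term comes last
print(rank(scores, {"PHIPO:0000022": 2.0}))  # parent term comes first
```

In this toy model a boost of 2.0 is enough to lift the parent above the children; with different underlying scores a larger factor might be needed, which matches the advice to try a bigger number.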

jseager7 commented 1 year ago

@kimrutherford I tested the term boost configuration and it works exactly as described.

Searching for 'resistance' pushes the parent term right to the top of the list of suggestions:

image

I also boosted the parent term for chemical sensitivity and that works too:

image

(Although note that we're inconsistent with our naming for chemical sensitivity terms: the parent term is 'sensitivity to chemical' but the child terms are 'sensitive to chemical'. This isn't consistent with the grammar of the chemical resistance terms either: 'resistance' is a noun but 'sensitive' is an adjective.)

@MPiovesana @CuzickA Do you think this would be an acceptable solution to the problem? That is: when a chemical term is missing, you would suggest a child term of 'increased resistance to chemical' or 'increased sensitivity to chemical' instead of suggesting a sibling to an existing chemical term.

MPiovesana commented 1 year ago

@MPiovesana @CuzickA Do you think this would be an acceptable solution to the problem? That is: when a chemical term is missing, you would suggest a child term of 'increased resistance to chemical' or 'increased sensitivity to chemical' instead of suggesting a sibling to an existing chemical term.

@jseager7 In my view, that makes the most sense when searching for a parent term before suggesting a new child term. It is what I instinctively would have done as a first time curator (that was my thought process). If the others agree, I think this can help curators to easily find the appropriate parent term.

ValWood commented 1 year ago

I agree. If they can't find the term with an exact match, they should see the parent, check the list of descendants and suggest a new child.

CuzickA commented 1 year ago

(Although note that we're inconsistent with our naming for chemical sensitivity terms: the parent term is 'sensitivity to chemical' but the child terms are 'sensitive to chemical'. This isn't consistent with the grammar of the chemical resistance terms either: 'resistance' is a noun but 'sensitive' is an adjective.)

We followed the structure that had been used in FYPO.

CuzickA commented 1 year ago

@MPiovesana @CuzickA Do you think this would be an acceptable solution to the problem? That is: when a chemical term is missing, you would suggest a child term of 'increased resistance to chemical' or 'increased sensitivity to chemical' instead of suggesting a sibling to an existing chemical term.

This seems like a good idea. When I try this out I get the same options above for 'sensitivity' but when I type in 'resistance' I do not see the parent term 'increased resistance to chemical' only the child terms for 'resistance to ...'

@MPiovesana, please could you try this out again today to see what you get when you search for 'resistance'?

CuzickA commented 1 year ago

If we get this term boost configuration working will we change the name of the link as discussed earlier in the ticket or leave it as is?

'Suggest a new child term for...' 'Suggest a new term for...'

MPiovesana commented 1 year ago

This seems like a good idea. When I try this out I get the same options above for 'sensitivity' but when I type in 'resistance' I do not see the parent term 'increased resistance to chemical' only the child terms for 'resistance to ...'

@MPiovesana, please could you try this out again today to see what you get when you search for 'resistance'?

@CuzickA I have tried this out and I get the same results as you; when I search for "sensitivity", I get prompted with "increased sensitivity to chemical" as the first suggested term, but this is not the case when I search for "resistance".

CuzickA commented 1 year ago

Thanks @MPiovesana

@jseager7, do you think this could be related to the changes in the deployment scripts?

CuzickA commented 1 year ago

We will also need to take into consideration how to search for the parent term 'chemical phenotype' for the 'normal growth on chemical X' terms.

jseager7 commented 1 year ago

@CuzickA The changes haven't been enabled on the main server yet, I've only tested them locally. Sorry for any confusion.

CuzickA commented 11 months ago

Hi @jseager7, have these changes been enabled on the main server?

jseager7 commented 11 months ago

They should be now. If you try searching for 'resistance' or 'sensitivity' you should see the parent terms first.

CuzickA commented 11 months ago

They should be now. If you try searching for 'resistance' or 'sensitivity' you should see the parent terms first.

Thanks this works now :-)

CuzickA commented 11 months ago

We will also need to take into consideration how to search for the parent term 'chemical phenotype' for the 'normal growth on chemical X' terms.

This is still an issue. We could write up guidance in the FAQ about selecting 'chemical phenotype' as the parent for new 'normal growth on chemical X' terms. However, at the moment I can't search for and select 'chemical phenotype'

image

Any ideas why I can't select 'chemical phenotype', @jseager7?

jseager7 commented 11 months ago

I think the best option here would be to create a grouping term for 'normal growth on chemical' which users would select as the parent term for the normal growth terms. We can boost the 'normal growth on chemical' grouping term in the autocomplete, in the same way that we did for the resistance and sensitivity terms.

Any ideas why I can't select 'chemical phenotype'

Is 'chemical phenotype' in the qc_do_not_annotate or qc_do_not_manually_annotate subset? Or does it appear if you enter the term ID?

CuzickA commented 11 months ago

Is 'chemical phenotype' in the qc_do_not_annotate or qc_do_not_manually_annotate subset?

Yes it is

Or does it appear if you enter the term ID?

Yes I can select it when I enter the Term Id PHIPO:0001218

jseager7 commented 11 months ago

If you do want 'chemical phenotype' to be available for selection then it would have to be removed from the do not annotate subsets. Though I think that the better choice would be to add a grouping term for the normal growth terms.

CuzickA commented 11 months ago

Though I think that the better choice would be to add a grouping term for the normal growth terms.

I agree

CuzickA commented 11 months ago

@jseager7 The new grouping term has now been made. We are just waiting for the PHIPO release and for it to be loaded into PHI-Canto. After that, please could you apply the search boost as suggested above?

jseager7 commented 11 months ago

@CuzickA The grouping term should be loaded into PHI-Canto with the term boost applied.

CuzickA commented 11 months ago

Thanks @jseager7, searches now work as expected with the term boost.

Closing ticket as this issue has now been resolved for the chemistry terms using the term boost option.