Advice on term re-use (owl:Class versus owl:DatatypeProperty)

ISA-tools / stato

This is the development repository for the STATistics Ontology (STATO). For more information and demonstration on the ontology content, please visit its website:

http://stato-ontology.org/


Advice on term re-use (owl:Class versus owl:DatatypeProperty) #38

Closed cmaumet closed 5 years ago

cmaumet commented 9 years ago

Hi @proccaserra, @agbeltran,

I was wondering if you could help us find the best way to re-use some of your STATO terms that are defined as owl:Class when, in NIDM-Results, we use the same concept but defined as an owl:DatatypeProperty.

An example of such a term is obi:FWER adjusted p-value, which is defined as an owl:Class. In NIDM-Results we use the same "FWER p-value" concept but as an owl:DatatypeProperty (see e.g. nidm:pValueFWER in nidm:ExtentThreshold) so that we can associate a value with this term (e.g. the "FWER adjusted p-value" used as a threshold). (I know this term is currently part of OBI, but I think we might encounter the same issue for other STATO terms.)

While the definitions are nearly identical, the owl types are different, so I can't think of a good approach to re-use your term in NIDM. Would you have any advice on this? Any feedback or pointers would be greatly appreciated.

Thank you in advance.

proccaserra commented 9 years ago

@cmaumet, my first thought was that it should not be used as a datatype property if it is declared as a class. So I was wondering if adding a restriction on nidm:ExtentThreshold to specify what this threshold 'is about' would help:

    nidm:extentThreshold ro:is_about obi:FWER adjusted p-value ;
                prov:value "0.5"^^xsd:float .

which allows distinguishing the value set for the threshold from the values actually taken by the FWER adjusted p-value.

I was looking at example 23 (2.3.4.9 nidm:HeightThreshold). The other option would be to define a specific owl:DatatypeProperty 'FWER adjusted p-value extent threshold', distinguishing it from the OBI class and restricting its domain to the OBI class.

my 2 cents.

nicholst commented 9 years ago

@nicholsn, @satra, would value your $0.02 on this too.

On Mar 23, 2015, at 4:31 PM, Philippe Rocca-Serra notifications@github.com wrote:

@cmaumet, my first thought was that it should not be used as a datatype property if it is declared as a class. So I was wondering if adding a restriction on nidm:ExtentThreshold to specify what this threshold 'is about' would help:

    nidm:extentThreshold ro:is_about obi:FWER adjusted p-value ;
                prov:value "0.5"^^xsd:float .

which allows distinguishing the value set for the threshold from the values actually taken by the FWER adjusted p-value.

I was looking at example 23 (2.3.4.9 nidm:HeightThreshold). The other option would be to define a specific owl:DatatypeProperty 'FWER adjusted p-value extent threshold', distinguishing it from the OBI class and restricting its domain to the OBI class.

my 2 cents.


nicholst commented 9 years ago

Also, possibly useful for this discussion: NIDM's 'Thresholds' are types of "significance level" (http://purl.obolibrary.org/obo/STATO_0000053), i.e. they are critical values for statistical tests of some sort. Thus our "thresholds" could be derived from "significance levels" such that they are specific to different types of test statistics and control different variants of (multiple testing) false positive rates.

The problem, though, is that STATO's "significance level" is an alias for 'false positive rate' (a data item which accounts for the proportion of incorrect rejection of a true null hypothesis) which concerns a single hypothesis test.

I perhaps should start another issue, but, briefly, when considering 2 or more hypothesis tests, there is no unique measure of "false positive". FWER and FDR are two examples.

Thus perhaps we should define other false positive rates and then significance levels for more general types of error rates.

cmaumet commented 9 years ago

@proccaserra: thanks a lot for your comment!

I like your first suggestion but, unfortunately, a nidm:HeightThreshold entity will typically have more than one p-value (e.g. a 'FWER corrected p-value' and an 'uncorrected p-value' and potentially a 'FDR-corrected p-value', example 23) which makes it a bit difficult...

Starting from a simplified example of nidm:HeightThreshold:

    niiri:my_height_threshold a nidm:HeightThreshold ;
                nidm:pValueFWER "0.05"^^xsd:float ;
                nidm:pValueUncorrected "7.62e-07"^^xsd:float ;
                prov:value "5.23"^^xsd:float . # corresponding statistic value

Option 1: use `prov:value` to specify the p-value

So, if we were to use the first option, we would have to create a collection of nidm:HeightThreshold entities (one for each type of p-value) and somehow connect them together, like this:

    niiri:my_height_threshold_1 a nidm:HeightThreshold ;
                ro:is_about obi:FWER adjusted p-value ;
                prov:value "0.05"^^xsd:float .
    niiri:my_height_threshold_2 a nidm:HeightThreshold ;
                ro:is_about obi:uncorrected p-value ;
                prov:value "7.62e-07"^^xsd:float .
    niiri:my_height_threshold_3 a nidm:HeightThreshold ;
                ro:is_about stato:statistic ;
                prov:value "5.23"^^xsd:float .
    niiri:my_height_threshold_1 is_equivalent_to niiri:my_height_threshold_2 ;
                                is_equivalent_to niiri:my_height_threshold_3 .

Option 2: create object properties in nidm and link to obi classes

Now, looking at your second solution, we would have to link each of our p-value object properties (nidm:pValueUncorrected, nidm:pValueFWER, nidm:pValueFDR) to a new entity of type obi:FWER adjusted p-value, obi:FDR-corrected q-value..., like this:

    niiri:my_height_threshold_1 a nidm:HeightThreshold ;
                nidm:pValueFWER niiri:my_fwer_p_value ;
                nidm:pValueUncorrected niiri:my_uncorrected_p_value ;
                prov:value "5.23"^^xsd:float . # corresponding statistic value
    niiri:my_fwer_p_value a obi:'FWER p-value' ;
                prov:value "0.05"^^xsd:float .
    niiri:my_uncorrected_p_value a obi:'uncorrected p-value' ;
                prov:value "7.62e-07"^^xsd:float .

Do these examples correspond to what you had in mind? Or am I missing something? The issue I can see with both is that we end up with a much more verbose document than previously... Would you have any guidance on how to decide when a Class versus a Data Property versus an Object Property should be created?

Sorry for my long reply... We really appreciate your feedback!
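To put a rough number on the verbosity concern, here is a minimal Python sketch that builds the simplified example and the Option 2 pattern as plain (subject, predicate, object) tuples and counts them. The prefixed names are copied from the examples above as strings; the helper names and the `_value_N` identifier scheme are assumptions made for illustration, not part of NIDM or OBI.

```python
# Illustrative only: triples are (subject, predicate, object) tuples and the
# prefixed names are plain strings copied from the examples in this thread.

def data_property_model(threshold, values):
    """Simplified example: one triple per value, attached directly to the threshold."""
    triples = [(threshold, "rdf:type", "nidm:HeightThreshold")]
    for prop, value in values.items():
        triples.append((threshold, prop, value))
    return triples


def class_based_model(threshold, values, classes):
    """Option 2: each value becomes its own typed entity carrying prov:value."""
    triples = [(threshold, "rdf:type", "nidm:HeightThreshold")]
    for i, (prop, value) in enumerate(values.items(), start=1):
        node = f"{threshold}_value_{i}"  # hypothetical identifier scheme
        triples.append((threshold, prop, node))
        triples.append((node, "rdf:type", classes[prop]))
        triples.append((node, "prov:value", value))
    return triples


values = {"nidm:pValueFWER": 0.05, "nidm:pValueUncorrected": 7.62e-07}
classes = {
    "nidm:pValueFWER": "obi:FWER adjusted p-value",
    "nidm:pValueUncorrected": "obi:uncorrected p-value",
}

flat = data_property_model("niiri:my_height_threshold", values)
typed = class_based_model("niiri:my_height_threshold", values, classes)
print(len(flat), len(typed))  # one triple per value versus three per value
```

In this sketch each value costs one triple in the data-property form and three in the class-based form, which is the trade-off between compactness and explicit typing discussed in this thread.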

satra commented 9 years ago

@cmaumet - this indeed does get tricky. so i'll step back and look at this as a lay person. what we are really talking about are different thresholds. in all cases, the threshold is computed from the input by some multiple comparison correction activity.

input (tmap, dof) or input (pvals)
--> activity (multiple comparison correction step: FWER, FDR, etc.)
--> output (threshold per activity type)

what we are trying to do is to describe threshold entities to capture the activity here. just to make sure i understand this properly, we will never have nidm:pValueFWER and nidm:pValueFDR as properties of the same threshold, right?

so now going back to the task at hand. option 1 appears correct and cumbersome! option 2 is going overboard (i think). personally, i would prefer @cmaumet's simplified example, subject to the response to my question below.

@proccaserra - what is the rationale behind obi:FWER adjusted p-value being a class rather than a dataTypeProperty. it seems that FWER might be an instance of a class called "MultipleComparisonCorrectionMethods", rather than the p-value produced by such a method being a class.

nicholsn commented 9 years ago

@proccaserra - what is the rationale behind obi:FWER adjusted p-value being a class rather than a dataTypeProperty. it seems that FWER might be an instance of a class called "MultipleComparisonCorrectionMethods", rather than the p-value produced by such a method being a class.

@satra brings up a question I have here too for @proccaserra, which is: how would an instance document look from a pure OBI perspective for reporting the float value of a FWER adjusted p-value instance?

After taking a look at the datatype properties in OBI:

The has_feature_value property stands out. It has a domain of Mathematical Feature, but this is under the occurrent branch of the ontology, alongside other processes like ANOVA and Statistical Hypothesis Test.

Each of these processes point back to data items like FWER adjusted p-value, but I don't understand the preferred method for reporting actual values.

I like Satra's input --> activity --> output simplification, which makes me think that FWER belongs under a data transformation and height threshold is a data item...

In that case is there really any need to specify the type of height threshold if you can infer the type of value based on the activity?

This is just a thinking-out-loud example mashing up prov and obi:

:fwer a obi:FWER, prov:Activity ;
    prov:used :some_input ;
    obi:has_specified_input :some_input ;
    obi:has_specified_output :fwer_output .

:fwer_output a obi:data_item, prov:Entity, nidm:HeightThreshold;
    prov:wasGeneratedBy :fwer ;
    prov:value 0.03 .

# other stats activities and values
# link them together
:my_collection a prov:Collection ;
    prov:hadMember :fwer_output, :pvalue .
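As a rough illustration of "infer the type of value based on the activity", the following Python sketch stores the example above as plain tuples and looks up the type of the activity that generated a threshold. The tuple representation and the helper names `objects` and `correction_types` are assumptions made for this sketch; a real implementation would more likely run a SPARQL query over the graph.

```python
# Illustrative only: a tiny in-memory graph of (subject, predicate, object) tuples
# mirroring the prov/obi example above.
triples = [
    (":fwer", "rdf:type", "obi:FWER"),
    (":fwer", "rdf:type", "prov:Activity"),
    (":fwer_output", "rdf:type", "nidm:HeightThreshold"),
    (":fwer_output", "prov:wasGeneratedBy", ":fwer"),
    (":fwer_output", "prov:value", 0.03),
]


def objects(graph, subject, predicate):
    """All objects of triples matching the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def correction_types(graph, entity):
    """Types (other than prov:Activity) of every activity that generated `entity`."""
    found = []
    for activity in objects(graph, entity, "prov:wasGeneratedBy"):
        for activity_type in objects(graph, activity, "rdf:type"):
            if activity_type != "prov:Activity":
                found.append(activity_type)
    return found


print(correction_types(triples, ":fwer_output"))
```

The threshold itself carries no p-value-specific type here; the kind of correction is recovered by following prov:wasGeneratedBy, at the cost of a two-step lookup for every query.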

nicholst commented 9 years ago

Hi folks, @proccaserra, @agbeltran,

I feel at this point it might be useful for me to enumerate the various types of "multiple testing error rates" that one would want to control, that would make up "MultipleComparisonCorrectionMethods". I've included the most frequently used methods, but an exhaustive list can be found on the FDR Wikipedia page.

Per-Comparison Error Rate: The Per-Comparison Error Rate (PCER) is the expected proportion of false positives in a family of tests. Testing each hypothesis individually at level alpha guarantees that PCER is controlled at alpha, and corresponds to testing without any correction for multiplicity.
False Discovery Rate: The False Discovery Rate (FDR) is the expected False Discovery Proportion (FDP), where the FDP describes a family of tests, and is the random, unobservable proportion of false positives among the tests with rejected null hypotheses; if no null hypotheses are rejected, FDP is defined as 0. Control of FDR implies control of FWER in the weak sense.
Positive False Discovery Rate: The Positive False Discovery Rate (pFDR) is the expected False Discovery Proportion (FDP) conditional on one or more null hypotheses being rejected, where the FDP describes a family of tests, and is the random, unobservable proportion of false positives among the tests with rejected null hypotheses.
Local False Discovery Rate: The Local False Discovery Rate (lFDR) is the Bayesian posterior probability that a null hypothesis is true, computed for a family of tests.
Familywise Error Rate: The Familywise Error Rate (FWER; also FWE) when not otherwise specified, is generally understood to be Familywise Error Rate in the strong sense.
Familywise Error Rate in the strong sense: The Familywise Error Rate in the strong sense is the probability of one or more false positives in a collection or "family" of tests. FWER in the strong sense requires that this probability is considered for all possible configurations of true and false null hypotheses, and never exceeds the desired nominal level. FWER control in the strong sense implies FWER control in the weak sense.
Familywise Error Rate in the weak sense: The Familywise Error Rate in the weak sense is the probability of one or more false positives in a collection or "family" of tests. FWER in the weak sense only requires that this probability is computed when all null hypotheses are assumed to be true. Control of FWER in the weak sense does not imply FWER control in the strong sense.

(Note, the definitions above are based, in varying degrees, on the FDR wikipedia page).
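To make two of these definitions concrete, here is a small Python sketch. The first function assumes independent tests with every null hypothesis true (an assumption added for this illustration, not stated above), in which case the probability of at least one false positive when each of m tests is run at level alpha is 1 - (1 - alpha)^m. The second computes the False Discovery Proportion for one realised family; it needs the ground truth about which nulls are true, so it is only usable in simulations. Function names are hypothetical.

```python
def fwer_uncorrected(alpha, m):
    """P(at least one false positive) for m independent true nulls, each tested at level alpha."""
    return 1.0 - (1.0 - alpha) ** m


def false_discovery_proportion(rejected, is_true_null):
    """FDP for one family: false rejections / rejections, defined as 0 if nothing is rejected."""
    n_rejected = sum(1 for r in rejected if r)
    if n_rejected == 0:
        return 0.0
    false_rejections = sum(1 for r, null in zip(rejected, is_true_null) if r and null)
    return false_rejections / n_rejected


# With no correction, per-comparison control at 0.05 gives a much larger familywise rate.
for m in (1, 5, 20):
    print(m, round(fwer_uncorrected(0.05, m), 4))

# Three rejections, one of which is a true null: FDP is one third.
print(false_discovery_proportion([True, True, False, True], [True, False, False, False]))
```

The FDR is the expectation of the second quantity over repeated experiments, which is why it cannot be read off a single dataset.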

And finally, it is worth defining what a "corrected" or "adjusted" P-value is. See e.g. the Wikipedia page on P-values (though that page's primary use of "inflated" is a usage I've never heard, and I'm something of an expert on multiple testing).

First, for reference, from OBI's p-value

p-value: A quantitative confidence value that represents the probability of obtaining a result at least as extreme as that actually obtained, assuming that the actual value was the result of chance alone.

A statistician like me would rather use the more precise definition (i.e. would append this to the OBI definition)

p-value: A p-value is the smallest alpha-level at which a given dataset and statistical test will reject the null hypothesis.

Then I would propose

corrected p-value: A corrected (or adjusted) p-value is a generalisation of p-values to the setting of multiple testing. For a particular multiple testing error rate and a given dataset consisting of a family of statistical tests, a corrected p-value for one element of the family is the smallest error rate that will reject the null hypothesis at that element.

Note this is consistent with OBI's FWER adjusted p-value.

Finally, finally, I note that these multiple testing concepts are hardly unique to brain imaging, and you'll find these terms used throughout bioinformatics and genetics, as well.

nicholst commented 9 years ago

One final follow up:

The ubiquitous and OBI-defined Bonferroni correction method is just one statistical procedure that guarantees control of the FWER.
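As a minimal sketch of how the Bonferroni procedure yields FWER adjusted p-values in the sense proposed above: with m tests, each p-value is multiplied by m and capped at 1, so that rejecting when the adjusted p-value is at most alpha is the same as rejecting when the raw p-value is at most alpha / m. The function name is an assumption for this example.

```python
def bonferroni_adjust(p_values):
    """Bonferroni FWER adjusted p-values: min(1, m * p) for a family of m tests."""
    m = len(p_values)
    return [min(1.0, m * p) for p in p_values]


raw = [0.01, 0.04, 0.5]
adjusted = bonferroni_adjust(raw)

alpha = 0.05
# Only the first test survives correction at FWER level 0.05 (0.01 <= 0.05 / 3).
rejected = [p_adj <= alpha for p_adj in adjusted]
print(adjusted, rejected)
```

Storing the adjusted value alongside the raw one is what lets a single threshold be expressed equivalently on either scale.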

Likewise, the OBI Benjamini and Hochberg false discovery rate correction method, what people generally use when they say "FDR", is just one of many procedures that are guaranteed to control FDR.
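For comparison, a minimal Python sketch of FDR adjusted p-values using the standard Benjamini and Hochberg step-up rule: sort the p-values, scale the p-value at rank k by m / k, then enforce monotonicity with a running minimum taken from the largest p-value downwards. The function name is hypothetical, and the sketch assumes the usual conditions under which the procedure controls FDR (e.g. independent tests).

```python
def benjamini_hochberg_adjust(p_values):
    """Benjamini-Hochberg FDR adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the minimum
    # of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.5]
bh = benjamini_hochberg_adjust(raw)
print([round(q, 4) for q in bh])
```

Each adjusted value here is no larger than m times the raw p-value, which reflects FDR being a less stringent error rate to control than FWER.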

proccaserra commented 9 years ago

@nicholst @satra @cmaumet @nicholsn working on OBI (a few years back), we indeed created classes to cover a number of methods for multiple testing correction applied in the field of functional genomics (mainly gene expression signal). See the children of http://purl.obolibrary.org/obo/OBI_0200089. We also defined the "specified_output" of these methods, named 'FWER adjusted p-value' and 'q-value' (with a synonym: 'FDR adjusted p-value').

So yes to @nicholst, we could add an 'adjusted p-value' parent class. Regarding the different 'rates', again this does seem needed, and those would be a 'specified input' to the multiple correction method.

@cmaumet @satra @nicholst, would you then also ask to have those rates defined as 'datatype properties'?

We cast p-values as classes and currently we would use a datatype property 'has_value' to provide the float value.

We may need to define a more specific datatype property with its domain set to p-values (adjusted or not) and its range set to float.

It is heavier than the options you proposed, and it may be time to revisit that part of OBI by running examples through the models. What is unclear to me at the moment is the domain you would define for p-value if it were cast as a datatype property.

Many thanks for your time and input!

cmaumet commented 9 years ago

Sorry for the delay in my reply.

@satra: while in practice this will probably not be the case, I think that we could have both nidm:pValueFWER and nidm:pValueFDR as properties of the same threshold (e.g. to provide the equivalent FDR- and FWE- pvalues for a given uncorrected p-value). The different p-values in a given threshold represent equivalent representations of the same threshold. One of them was set by the user and the other ones are computed. I agree with you that there is a hidden activity in there... I think that our point was to ease querying.

@proccaserra: we are not currently representing the rates explicitly so it is a bit difficult to answer your question. I would say that, with our current model, we would only use the corresponding p-value term for each of these rates.

@proccaserra: To better understand how the OBI model works, could you let us know how the "suitably define Type I error rate" mentioned in the definition is set for a given multiple testing correction method? I think this would be really close to what we are doing with the nidm p-value terms in height-threshold.

Thank you!

satra commented 9 years ago

@proccaserra - sorry for the delay, for me the domain of p-value would be any statistical test or basically anything that has a corresponding pdf (probability density function).

cmaumet commented 8 years ago

@proccaserra: thank you for your suggestion at https://github.com/incf-nidash/nidm/pull/374#issuecomment-184444371. I concur that stato:"study group population size" seems to be a very good match!

But, as I was about to replace nidm:"number of subjects per group", I realised that stato:"study group population size" is defined as a class while nidm:"number of subjects per group" is a data property. This reminded me of this open discussion and I thought we might want to revive it in view of this new example.

Let me summarise:

In NIDM-Results, we would like to store a list of groups along with their sample size (and name) and relate it to the (already defined) Data entity.

In https://github.com/incf-nidash/nidm/pull/374, this is currently modelled as:

data_id a nidm:Data
data_id prov:wasAttributedTo group_id

group_id a stato:'study group population'
group_id rdfs:label "Control Group"
group_id nidm:'number of Subjects' 18

To use stato:'study group population size', my understanding is that we would have to do:

data_id a nidm:Data
data_id prov:wasAttributedTo group_id

group_id a stato:'study group population'
group_id rdfs:label "Control Group"

group_population_size_id obo:'is about' group_id
group_population_size_id a stato:'study group population size'
group_population_size_id obo:has value (or prov:value) 18

@proccaserra: could you let me know if this example is consistent with what you would typically do in OBO?

As I understand STATO (like OBO) tries to minimise the number of data properties (almost everything is a class) while we (in NIDM-Results) have created a number of Data properties (which makes the overall data model more condensed). Sorry for my ignorance but is the focus on classes a way to keep the ontology as generic as possible? Would you say it is a good practice to avoid having multiple data properties?

Thank you in advance for your feedback!

proccaserra commented 8 years ago

@cmaumet I don't think our way is better than what you propose; it is true that having to rely on a class rather than a data property makes things more verbose. I'll defer to @agbeltran in case she knows of a better way to represent that piece of information. (The ignorance is all mine :) !) This is in fact an interesting point of discussion, thanks for reviving it. All the best.

cmaumet commented 8 years ago

@proccaserra: thanks a lot for your feedback! It's great to be able to discuss this! I was looking around and found this Property Reification Vocabulary that might help us here. Using it, we could relate nidm's 'number Of Subjects' data property to stato's 'study group population size' class:

nidm:NumberOfSubjectReification 
          a prv:PropertyReification;
          prv:shortcut nidm:'number of Subjects';
          prv:reification_class stato:'study group population size';
          prv:subject_prop obo:'is about';
          prv:object_prop obo:'has value';

This would be a way to use 'number of subjects' as a property in nidm but still describe it in relation to the 'study group population size' class in STATO in our owl file (and our html spec). @agbeltran: does this sound reasonable? Thank you!

agbeltran commented 8 years ago

Hi @cmaumet, thanks a lot for following up this thread.

First of all, here is the way (I know, extremely verbose) I would represent the size of 'study group population':

###type definitions
group_id a stato:'study group population'.
group_size_id a stato:'study group population size'.
value_specification_id a obi:'value specification'.
###triples to relate the group and its size, via a 'value specification'
group_id obi:'has quality' group_size_id.
group_size_id obi:'has value specification' value_specification_id.
value_specification_id obi:'has specified value' 18.

The reason we (and the OBO ontologies) are so verbose is to be able to identify and type every element (it avoids the need to reify the statements at the cost of adding many triples, so a shortcut approach would be simpler but with no typing information).
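The relationship between the shortcut and the fully typed pattern can be sketched in Python as a pair of functions: one expands a (group, size) pair into the six triples shown above, the other walks the 'has quality', 'has value specification' and 'has specified value' links to recover the pair. Triples are plain tuples, the labels are copied from the example as strings, and the function names and the `_size` / `_size_value` identifier scheme are assumptions made for this illustration.

```python
def expand_group_size(group_id, size):
    """Expand a (group, size) shortcut into the fully typed OBI-style pattern."""
    size_id = f"{group_id}_size"        # hypothetical identifier scheme
    spec_id = f"{group_id}_size_value"  # hypothetical identifier scheme
    return [
        (group_id, "rdf:type", "stato:study group population"),
        (size_id, "rdf:type", "stato:study group population size"),
        (spec_id, "rdf:type", "obi:value specification"),
        (group_id, "obi:has quality", size_id),
        (size_id, "obi:has value specification", spec_id),
        (spec_id, "obi:has specified value", size),
    ]


def collapse_group_size(triples):
    """Recover (group, size) pairs by following the three links of the long pattern."""
    lookup = {(s, p): o for s, p, o in triples}
    pairs = []
    for s, p, o in triples:
        if p == "obi:has quality":
            spec = lookup.get((o, "obi:has value specification"))
            value = lookup.get((spec, "obi:has specified value"))
            if value is not None:
                pairs.append((s, value))
    return pairs


long_form = expand_group_size("group_id", 18)
print(len(long_form), collapse_group_size(long_form))
```

The round trip shows that the shortcut loses no values, only the explicit typing of the intermediate nodes, which is the cost described above.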

As regards the Property Reification Vocabulary, my understanding is that it is about typing the triple rather than typing the subject and object elements, which are represented by rdf:Property. So, I see your nidm:NumberOfSubjectReification as a reification of a CountingEvent rather than a reification of stato:study group population size.

nicholsn commented 8 years ago

@agbeltran thanks for the example. This is a clear example of how OBI goes about modeling values. Do you know if it is possible to use the Property Reification Vocabulary for modeling a value for a class like stato:study group population size?
