Open zedomel opened 3 years ago
Hello,
In this case, to make the dataset useful later without making the data input form too complicated, I suggest finding a way to use average values (perhaps accompanied by the number of samples and a measure of variance?) for traits like conspecificPollenGrainsQuantity and fruitSet when they are measured across multiple numberOfFlowersVisited or numberOfAvailableFlowers. What do you think? What would be simpler to code and to use afterwards?
Cheers, Coquinho
On Mon, May 31, 2021 at 18:35, José Augusto Salim < @.***> wrote:
There are some problems with the terms that define the single visit and multiple visits conditions. We need to specify the scopes in which these terms can be applied.
- What is the relationship of single visit terms with the interaction being documented?
- What is the relationship of multiple visits terms with the interaction being documented?
Definitions of single visit and multiple visits can be taken from @bjpergamo comment on #64 https://github.com/BioComp-USP/rebipp-data-standard/issues/64:
Single visit terms
Single visit measurements could be assigned to the interaction between one animal species and one plant species. Usually, such measurements are conducted when one wants to estimate the "effectiveness" of an animal species as a pollinator of a plant species. In other words, how many pollen grains an animal species deposits in a single visit (and consequently, how many pollen tubes, number of fruits, seeds and so on). Such measurements fit our database well. The problem is that they are quite rare, because effectiveness studies are very time consuming to conduct. I would say we can keep those terms, as they are very adequate for our database, and we can create all the corresponding "Single visit" terms.
According to this definition measurements of:
- the number of conspecific pollen grains deposited on the stigma(s) of the flower
- the number of heterospecific pollen grains deposited on the stigma(s) of the flower
- pollen tubes quantity
- number of fertilized ovules
- number of pollen grains removed from the flower
- fruit set
- fruit mass
- seed set
- seed mass
all apply to the single flower on which the Interaction was recorded, and so we know exactly which flower the definitions are talking about (on the stigma(s) of THE FLOWER, THE FLOWER = the flower which the animal visited). So in that case (single visit) the terms can be added as Interaction properties (we are documenting a specific interaction).
Multiple visit terms
Multiple visit terms would make more sense if we treat them as plant traits (as we are treating flower color, nectar, etc). This is because they cannot be assigned to the interaction between one animal species and one plant species. They are likely the product of multiple interactions (multiple animals and one plant species). I am comparing them with the other plant traits because we also do not measure all the specific flowers in a plant that interacted with the animal. We measure some flowers in that population to gain insight into the average (and variability) of plant traits that we suspect are important in defining plant-pollinator interactions. Similarly, we measure reproductive success in some flowers of that population to gain insight into the average (and variability) of the consequences of plant-pollinator interactions. Depending on the study, interactions, traits and reproductive success may be measured in the same plant individuals (but never in the same flowers); however, this is not always the case.
In this case (multiple visits), none of the terms above apply to the Interaction being documented. Semantically, if we add these terms to the Interaction, we will be saying that that specific Interaction between an animal occurrence and a plant occurrence yields these measurements (e.g. fruit set, pollen grains deposited), but as @pjbergamo https://github.com/pjbergamo said, that is not what we mean here. So the definitions of these terms in the case of multiple visitors do not refer to a specific flower, and consequently we cannot know which flowers the definitions are talking about (does THE FLOWER(s) mean the flower(s) of a specific plant individual, or flowers sampled from a population without individual linkage?). Treating these terms like traits will require that we add them as properties of the plant occurrence (plant traits). But those terms are related to the number of flowers visited, which is a property of the Interaction. So let's see an example:

| interactionID (eventID) | animal | plant | numberOfFlowersVisited | conspecificPollenGrainsQuantity | fruitSet |
| --- | --- | --- | --- | --- | --- |
| evt_1 | Ceratina asunciana | Pfaffia tuberosa | 12 | 200 | 8 |
What can we know from the interaction in the example above and the definition of the terms?
- It is an interaction (with the unique identifier evt_1) between Ceratina asunciana and Pfaffia tuberosa, and the animal(s) visited 12 (distinct?) flowers of the plant. However, we have no way of knowing whether these 12 flowers were exposed to multiple visitors or just single visitors. Then, from the current definitions of:
conspecificPollenGrainsQuantity: The number of conspecific pollen grains deposited on the flower's stigma(s) exposed to multiple visitors at the end of flower anthesis
fruitSet: Proportion of the flowers exposed to floral visitors that yielded fruits
Both terms are applied to the multiple visits case, and so, we should interpret the example as:
- The individual(s) represented in the plant occurrence (Pfaffia tuberosa) received a total of 200 pollen grains on an undefined/unknown number of flowers/number of visits. Although we know that this pollen load is due to multiple visits, numberOfFlowersVisited is a property of the Interaction, so we cannot assert that conspecificPollenGrainsQuantity applies to those 12 flowers. It may apply to a different number of flowers that received multiple visits (including visits by other animal species not documented in this particular interaction). The same applies to fruitSet, as we don't know whether the number of flowers considered when counting mature fruits is the same as the number of flowers visited by the animal documented in this particular interaction. So, putting all this together, this record should be interpreted as: An interaction (with the unique identifier evt_1) between Ceratina asunciana individual(s) and Pfaffia tuberosa individual(s), where the individual(s) represented in the animal dwc:Occurrence visited 12 flowers of the individual(s) represented in the plant dwc:Occurrence. Also, the plant individual(s) received a total of 200 conspecific pollen grains on an unknown number of flowers, and an unknown number of flowers set 8 mature fruits.
If this is all the information that we are trying to encode here, then I would say that it is fine to treat those terms as traits, despite some confusion that may occur when using similar terms (e.g. cospecificPollenGrainsQuantitySingleVisit vs. cospecificPollenGrainsQuantity) as properties of the plant occurrences (traits) or of the interaction. Otherwise, if the encoded information is wrong, then we should review the usage and the definition of these terms.
Some things that (perhaps) we are missing:
- A term for the number of flowers where the measurements of pollen grains deposited, pollen tubes, removed pollen grains, etc. are taken
- A term for the total number of visits that a plant received, considering all animal visitors (same and different animal species; interactions). This number can be calculated by summing the numberOfVisits of all Interactions in which a particular individual participates (all the interactions of the same plant dwc:Occurrence). But what about the case where these two numbers differ, i.e. SUM(numberOfVisits) != total number of visits that a plant received? Are all visits always recorded/counted, so that we can always use the SUM instead of having a new term to specify the total number of visits?
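The SUM(numberOfVisits) question can be made concrete with a small sketch. The record layout, the identifiers (animal_1, plant_1, evt_2) and all visit counts below are hypothetical and only illustrate comparing the summed per-interaction visits with an independently observed total; nothing here is part of the proposed standard.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Interaction:
    interaction_id: str
    animal_occurrence_id: str  # hypothetical identifier
    plant_occurrence_id: str   # hypothetical identifier
    number_of_visits: int


def summed_visits(interactions):
    """Sum numberOfVisits over all Interactions sharing a plant occurrence."""
    totals = defaultdict(int)
    for interaction in interactions:
        totals[interaction.plant_occurrence_id] += interaction.number_of_visits
    return dict(totals)


# Made-up records: two recorded interactions for the same plant occurrence.
records = [
    Interaction("evt_1", "animal_1", "plant_1", 5),
    Interaction("evt_2", "animal_2", "plant_1", 3),
]
totals = summed_visits(records)

# Hypothetical independently observed total, including unrecorded visitors.
observed_total = {"plant_1": 10}

# A non-zero difference means some visits were never recorded as Interactions,
# which is the case where a separate "total visits" term would carry information.
unrecorded = {
    plant: observed_total[plant] - totals.get(plant, 0)
    for plant in observed_total
}
```

If every visit is always recorded as an Interaction, the difference is zero and the SUM alone would suffice.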
I just created a diagram so we can better see what is going on and discuss it further:
[image: interactions_TermsReview] https://user-images.githubusercontent.com/7684564/120212332-39ca0580-c208-11eb-833f-ccd7e61488c5.png
coespecificPollenGrainsQuantity
@zedomel https://github.com/zedomel wrote on #41 https://github.com/BioComp-USP/rebipp-data-standard/issues/41 :
multiple visitors implies multiple Interactions, so it is a 1-to-many relationship with Interaction. The term is linked to one or more Interactions or none at all (if the Interaction is not being recorded, just the coespecific pollen grains in exposed flower).
If we treat this term like a trait, then this comment no longer applies to multiple interactions (1-to-many); it will be just a 1-to-1 relationship with the plant dwc:Occurrence, without being linked to any interaction. We will not be able to know which particular interactions are responsible for the pollen load, but we know that all the recorded interactions for the same plant occurrence are partially responsible for that pollen load if all those interactions happened on exposed flowers (flowers that received multiple visits).
@pjbergamo https://github.com/pjbergamo wrote on #41 https://github.com/BioComp-USP/rebipp-data-standard/issues/41:
I would say that for most cases, none at all. Single visit and multiple visit are distinct processes. Single visit measurements are concerned on measuring the effectiveness of one pollinator species on a plant species (in other words, how many pollen grains each pollinator that interact with a flower species deposit onto stigmas). This would be a one to one relationship. We currently do not have any term for this situation, and unfortunately this data is extremely rare as it is very time consuming to obtain it. This term was proposed only for multiple visits situation, as an overall measurement of pollination success of a flower.
From the sentence "single visit measurements are concerned on measuring the effectiveness of one pollinator species on a plant species" we have that, for single visits, the animal species responsible for the pollen deposition is important and must be recorded. But "[...] multiple visits situation, as an overall measurement of pollination success of a flower" is just about pollination and not the pollinator, so I presume that the animal species is not relevant here. Additionally, this particular sentence says "pollination success of A FLOWER", so should I really consider that as the success of A FLOWER or of A PLANT? Since the individual is the plant, and the amount of pollen deposited on the stigma(s) has an impact on the individual's fitness, why record it at a finer granularity (A FLOWER)? Maybe we already have a solution for that by defining the term to be the total number of conspecific pollen grains, where total applies to all flowers (at least those flowers considered by the measurement) of an individual in a particular interaction.
@cepnunes https://github.com/cepnunes wrote on #41 https://github.com/BioComp-USP/rebipp-data-standard/issues/41:
To keep this term comparable among different studies and somehow keep it of general use, I suggest to change the definition to: "The quantity of conspecific pollen grains deposited on the flower's stigma(s) exposed to multiple visitors at the end in a given period of flower anthesis." That would require the addition of another field with the time during which the flower was exposed to visitors or pollinators. This time could or could not be the same of the entire flower anthesis. But it would allow the inclusion of more studies as it would not be restricted only to studies that collected after the end of anthesis.
It eliminates the problem of setting a specific time when the measurement has to be taken. But it does not solve the problem of recording single visit vs. multiple visits as explained above.
seedSet
Definition: The total number of seeds of a mature fruit exposed to a single flower visitor
The definition makes clear that it is about JUST ONE FRUIT AND JUST ONE FLOWER! From the explanation above, this term should be used as a property of the Interaction (single visit). But since it is about a single fruit (A MATURE FRUIT), it has a 1-to-many relationship with the Interaction. So one interaction may have one or more seedSets (one for each mature fruit of a flower exposed to a single visitor).
So it cannot be used as a single trait for the plant occurrence. It should be a 1-to-many relationship with the plant occurrence, so one dwc:Occurrence will have one or more seedSets (one for each mature fruit of a flower exposed to a single visitor).
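As a minimal sketch of the 1-to-many relationship described here, assuming a simple in-memory model: the SeedSet class, the fruit_id field and the seed counts are hypothetical and only illustrate that one Interaction can carry several seedSet values, one per mature fruit.

```python
from dataclasses import dataclass, field


@dataclass
class SeedSet:
    fruit_id: str    # hypothetical identifier for one mature fruit
    seed_count: int  # total number of seeds in that fruit


@dataclass
class Interaction:
    interaction_id: str
    animal: str
    plant: str
    # 1-to-many: one SeedSet per mature fruit from a single-visit flower
    seed_sets: list = field(default_factory=list)


evt = Interaction("evt_1", "Ceratina asunciana", "Pfaffia tuberosa")
evt.seed_sets.append(SeedSet("fruit_1", 12))  # made-up seed counts
evt.seed_sets.append(SeedSet("fruit_2", 9))

# The raw per-fruit values stay available; a total can still be derived.
total_seeds = sum(seed_set.seed_count for seed_set in evt.seed_sets)
```

The same shape would apply if the list were attached to the plant dwc:Occurrence instead; only the owner of the list changes.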
@cepnues:
I don't know if it would help or just make the data entry form much more complicated, but I believe seedSet, as well as other terms (listed by @pjbergamo https://github.com/pjbergamo), could be split into at least three boxes: one with the absolute number of the trait measured, one with the number of flowers observed for this entry, and another with a time measure (which could vary from minutes to a whole flowering season). As an example for seedSet, we could have something like [243] seed(s) per [1] fruit(s) per [1] flowering season. In this case, the number of seeds would be a number, as would the other two boxes. The box with the time measure would then have to be followed by another box with a controlled vocabulary if we want to include time units other than a single flowering season.
@pjbergamo https://github.com/pjbergamo:
For synthesis purposes, the important data are: average, variation (best always standard deviation) and number of samples. If it is clear that one should complete the database with raw data (i.e. each seed set data measured for each individual instead of an average seed set for the population), all these data can then be retrieved afterwards.
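To illustrate how the three synthesis values can be retrieved afterwards when raw data are stored, here is a small sketch using Python's standard statistics module. The seed counts are made up, and summarize is a hypothetical helper name.

```python
import statistics


def summarize(values):
    """Return sample size, mean and sample standard deviation of raw values."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        # stdev uses the n - 1 denominator and needs at least two values
        "sd": statistics.stdev(values),
    }


# Made-up raw seed counts, one value per mature fruit.
seed_counts = [243, 198, 260, 221]
summary = summarize(seed_counts)
```

The reverse is not possible: from an average alone, the individual values cannot be recovered.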
seedSet and multiple visits?
@cepnunes https://github.com/cepnunes:
Like some other terms (such as #37 https://github.com/BioComp-USP/rebipp-data-standard/issues/37 ), I believe that this term only makes sense in relation to another measurement, such as the total amount of pollen per flower. Moreover, I believe very few studies present this data, but as methods for counting pollen grains become easier and more accessible, there will be more studies with this information in the future.
removedPollenGrainQuantity
@anselmoeco https://github.com/anselmoeco:
Another point is that the studies that measure this descriptor make the data available by flower, or by groups of anthers of the same flower (e.g. larger or smaller anthers), or even by anther. So it would be good to avoid this misunderstanding in the definition of that term.
@pjbergamo https://github.com/pjbergamo:
The issues here can be solved in the same way as the issues raised for conspecific pollen grains on stigmas, pollen tubes, etc. Here, the relevant data would be the percentage left or removed (just choose one and go with it) - e.g. 40% pollen grains removed (or 60% remaining) in a flower. My suggestion is to keep it at flower level (in other words, a pollen removal estimate coming from all anthers, which would cover the majority of the cases - in some flowers it is just impossible to estimate pollen removal anther by anther).
fruitSet
Proportion ?
2.3) Fruit set (multiple visits)
Fruit set is not related to the number of flowers exposed to animal visitors. Usually, fruit set is measured by marking a subset of the flowers available (it is very hard to mark all available flowers, especially in trees or shrubs with multiple branches) and, from this subset of flowers, counting how many set fruit.
2.4) fruitMass, seedSet, seedMass
These measurements are not proportions and should be treated like cospecificPollenGrains (and the other pollen-related reproductive success variables: pollen tubes, fertilized ovules and so on).
-- Carlos E. P. Nunes Postdoctoral researcher Biological and Environmental Sciences University of Stirling Research Gate: http://www.researchgate.net/profile/Carlos_Nunes4 C.V. Lattes: http://lattes.cnpq.br/8852339408561488
@cepnunes thank you for contributing.
Honestly, I'm not a fan of aggregated measurements: without the raw data we will lose a lot of information. But if this proves to be the best solution we can come up with at the moment, then I will not oppose it.
However, even using aggregated measurements does not solve the problem of which of the entities (Interaction or Occurrence) these terms (conspecificPollenGrains, fruitSet) should be linked with:
- Linking conspecificPollenGrains, fruitSet or any other reproductive success term to a particular Interaction means that these values (average or raw values) are specific to that particular Interaction. One Interaction is not limited to an interaction between just two individuals. It can be an interaction between two groups of taxonomically homogeneous organisms, as shown in the figure in my last comment.
- Linking them to an Occurrence means that these values (average or raw) are specific to a particular Occurrence. Since one particular Occurrence can represent an occurrence of multiple taxonomically homogeneous organisms (dwc:organismQuantity), these values will represent measurements considering the individuals in the Occurrence. Although these measurements are strictly related to interactions, since some interaction has to happen for us to have fruitSet, conspecificPollenGrains, etc. measurements, by doing this we will lose any information about which interactions (including the pollinator's information) are responsible for the reproductive success of the individuals in the Occurrence.

I don't know which solution best fits the most common use cases. This is something that we have to figure out by investigating different use cases and examples of datasets.
And as you have already mentioned, we should also search for a solution that keeps data input/digitization simple from the user's perspective, even if some abstraction has to be built on top of that by the information systems.
thanks.
Hi @zedomel,
I am also not a fan of the aggregated measurements. Is it possible to give the user the option of inserting all the values of each measurement (in different cells or separated by commas, for instance)? That would solve this issue.
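A comma-separated cell of raw values is easy to read back into individual measurements. The sketch below assumes plain numeric values separated by commas; parse_measurements is a hypothetical helper name, and the cell content is made up.

```python
def parse_measurements(cell):
    """Split a comma-separated cell into numeric values, skipping blanks."""
    return [float(part) for part in cell.split(",") if part.strip()]


# Made-up cell content as a user might type it, with uneven spacing.
values = parse_measurements("243, 198,260")
empty = parse_measurements("")
```

A real form would also need to decide how to handle non-numeric entries, which this sketch does not attempt.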
Regarding the other issue, I believe that linking the reproductive success terms to the plant Occurrence will be fine in terms of keeping the information useful for further studies.
Cheers, Carlos
seedSet
Definition: The total number of seeds of a mature fruit exposed to a single flower visitor
The definition makes clear that it is about JUST ONE FRUIT AND JUST ONE FLOWER! From the explanation above, this term should be used as a property of the Interaction (single visit). But since it is about a single fruit (A MATURE FRUIT), it has a 1-to-many relationship with the Interaction. So one interaction may have one or more seedSets (one for each mature fruit of a flower exposed to a single visitor); otherwise we would limit interactions to always being recorded at flower level (one interaction = one animal visiting one flower).
So we have (at least) two options:
Interaction or Interaction (but how?)
@cepnues wrote on #37:
This could be a solution for changing the term from 1-to-many to 1-to-1. But will it cover all use cases?
@pjbergamo wrote on #37:
Yes. We should look for raw data (not aggregated/summarized data), and in that case 1-to-many can give us more detailed information. However, we can define it to be the total number of seeds in the fruit set. What do you think? Do we have this strict relation between fruitSet and seedSet (seeds are counted for the fruits in the seed set - always)?
removedPollenGrainQuantity
Definition: The total number of pollen grains removed from the anther(s) of a flower
@cepnunes wrote on #36 :
@anselmoeco wrote on #36:
@pjbergamo:
Does "at the flower level" mean 1-to-many with the plant occurrence?
fruitSet
Definition: Proportion of the flowers exposed to floral visitors that yielded fruits
Can we use the number of flowers that set fruit instead of the proportion? A proportion is not the data actually measured; it is calculated from two other measures: the number of flowers marked and the number of flowers that set fruit.
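To show that the proportion is a derived value, here is a minimal sketch that computes it from the two measured counts. The function name and the counts (20 flowers marked, 8 setting fruit) are hypothetical and are not taken from any dataset.

```python
def fruit_set_proportion(flowers_marked, flowers_with_fruit):
    """Derive the fruit set proportion from the two counts actually measured."""
    if flowers_marked <= 0:
        raise ValueError("flowers_marked must be positive")
    if not 0 <= flowers_with_fruit <= flowers_marked:
        raise ValueError("flowers_with_fruit must be between 0 and flowers_marked")
    return flowers_with_fruit / flowers_marked


# Made-up counts: 20 flowers marked, 8 of them set fruit.
proportion = fruit_set_proportion(20, 8)
```

Storing the two counts keeps the proportion recoverable, while storing only the proportion loses the sample size.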
@pjbergamo wrote on #34:
So, do we need a term for documenting the number of marked flowers (the subset of the flowers available)? If the fruit set is not related to the number of flowers exposed to animal visitors, that must be spelled out in the definition. As it stands, the definition makes a clear statement that fruit set is about the flowers exposed to floral visitors, so it is related to the flowers exposed to animal visitors by definition!
In @pjbergamo's explanation we have "Usually, fruit set is measured by marking a subset of the flowers". Is it a subset of the flowers of a given individual, or flowers from a population? It is one thing to say that a fruitSet is specific to particular individual(s) (dwc:Occurrence with dwc:organismQuantity >= 1), and another to say that the fruitSet is specific to flowers sampled from a plant population without the related dwc:Occurrence(s). In the first case the fruitSet is linked to the Occurrences; in the latter it is not linked to anything (it is just a decontextualized measurement).
@pjbergamo also wrote on #34:
I'm assuming that he is talking about the multiple visits terms, and so fruitMass, seedSet and seedMass should be treated as traits. I have described some problems with that above: 1-to-many or 1-to-1.