Closed mvarewyck closed 6 months ago
@SanderDevisscher How do we exclude the males? The question is mostly about records which have unknown type_comp. For the other types we automatically select the females.
(1) retain records with geslacht_comp != "mannelijk"
-> records with unknown gender that are actually males can still be retained. So we might have too many records with type_comp onbekend in the countEmbryos plot
ecoData <- loadRawData(type = "eco")
allSpecies <- unique(ecoData$wildsoort)
sapply(allSpecies, function(iSpecies) {
filterData <- ecoData[ecoData$type_comp == "Onbekend" &
ecoData$wildsoort == iSpecies &
ecoData$geslacht_comp != "Mannelijk", ]
table(filterData$geslacht_comp)
})
#            Wild zwijn Edelhert Damhert Ree
# Vrouwelijk         28        0       8 133
# Mannelijk           0        0       0   0
# Onbekend          615        1      10 873
(2) retain records with geslacht_comp == "vrouwelijk"
-> we exclude way too many records, because there are many records with unknown gender that still have known type
> table(droplevels(ecoData$type_comp[ecoData$geslacht_comp == "Onbekend"]))
    Smalree Jaarlingbok     Reegeit      Reebok    Onbekend
         10           6         166          58        1499
(3) exclude records with geslacht_comp == "mannelijk" OR (geslacht_comp == "onbekend" & type_comp == "onbekend")
-> we might exclude some female records, so there may be too few records with type_comp onbekend in the countEmbryos plot
So I think the decision is between (1) and (3), depending on whether you want to retain or exclude the records for which you know neither gender nor type. Or am I missing something?
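To make the difference between the three options concrete, here is a minimal sketch on a made-up data frame. The `toy` records are invented for illustration; only the column names and level labels are taken from `ecoData`. It covers the sex filter only, not the additional selection of female types that is applied when type_comp is known.

```r
# Toy records, invented for illustration (not taken from ecoData)
toy <- data.frame(
  type_comp     = c("Reegeit", "Reegeit", "Reebok",
                    "Onbekend", "Onbekend", "Onbekend"),
  geslacht_comp = c("Vrouwelijk", "Onbekend", "Mannelijk",
                    "Vrouwelijk", "Mannelijk", "Onbekend"),
  stringsAsFactors = FALSE
)

isMale         <- toy$geslacht_comp == "Mannelijk"
isFemale       <- toy$geslacht_comp == "Vrouwelijk"
isFullyUnknown <- toy$geslacht_comp == "Onbekend" & toy$type_comp == "Onbekend"

keep1 <- !isMale                      # option (1): drop explicit males only
keep2 <- isFemale                     # option (2): keep explicit females only
keep3 <- !(isMale | isFullyUnknown)   # option (3): drop males and fully unknown records

# Number of records each option retains
retained <- c(option1 = sum(keep1), option2 = sum(keep2), option3 = sum(keep3))
print(retained)
```

On these six toy records, option (1) retains 4, option (2) retains 2 and option (3) retains 3. The single record that differs between (1) and (3) is the one with neither gender nor type.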
I would go for the 3rd option. Explicit male individuals and fully unknown (no sex & no type) should be excluded.
Option 2 indicates we need to add some logic in the Backoffice to check whether these records are in fact correct and, if so, reverse engineer the sex based on the type.
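A rough sketch of what such reverse engineering could look like in R, for discussion only. The helper `inferSex()` and the lookup `typeToSex` are hypothetical names, and the type-to-sex mapping for the roe deer types is an assumption that should be checked against the Backoffice definitions before use.

```r
# Assumed mapping from roe deer type to sex; verify before relying on it
typeToSex <- c(
  "Reegeit"     = "Vrouwelijk",
  "Smalree"     = "Vrouwelijk",
  "Reebok"      = "Mannelijk",
  "Jaarlingbok" = "Mannelijk"
)

# Hypothetical helper: fill in unknown sex where the type implies it
inferSex <- function(geslacht, type) {
  geslacht <- as.character(geslacht)
  type     <- as.character(type)
  inferred <- unname(typeToSex[type])   # NA for types not in the mapping
  fill <- !is.na(geslacht) & geslacht == "Onbekend" & !is.na(inferred)
  geslacht[fill] <- inferred[fill]
  geslacht
}

result <- inferSex(
  geslacht = c("Onbekend", "Onbekend", "Onbekend", "Vrouwelijk"),
  type     = c("Reegeit", "Reebok", "Onbekend", "Reegeit")
)
print(result)
```

Records with an unknown type, or with a sex that is already recorded, are left unchanged, so this only narrows down the "Onbekend" group and never overrides existing values.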
Describe the bug
When filtering the data within countEmbryos(), there is a bug retaining some records that have type_comp == 'Onbekend' and geslacht_comp == "Mannelijk". The issue only occurs for records with unknown type, as we filter on the female types otherwise.
To Reproduce
Expected behavior
Exclude the male individuals within countEmbryos()
Git SHA (after 0.3.1)
7568c97e249da29bc34f3581c2c549d45a14777f