Open cthunes opened 8 months ago
In addition to the Boolean output issue, there also may be an error in the logic:
dat[, chit := mean(hitc[hitc %in% 0:1]) >= 0.5, by = list(aeid, chid)]
In the example below, there were four spids for a unique aeid/chid pair. The mean was 0.25 (1 of 4 hit calls positive), so the chemical would not be considered active. My thinking was that if any single spid is active for an aeid/chid, then tcplSubsetChid() should by default capture this as active. Could use max() instead of mean() if this is the appropriate logic. If we want 1/4 hits to be considered inactive, then no change is needed.
From 'invitrodb' 1/3/2024:
mc5 <- tcplPrepOtpt(tcplLoadData(lvl = 5, type = 'mc', fld = 'aeid', val = 2506))
dat <- mc5[chid == 20006]
dat[, hitc2 := ifelse(hitc >= 0.9, 1, 0)] # 1 hit out of 4 spids
mc5.sub <- tcplSubsetChid(dat)
mc5.sub$hitc # FALSE, i.e. not considered active despite 1/4 hits
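To make the difference between the two rules concrete without a database connection, here is a minimal sketch on toy data. The spid labels and hitc values are made up for illustration (only the 1-of-4 pattern mirrors the case above), and chit_mean / chit_any are hypothetical column names, not anything tcpl produces.

```r
library(data.table)

# Hypothetical toy data: four spids for one aeid/chid pair, with hitc
# already thresholded to 0/1 and only one of the four samples active.
dat <- data.table(
  aeid = 2506L,
  chid = 20006L,
  spid = c("s1", "s2", "s3", "s4"),  # made-up sample IDs
  hitc = c(1, 0, 0, 0)
)

# Majority rule, as in the line quoted from tcplSubsetChid:
# active only if at least half of the 0/1 hit calls are positive.
dat[, chit_mean := mean(hitc[hitc %in% 0:1]) >= 0.5, by = list(aeid, chid)]

# "Any active" rule: active if at least one 0/1 hit call is positive.
dat[, chit_any := any(hitc[hitc %in% 0:1] == 1), by = list(aeid, chid)]

# One row per aeid/chid showing both calls side by side.
unique(dat[, .(aeid, chid, chit_mean, chit_any)])
```

With one positive out of four, the majority rule marks the chemical inactive and the "any active" rule marks it active. The two rules also differ when a group has no 0/1 values at all: the mean of an empty vector is NaN, so the comparison yields NA, while any() on an empty vector returns FALSE.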
tcplSubsetChid overwriting hitc is causing an issue in the initial data pulls with v4.2 QC:
new.mc5 <- tcplPrepOtpt(tcplSubsetChid(tcplLoadData(lvl = 5, type = 'mc', add.fld = TRUE)))
Try writing the Boolean to a new column, actc, instead of overwriting hitc on lines 101 (mc) and 147 (sc):
https://github.com/USEPA/CompTox-ToxCast-tcpl/blob/dev/R/tcplSubsetChid.R#L101
dat[, hitc := hitc >= .9]
Is this behavior desired or should it be in a new column like hitc_bool?
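For reference, a minimal sketch of the non-overwriting option on toy data. This is not a patch to tcplSubsetChid itself; the hitc values are made up, and hitc_bool is just the column name floated above (actc would work the same way).

```r
library(data.table)

# Hypothetical toy data with continuous hit calls.
dat <- data.table(
  spid = c("s1", "s2", "s3"),  # made-up sample IDs
  hitc = c(0.95, 0.2, 0)
)

# Store the thresholded call in a new column so the continuous hitc
# values stay available to downstream code.
dat[, hitc_bool := hitc >= 0.9]

dat
```

Here hitc stays numeric, and hitc_bool carries the TRUE/FALSE call used for subsetting.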