fslaborg / FSharp.Stats

statistical testing, linear algebra, machine learning, fitting and signal processing in F#
https://fslab.org/FSharp.Stats/

Filter nan before qvalue calculations #255

Closed bvenn closed 1 year ago

bvenn commented 1 year ago

When q-values are determined, nan p-values are converted to nan q-values, which is correct. However, the total number of p-values is determined without first filtering nan from the input set, which leads to an underestimation of the FDR.
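A minimal sketch of the counting mismatch, using made-up p-values. It is not the library's code; it only shows that the length of the raw input differs from the number of valid p-values once nan entries are present.

```fsharp
// Hypothetical input: five p-values, two of which are nan.
let pvals = [| 0.01; nan; 0.04; nan; 0.30 |]

// Count that still includes the nan entries (the situation described above).
let mAll = pvals.Length

// Count of valid p-values only, after filtering nan.
let mValid =
    pvals
    |> Array.filter (fun p -> not (System.Double.IsNaN p))
    |> Array.length

printfn "all: %d, valid: %d" mAll mValid
// prints "all: 5, valid: 3"
```

Any step that uses the total count as the number of tests would see 5 here instead of 3.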

bvenn commented 1 year ago

Quick fix for reference:

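open FSharp.Stats

// Computes q-values for an input that may contain nan p-values.
// Returns the pi0 estimate and the q-values in the original input order; nan p-values stay nan.
let getQvalueFromNanCorrupted (inputPvals: float[]) =

    // Split the indexed p-values into nan and valid entries; the indices allow restoring the input order later.
    let indicesOfNan, pvalValid =
        inputPvals
        |> Array.indexed
        |> Array.partition (fun (_, p) -> System.Double.IsNaN p)

    // If all p-values are nan, just return the nans.
    if pvalValid.Length = 0 then
        nan, indicesOfNan |> Array.map snd
    else
        // Estimate pi0 and compute the q-values from the valid p-values only.
        let pi0 = Testing.MultipleTesting.Qvalues.pi0Bootstrap (pvalValid |> Array.map snd)
        let qv = Testing.MultipleTesting.Qvalues.ofPValuesBy pi0 snd pvalValid

        // Pair each q-value with its original index, merge the nan entries back in, and restore the input order.
        let aggChunks =
            Array.map2 (fun (i, _) qval -> i, qval) pvalValid qv
            |> Array.append indicesOfNan
            |> Array.sortBy fst
            |> Array.map snd
        pi0, aggChunks
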
bvenn commented 1 year ago

closed by 5c95836c75b7c02aa65712c142be328fd7b54892