fslaborg / fslaborg.github.io

The fslab website
https://fslab.org
MIT License

Add tutorial: Replicate Quality Control #16

Closed by HLWeil 3 years ago

HLWeil commented 3 years ago

Summary

I would like to add a tutorial on how to perform quality control of sample replicates using:

  1. Deedle for reading a frame containing the data
  2. FSharp.Stats to impute missing values
  3. FSharp.Stats to cluster the samples
  4. CyJS.NET to visualize the results

This tutorial is meant as a kind of protocol. @kMutagene should this go into the advanced or the data science category?

kMutagene commented 3 years ago

@HLWeil I think this fits better into the advanced category.

bvenn commented 3 years ago

I don't know what you are planning to do, but I think there are three common ways to determine sample replicability:

I'm curious about your visualisation using CyJS.NET 🚀

Update: Sorry, I did not mean to close the issue 😅

HLWeil commented 3 years ago

Further improve tutorial quality:

Code quality

let imputedFrame = 
    let rowKeyMap = rawFrame.RowKeys |> Seq.indexed |> Map.ofSeq
    let columnKeyMap = rawFrame.ColumnKeys |> Seq.indexed |> Map.ofSeq
    Frame.ofJaggedArray imputedData
    |> Frame.mapRowKeys (fun r -> rowKeyMap.[r])
    |> Frame.mapColKeys (fun c -> columnKeyMap.[c])

should be replaced with

let imputedFrame = 
    Frame.ofJaggedArray imputedData
    |> Frame.indexRowsWith rawFrame.RowKeys
    |> Frame.indexColsWith rawFrame.ColumnKeys
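
For reference, a minimal self-contained sketch of the shorter variant as an F# script. The key names and values below are made up for illustration and stand in for `rawFrame.RowKeys`, `rawFrame.ColumnKeys` and the real `imputedData`; the example is square so that it does not depend on how `Frame.ofJaggedArray` orients the inner arrays.

```fsharp
#r "nuget: Deedle"
open Deedle

// Hypothetical keys, standing in for rawFrame.RowKeys and rawFrame.ColumnKeys
let rowKeys = [ "geneA"; "geneB" ]
let colKeys = [ "rep1"; "rep2" ]

// Hypothetical imputed values, standing in for the output of the imputation step
let imputedData = [| [| 1.0; 2.0 |]; [| 3.0; 4.0 |] |]

// Frame.ofJaggedArray yields integer keys; the two index functions
// replace them with the original keys in one step each
let imputedFrame =
    Frame.ofJaggedArray imputedData
    |> Frame.indexRowsWith rowKeys
    |> Frame.indexColsWith colKeys

imputedFrame.Print()
```

Compared with the lookup maps, this avoids building two intermediate `Map` values and makes the intent (restore the original keys) visible at a glance.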

Might this

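// Flattens the cluster tree into an edge list of (parentLabel, childLabel, distance) triples.
// f maps the id of an inner node to a label of the same type as the leaf labels.
let hClustToEdgeList (f : int -> 'T) (hClust : HierarchicalClustering.Cluster<'T>) =
    // d and nodeLabel are the distance and label of the parent of the current cluster
    let rec loop (d, nodeLabel) cluster =
        match cluster with
        | HierarchicalClustering.Node (id, dist, _, c1, c2) ->
            let t = f id
            // Edge from the parent to this node, followed by the edges of both subtrees
            loop (dist, t) c1
            |> List.append (loop (dist, t) c2)
            |> List.append [nodeLabel, t, d]
        | HierarchicalClustering.Leaf (_, _, label) -> [(nodeLabel, label, d)]
    // The root is attached to a placeholder parent labelled f 0 with distance 0
    loop (0., f 0) hClust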

or something similar be a good addition to FSharp.Stats, @bvenn?
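
To make the behaviour of the function easier to discuss, here is a small sketch that runs it on a hand-built tree. The `Cluster<'T>` type below is a local stand-in that only mirrors the shape used in the pattern match above; it is an assumption for illustration, not the FSharp.Stats definition. The names `tree` and `edges` are hypothetical as well.

```fsharp
// Local stand-in for HierarchicalClustering.Cluster<'T> (assumed shape):
// Node (id, distance, memberCount, left, right) and Leaf (id, memberCount, label)
type Cluster<'T> =
    | Node of int * float * int * Cluster<'T> * Cluster<'T>
    | Leaf of int * int * 'T

// Same logic as the function above, written against the stand-in type
let hClustToEdgeList (f : int -> 'T) (hClust : Cluster<'T>) =
    let rec loop (d, nodeLabel) cluster =
        match cluster with
        | Node (id, dist, _, c1, c2) ->
            let t = f id
            loop (dist, t) c1
            |> List.append (loop (dist, t) c2)
            |> List.append [nodeLabel, t, d]
        | Leaf (_, _, label) -> [(nodeLabel, label, d)]
    loop (0., f 0) hClust

// Three leaves: A and B merge at distance 1.0, then C joins at distance 2.0
let tree =
    Node (4, 2.0, 3,
          Node (3, 1.0, 2, Leaf (0, 1, "A"), Leaf (1, 1, "B")),
          Leaf (2, 1, "C"))

// Inner nodes get labels such as "Node3"; leaves keep their own labels
let edges = hClustToEdgeList (sprintf "Node%i") tree

edges |> List.iter (fun (source, target, dist) -> printfn "%s -> %s (%.1f)" source target dist)
```

Two details show up in the result: the first edge links the placeholder label `f 0` to the root with distance 0, and every edge carries the merge distance of its parent node rather than of the child. Both may be worth deciding on before moving the function into a library.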

Spelling errors