sgkit-dev / sgkit-requirements

Repo for collecting requirements.
MIT License

[WIP] User Story - Exploratory Statistics #5

Open jerowe opened 4 years ago

jerowe commented 4 years ago

Before running the PCA or PBS, a scientist would typically complete several rounds of exploratory statistics and filtering:

  1. Look at distributions of the data
  2. Filter variants based on quality
  3. Subset genotypes based on previous variant statistics
  4. Filter samples (missingness)
  5. Subset genotypes based on previous sample statistics
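As a rough illustration, the five steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not sgkit's API: the toy genotype matrix, the `variant_qual` scores, the helper names, and both thresholds are hypothetical and chosen only to make the flow concrete.

```python
from statistics import median

# Toy data (hypothetical): rows are variants, columns are samples.
# Each entry is an alternate-allele count (0, 1, 2) or None for a missing call.
genotypes = [
    [0, 1, 2, None],
    [1, 1, 0, None],
    [0, None, 0, None],
    [2, 2, 1, 0],
    [0, 0, 1, None],
]
variant_qual = [50.0, 12.0, 45.0, 60.0, 38.0]  # hypothetical per-variant quality
sample_ids = ["S1", "S2", "S3", "S4"]

MIN_QUAL = 30.0           # assumed quality threshold
MAX_SAMPLE_MISSING = 0.5  # assumed missingness threshold


def summarize(values):
    """Return (min, median, max) as a quick look at a distribution."""
    return min(values), median(values), max(values)


def missing_rate(calls):
    """Fraction of missing (None) calls in a sequence."""
    return sum(c is None for c in calls) / len(calls)


# 1. Look at distributions of the data.
qual_summary = summarize(variant_qual)
variant_missing = [missing_rate(row) for row in genotypes]

# 2. Filter variants based on quality.
keep_variants = [i for i, q in enumerate(variant_qual) if q >= MIN_QUAL]

# 3. Subset genotypes using the variant statistics from step 2.
genotypes_v = [genotypes[i] for i in keep_variants]

# 4. Filter samples on missingness, computed over the retained variants only.
sample_missing = [missing_rate(col) for col in zip(*genotypes_v)]
keep_samples = [j for j, m in enumerate(sample_missing) if m <= MAX_SAMPLE_MISSING]

# 5. Subset genotypes using the sample statistics from step 4.
genotypes_vs = [[row[j] for j in keep_samples] for row in genotypes_v]
kept_sample_ids = [sample_ids[j] for j in keep_samples]
```

Note that sample missingness is computed after the variant filter, so the order of the steps affects which samples survive. In practice the thresholds would be chosen after inspecting the distributions from step 1, possibly over several rounds.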

Reference: the "Breaking down the PCA" PR.

jerowe commented 4 years ago

  1. General exploratory statistics
  2. Variants Quality
  3. Filter Genotypes based on Variant Quality
  4. (and 5) Explore Sample Statistics

jerowe commented 4 years ago

@daletovar this is shared between the PCA and PBS user stories.