MueFab / mpegg-performance-benchmark

MPEG-G Performance Benchmark
MIT License

Define use cases & metrics #2

Open voges opened 3 years ago

voges commented 3 years ago

Use cases:

Metrics:

voges commented 3 years ago

For every use case, define a number of test cases. Each test case is coupled to specific data. Check whether that data is available in the MPEG-G Genomic Information Database, or whether it needs to be generated.

voges commented 3 years ago

Preprocessing for aligned data

For each SAM file:

  1. Clean the SAM file: remove all data that falls within the scope of ISO/IEC 23092-3, i.e., data that is out of scope of ISO/IEC 23092-2. (A script needs to be written for this task.)
  2. Categorize the files into i) files with multiple alignments and ii) files without multiple alignments.
  3. Sort the records by (reference sequence and) mapping position.
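
As a starting point for the script mentioned in step 1, here is a rough, dependency-free sketch of the three steps. It is not the agreed method, and it rests on assumptions that still need to be checked: which SAM fields actually belong to ISO/IEC 23092-3 is not settled in this thread, so the sketch simply keeps the 11 mandatory SAM columns plus a hypothetical `keep_tags` allow-list of optional tags. Likewise, a file is treated as having multiple alignments if any record has the secondary-alignment FLAG bit (0x100) set or carries an `NH` tag greater than 1. All function names are made up for illustration.

```python
SECONDARY = 0x100  # SAM FLAG bit: secondary alignment


def split_sam(lines):
    """Separate header lines (starting with '@') from alignment records."""
    header = [line for line in lines if line.startswith("@")]
    records = [line.split("\t") for line in lines
               if line and not line.startswith("@")]
    return header, records


def clean_record(fields, keep_tags):
    """Step 1: keep the 11 mandatory columns plus allow-listed optional tags.

    ASSUMPTION: everything outside keep_tags is treated as out of scope of
    ISO/IEC 23092-2. The real allow-list must come from the standard.
    """
    optional = [tag for tag in fields[11:] if tag[:2] in keep_tags]
    return fields[:11] + optional


def has_multiple_alignments(records):
    """Step 2: True if any record is secondary or reports NH > 1."""
    for fields in records:
        if int(fields[1]) & SECONDARY:
            return True
        for tag in fields[11:]:
            if tag.startswith("NH:i:") and int(tag[5:]) > 1:
                return True
    return False


def sort_records(header, records):
    """Step 3: sort by reference order (taken from @SQ lines), then by POS."""
    order = {}
    for line in header:
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    order[field[3:]] = len(order)
    unmapped = len(order)  # references not in the header (e.g. '*') go last
    return sorted(records,
                  key=lambda f: (order.get(f[2], unmapped), int(f[3])))


def preprocess(lines, keep_tags=frozenset({"NH"})):
    """Run steps 1-3 on the lines of one SAM file."""
    lines = [line.rstrip("\n") for line in lines]
    header, records = split_sam(lines)
    cleaned = [clean_record(r, keep_tags) for r in records]
    multi = has_multiple_alignments(cleaned)
    ordered = sort_records(header, cleaned)
    return header, ordered, multi
```

Note that the `NH` check in step 2 only works after cleaning if `NH` is kept in `keep_tags`; the FLAG-based check is unaffected. The sketch also leaves the `@HD` sort-order field untouched, and for real files an established tool such as samtools would likely be a better fit for the sorting step.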

PaoloRibeca commented 3 years ago

Use cases