BoevaLab / FREEC

Control-FREEC: Copy number and genotype annotation in whole genome and whole exome sequencing data

feature request: Ignore chromosomes not in capture file instead of failing #106

Open FriederikeHanssen opened 2 years ago

FriederikeHanssen commented 2 years ago

Thanks a lot for providing this tool! We often analyse capture data with Control-FREEC where not all chromosomes from the fai are present in the capture kit. Control-FREEC then fails with:

Command output:
  Control-FREEC v11.6 : a method for automatic detection of copy number alterations, subclones and for accurate estimation of contamination and main ploidy using deep-sequencing data
  Multi-threading mode using 2 threads
  ..consider the sample being male
  ..Breakpoint threshold for segmentation of copy number profiles is 1.2
  ..telocenromeric set to 50000
  ..FREEC is not going to adjust profiles for a possible contamination by normal cells
  ..Coefficient Of Variation set equal to 0.05
  ..it will be used to evaluate window size
  ..Output directory:   .
  ..Directory with files containing chromosome sequences:       /Chromosomes
  ..Sample file:        tumor.mpileup
  ..Sample input format:        pileup
  ..Control file:       normal.mpileup
  ..Input format for the control file:  pileup
  ..forceGCcontentNormalization was set to 1: will use GC-content to normalize the read count data
  ..minimal expected GC-content (general parameter "minExpectedGC") was set to 0.35
  ..maximal expected GC-content (general parameter "maxExpectedGC") was set to 0.55
  ..Polynomial degree for "ReadCount ~ GC-content" normalization is 3 or 4: will try both
  ..Minimal CNA length (in windows) is 3
  ..File with chromosome lengths:       Homo_sapiens_assembly38.fasta.fai
  ..File Homo_sapiens_assembly38.fasta.fai was read

Command error:
  For example, you can remove chromosome HLA-DRB1*11:01:01 from yourHomo_sapiens_assembly38.fasta.fai
  Error: chromosome HLA-DRB1*11:01:02 present in your Homo_sapiens_assembly38.fasta.fai file was not detected in your file with capture regions capture-regions.bed
  Please solve this issue and rerun Control-FREEC

This means that we need to adapt the fai or chromosome files every time, which is not ideal in a workflow or automated processing setting.

Ideally, we could keep a fixed set of reference files, and the error above would instead lead to that chromosome being excluded.
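
Until something like this exists in the tool itself, the manual step could be scripted inside the workflow. The following is only a minimal sketch of that workaround, not part of Control-FREEC: the function names `bed_chromosomes` and `filter_fai` are hypothetical, and it assumes the capture file is a standard BED file whose first column is the chromosome name and that the fai is a standard tab-separated FASTA index. It writes a copy of the fai that keeps only the chromosomes present in the capture regions, so the shared reference files stay untouched.

```python
def bed_chromosomes(bed_path):
    """Collect the chromosome names (first column) used in a BED file."""
    names = set()
    with open(bed_path) as bed:
        for line in bed:
            fields = line.split()
            # Skip blank lines, comments and optional BED header lines.
            if not fields or fields[0].startswith("#"):
                continue
            if fields[0] in ("track", "browser"):
                continue
            names.add(fields[0])
    return names


def filter_fai(fai_path, bed_path, out_path):
    """Write the fai entries whose chromosome appears in the BED file.

    Hypothetical helper: returns the kept chromosome names in fai order.
    """
    keep = bed_chromosomes(bed_path)
    kept = []
    with open(fai_path) as fai, open(out_path, "w") as out:
        for line in fai:
            # The first tab-separated field of a fai line is the sequence name.
            name = line.split("\t", 1)[0]
            if name in keep:
                out.write(line)
                kept.append(name)
    return kept
```

The filtered file would then be passed to Control-FREEC as the file with chromosome lengths in place of the full `Homo_sapiens_assembly38.fasta.fai`. Whether the directory with chromosome sequences needs the same treatment is not covered by this sketch.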