rsemeraro / XomeBlender

Generates synthetic cancer genomes with different contamination levels and intra-tumor heterogeneity, without adding any synthetic element
GNU General Public License v3.0

Xome-Blender 3.1

Xome-Blender is a collection of Python and R scripts, based on SAMtools functions, that generates synthetic cancer genomes with user-defined features such as the number of subclones, the number of somatic variants and the presence of CNVs, without adding any synthetic element. It is composed of two modules: InXalizer and Xome-Blender. The first module initializes the blending process. It takes as input a single BAM file and a set of user-defined parameters, and returns the coverage of the sample and the input files for the second module (Xome-Blender). Optionally, it creates a file containing the coordinates at which to insert CNVs in the final product. The second module generates the synthetic heterogeneous sample.

Supported on Linux.

Requirements

Installation

Docker image

Usage

First, run InXalizer. It requires a BAM file, a label for it and a reference genome. Four parameters tune the initialization process:

  1. Subclone number = the number of subclones that will compose the final product (-scn).
  2. Variants number = the number of somatic variants that will appear in the final product (-vn).
  3. Subclonal architecture = the evolution model for the sample synthesis; it can be Linear or Branched (-sa).
  4. CNV = an option that generates a CNV file, defining the number and length of the CNVs (-cnv).

Optionally, a target file in BED format can be used to edit only defined portions of the BAM file (whole-exome or targeted sequencing experiments).
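The four tuning options above can be checked before launching a run. The following is a minimal sketch, not part of Xome-Blender: the helper name `inxalizer_tuning_args` is hypothetical, and it assumes that `-scn` and `-vn` take positive integers and that `-sa` takes the literal values `Linear` or `Branched`. The options for the BAM file, label and reference genome are not shown in this document, so they are left out, as is `-cnv`, whose value format is not described here.

```python
# Hypothetical helper: validates the documented tuning options and returns
# them as a list of command-line arguments. Value formats are assumptions.
ARCHITECTURES = ("Linear", "Branched")


def inxalizer_tuning_args(subclones, variants, architecture):
    """Return the -scn, -vn and -sa arguments as a list of strings."""
    if not isinstance(subclones, int) or subclones < 1:
        raise ValueError("subclone number must be a positive integer")
    if not isinstance(variants, int) or variants < 1:
        raise ValueError("variants number must be a positive integer")
    if architecture not in ARCHITECTURES:
        raise ValueError(f"architecture must be one of {ARCHITECTURES}")
    return ["-scn", str(subclones), "-vn", str(variants), "-sa", architecture]


if __name__ == "__main__":
    print(inxalizer_tuning_args(3, 500, "Branched"))
```

The returned list would still need to be combined with the input options described in the tool's own help output.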

Examples:

Alternatively, it can be run in "automated mode" by using the --list (-l) option. This option requires a list file containing the information needed to run multiple consecutive analyses.

  python3 xome_blender -l list_file.txt

The list_file is a tab-separated file describing the analyses to run, one per line. Each row must contain all the parameters above.

List_file example:
  NA18501_Control.bam NA18501_Subclone1.bam NA18501 145 20 80   120 Subclone1.vcf
  NA18501_Control.bam NA18501_Subclone1.bam NA18501 145 30 70   90  Subclone1.vcf
  NA18501_Control.bam NA18501_Subclone1.bam NA18501 145 40 60   140 Subclone1.vcf   my_label_CNV.txt
  NA18501_Control.bam NA18501_Subclone1.bam NA18501 145 50 50   50  Subclone1.vcf   my_label_CNV.txt
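Because a malformed row would only surface once the run has started, it can help to check the list file first. The following is a minimal sketch, not part of Xome-Blender: the function name `read_list_file` is hypothetical, and the expected field count (8, or 9 when a trailing CNV file is given) is an assumption inferred from the example rows above rather than a documented rule.

```python
import csv
import io


def read_list_file(handle):
    """Return one list of fields per analysis row, skipping blank lines."""
    rows = []
    reader = csv.reader(handle, delimiter="\t")
    for line_no, fields in enumerate(reader, start=1):
        # Drop empty fields produced by repeated tabs or trailing whitespace.
        fields = [f.strip() for f in fields if f.strip()]
        if not fields:
            continue
        # Assumption: 8 fields, plus an optional CNV file as a 9th field.
        if len(fields) not in (8, 9):
            raise ValueError(
                f"line {line_no}: expected 8 or 9 fields, got {len(fields)}"
            )
        rows.append(fields)
    return rows


if __name__ == "__main__":
    sample = (
        "NA18501_Control.bam\tNA18501_Subclone1.bam\tNA18501\t145\t20\t80\t120\tSubclone1.vcf\n"
        "NA18501_Control.bam\tNA18501_Subclone1.bam\tNA18501\t145\t40\t60\t140\tSubclone1.vcf\tmy_label_CNV.txt\n"
    )
    for row in read_list_file(io.StringIO(sample)):
        print(len(row), row[-1])
```

For a file on disk, the same function can be called with `open("list_file.txt", newline="")` in place of the `io.StringIO` object.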

Contacts

This program was developed by Roberto Semeraro, Department of Experimental and Clinical Medicine, University of Florence.