igv-reports - A Python application to generate self-contained HTML reports for variant review and other genomic applications. Reports consist of a table of genomic sites and an embedded IGV genome browser for viewing data for each site. The tool extracts slices of data for each site and embeds the data as blobs in the HTML report file. The report can be opened in a web browser as a static page, with no dependency on the original input files.
igv-reports requires Python 3.8 or greater.
pip install igv-reports
igv-reports requires the package pysam version 0.22.0 or greater, which should be installed automatically. However, on
OS X this sometimes fails due to missing dependent libraries. This can be fixed by following the procedure below, from the
pysam docs:
"The recommended way to install pysam is through conda/bioconda.
This will install pysam from the bioconda channel and automatically makes sure that dependencies are installed.
Also, compilation flags will be set automatically, which will potentially save a lot of trouble on OS X."
conda config --add channels r
conda config --add channels bioconda
conda install pysam
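To confirm that an environment meets the minimums stated above (Python 3.8 and pysam 0.22.0), a quick check along the following lines may help. This is a minimal sketch using only the standard library; the helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical and not part of igv-reports, and the parser assumes purely numeric dotted versions.

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn a dotted version such as '0.22.0' into a tuple of ints.

    Assumes numeric components only; suffixes like 'rc1' are not handled.
    """
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    have, need = parse_version(installed), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that '0.22' and '0.22.0' compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


def check_environment():
    """Report whether Python and pysam satisfy the documented minimums."""
    python_ok = sys.version_info >= (3, 8)
    try:
        pysam_ok = meets_minimum(metadata.version("pysam"), "0.22.0")
    except (metadata.PackageNotFoundError, ValueError):
        # pysam is missing, or its version string is not purely numeric.
        pysam_ok = False
    return python_ok, pysam_ok


if __name__ == "__main__":
    print(check_environment())
```

If the second value is False, installing pysam through conda as described above is one way to proceed.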
Reports are created with the command line script create_report, or alternatively python igv_reports/report.py. Command line arguments
are described below. Although --tracks is optional, a typical report will include at least one alignment track
(a BAM or CRAM file) from which the variants were called.
Arguments:
Required
The arguments begin, end, and sequence are required for a generic tab delimited sites file.
Optional coordinate system flag for generic tab delimited sites file only
false.
Optional
url and indexURL properties should be set to the paths of the respective files.
<script> tags in the page.
BASE, STRAND, INSERT_SIZE, MATE_CHR, and NONE. Default value is BASE for single nucleotide variants, NONE (no sorting) otherwise. See the igv.js documentation for more information.
--exclude-flags 0. See samtools documentation for more details.
--idlink 'https://www.ncbi.nlm.nih.gov/snp/?term=$$'
samtools view documentation for more details
Track file formats:
Currently supported track file formats are BAM, CRAM, VCF, BED, GFF3, GTF, WIG, and BEDGRAPH. FASTA, BAM, CRAM, and
VCF files must be indexed. Tabix is supported and it is recommended that all large files be indexed.
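Because indexed inputs are required, a pre-flight check can catch a missing index before a report is built. The sketch below lists the conventional index file names for local files (.bai, .crai, .tbi or .csi, .fai). The function names are hypothetical, and whether igv-reports looks for exactly these paths is an assumption.

```python
import os

# Conventional index suffixes keyed by data file suffix.
# Assumption: these follow the usual samtools/tabix naming conventions.
INDEX_SUFFIXES = {
    ".bam": [".bai", ".csi"],
    ".cram": [".crai"],
    ".vcf.gz": [".tbi", ".csi"],
    ".fa": [".fai"],
    ".fasta": [".fai"],
}


def candidate_index_paths(path):
    """Return the index paths conventionally associated with a data file."""
    lower = path.lower()
    for data_suffix, index_suffixes in INDEX_SUFFIXES.items():
        if lower.endswith(data_suffix):
            candidates = [path + suffix for suffix in index_suffixes]
            if data_suffix == ".bam":
                # Some tools write sample.bai rather than sample.bam.bai.
                candidates.append(path[: -len(".bam")] + ".bai")
            return candidates
    return []  # no index convention listed here for this format


def has_index(path):
    """True if any conventional index exists next to a local file."""
    return any(os.path.exists(p) for p in candidate_index_paths(path))
```

This only inspects the local file system; remote URLs would need a different check.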
Data for the examples are available in the GitHub repository https://github.com/igvteam/igv-reports. The repository can be downloaded as a zip archive from https://github.com/igvteam/igv-reports/archive/refs/heads/master.zip. It is assumed that the examples are run from the root directory of the repository. Output HTML is written to the examples directory.
create_report test/data/variants/variants.vcf.gz \
--fasta https://igv-genepattern-org.s3.amazonaws.com/genomes/seq/hg38/hg38.fa \
--ideogram test/data/hg38/cytoBandIdeo.txt \
--flanking 1000 \
--info-columns GENE TISSUE TUMOR COSMIC_ID GENE SOMATIC \
--samples reads_1_fastq \
--sample-columns DP GQ \
--tracks test/data/variants/variants.vcf.gz test/data/variants/recalibrated.bam test/data/hg38/refGene.txt.gz \
--output example_vcf.html
create_report test/data/variants/variants.bed \
--genome hg38 \
--flanking 1000 \
--info-columns GENE TISSUE TUMOR COSMIC_ID GENE SOMATIC \
--tracks test/data/variants/variants.bed test/data/variants/recalibrated.bam \
--output example_genome.html
create_report test/data/variants/tcga_test.maf \
--genome hg19 \
--flanking 1000 \
--info-columns Chromosome Start_position End_position Variant_Classification Variant_Type Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2 dbSNP_RS \
--tracks test/data/variants/tcga_test.maf \
--output example_maf.html
create_report test/data/variants/test.maflite.tsv \
--genome hg19 \
--sequence 1 --begin 2 --end 3 \
--flanking 1000 \
--info-columns chr start end ref_allele alt_allele \
--output example_tab.html
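The --sequence, --begin, and --end values in the command above name the columns that hold the chromosome, start, and end of each site. A rough sketch of that column selection follows. It assumes the column numbers are 1-based (consistent with `--sequence 1 --begin 2 --end 3` lining up with the chr, start, and end columns) and that the file has a header row; the function `read_sites` is hypothetical, not igv-reports code.

```python
import csv
import io


def read_sites(handle, sequence, begin, end, has_header=True):
    """Yield (chromosome, start, end) tuples from a tab-delimited stream.

    sequence, begin, and end are column numbers, assumed to be 1-based.
    """
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the column-name row
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # ignore blank lines and comment lines
        yield row[sequence - 1], int(row[begin - 1]), int(row[end - 1])


# Made-up rows for illustration only; not taken from test.maflite.tsv.
example = io.StringIO(
    "chr\tstart\tend\tref_allele\talt_allele\n"
    "chr1\t100\t100\tA\tG\n"
    "chr2\t250\t252\tTTA\tT\n"
)
sites = list(read_sites(example, sequence=1, begin=2, end=3))
print(sites)
```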
create_report test/data/variants/SKBR3_Sniffles_sv.vcf \
--genome hg19 \
--flanking 1000 \
--maxlen 10500 \
--info-columns SVLEN \
--tracks test/data/variants/SKBR3_Sniffles_sv.vcf https://igv-genepattern-org.s3.amazonaws.com/test/bam/reads_lr_skbr3.sampled.bam \
--output example_sv.html
create_report test/data/variants/SKBR3_Sniffles_tra.bedpe \
--genome hg19 \
--flanking 1000 \
--tracks test/data/variants/SKBR3_Sniffles_variants_tra.vcf test/data/variants/SKBR3.ill.bam \
--output example_bedpe.html
create_report test/data/variants/variants.vcf.gz \
--genome hg38 \
--flanking 1000 \
--info-columns GENE TISSUE TUMOR COSMIC_ID GENE SOMATIC \
--tracks test/data/variants/variants.vcf.gz test/data/variants/recalibrated.bam \
--output example_genome.html
create_report test/data/variants/variants.vcf.gz \
--fasta https://igv-genepattern-org.s3.amazonaws.com/genomes/seq/hg38/hg38.fa \
--ideogram test/data/hg38/cytoBandIdeo.txt \
--flanking 1000 \
--info-columns GENE TISSUE TUMOR COSMIC_ID GENE SOMATIC \
--track-config test/data/variants/trackConfigs.json \
--output example_config.html
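The file passed to --track-config in the command above holds track configurations whose url and indexURL properties point at the data and index files. The sketch below writes such a file. It assumes the file is a JSON array of track objects; the track name, the `.bai` index path, and the output file name are illustrative guesses, not the contents of trackConfigs.json.

```python
import json
import os
import tempfile


def write_track_config(tracks, path):
    """Write a list of track dictionaries to path as a JSON array."""
    for track in tracks:
        if "url" not in track:
            raise ValueError("each track needs a 'url' property")
    with open(path, "w") as handle:
        json.dump(tracks, handle, indent=2)


tracks = [
    {
        "name": "Alignments",  # display name chosen for illustration
        "url": "test/data/variants/recalibrated.bam",
        # Assumed index location; adjust to where the index actually is.
        "indexURL": "test/data/variants/recalibrated.bam.bai",
    }
]

# Written to a temporary directory so the sketch has no side effects
# on the repository.
config_path = os.path.join(tempfile.mkdtemp(), "my_tracks.json")
write_track_config(tracks, config_path)
print(config_path)
```

The resulting path could then be supplied as the --track-config value.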
create_report test/data/variants/1kg_phase3_sites.vcf.gz \
--genome hg19 \
--flanking 1000 \
--tracks test/data/variants/1kg_phase3_sites.vcf.gz test/data/variants/NA12878_lowcoverage.bam \
--idlink 'https://www.ncbi.nlm.nih.gov/snp/?term=$$' \
--output example_idlink.html
create_report test/data/junctions/Introns.38.bed \
--genome hg38 \
--type junction \
--track-config test/data/junctions/tracks.json \
--info-columns TCGA GTEx variant_name \
--title "Sample A" \
--output example_junctions.html
create_report test/data/fusion/igv.fusion_inspector_web.json \
--fasta test/data/fusion/igv.genome.fa \
--template igv_reports/templates/fusion_template.html \
--track-config test/data/fusion/tracks.json \
--output example_fusion.html
create_report test/data/wig/regions.bed \
--genome hg19 \
--exclude-flags 512 \
--tracks test/data/wig/ucsc.bedgraph test/data/wig/mixed_step.wig test/data/wig/variable_step.wig \
--output example_wig.html
Example using the --info-columns-prefixes option. Variant track only, no alignments.
python igv_reports/report.py test/data/annotated_vcf/consensus.filtered.ann.vcf \
--genome hg19 \
--flanking 1000 \
--info-columns cosmic_gene \
--info-columns-prefixes clinvar \
--tracks test/data/annotated_vcf/consensus.filtered.ann.vcf \
--output example_ann.html
Use the --exclude-flags option to include duplicate alignments in the report by specifying a samtools --exclude-flags
value. The default value is 1536, which filters duplicates and vendor-failed reads.
create_report test/data/dups/dups.bed \
--genome hg19 \
--exclude-flags 512 \
--tracks test/data/dups/dups.bam \
--output example_dups.html
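The flag values above are sums of SAM flag bits: 512 marks reads that failed vendor quality checks and 1024 marks duplicates, so the default of 1536 excludes both, while 512 keeps duplicates. The sketch below shows the bitwise test that such a filter implies; the constant and function names are illustrative and not taken from igv-reports.

```python
QC_FAIL = 0x200    # 512: read failed platform/vendor quality checks
DUPLICATE = 0x400  # 1024: PCR or optical duplicate


def is_excluded(read_flag, exclude_flags):
    """A read is dropped if it shares any set bit with exclude_flags."""
    return (read_flag & exclude_flags) != 0


default_exclude = QC_FAIL | DUPLICATE
print(default_exclude)  # 1536

# A paired, properly paired read that is also marked as a duplicate.
duplicate_read = 0x1 | 0x2 | DUPLICATE

print(is_excluded(duplicate_read, 1536))  # True: dropped by the default
print(is_excluded(duplicate_read, 512))   # False: duplicates are kept
print(is_excluded(duplicate_read, 0))     # False: nothing is filtered
```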
Use the --no-embed option to use external URL references for tracks in the report.
create_report test/data/variants/variants.vcf.gz \
--genome hg38 \
--no-embed \
--tracks https://igv-genepattern-org.s3.amazonaws.com/test/reports/variants.vcf.gz https://igv-genepattern-org.s3.amazonaws.com/test/reports/recalibrated.bam \
--output example_noembed.html
The script create_datauri (python igv_reports/datauri.py) converts the contents of a file to a data URI for
use in igv.js. The data URI is printed to stdout. NOTE: It is not necessary to run this script explicitly to
create a report; it is documented here for use with stand-alone igv.js.
Convert a gzipped vcf file to a datauri.
create_datauri test/data/variants/variants.vcf.gz
Convert a slice of a local bam file to a datauri.
create_datauri --region chr5:474,969-475,009 test/data/variants/recalibrated.bam
Convert a remote bam file to a datauri.
create_datauri --region chr5:474,969-475,009 https://1000genomes.s3.amazonaws.com/phase3/data/NA12878/alignment/NA12878.mapped.ILLUMINA.bwa.CEU.low_coverage.20121211.bam
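For orientation, a data URI packs a file's bytes into a string of the form `data:<media type>;base64,<payload>`. The sketch below builds and decodes one with the standard library. It is not the create_datauri implementation; the media type, the function names, and the sample bytes are assumptions made for illustration.

```python
import base64


def to_data_uri(data, media_type="application/octet-stream"):
    """Encode raw bytes as a base64 data URI."""
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{media_type};base64,{payload}"


def from_data_uri(uri):
    """Decode the payload of a base64 data URI back to bytes."""
    header, payload = uri.split(",", 1)
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("only base64 data URIs are handled here")
    return base64.b64decode(payload)


# Made-up bytes standing in for a slice of a track file.
sample = b"chr5\t474969\t475009\n"
uri = to_data_uri(sample)
print(uri[:40])
print(from_data_uri(uri) == sample)  # True
```

Because the payload is plain text, a string like this can be used wherever a track URL is expected, which is what allows a report to carry its data inline.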