TT-Mars: Structural Variants Assessment Based on Haplotype-resolved Assemblies.

1. Change into the repository directory with `cd TT-Mars`. Python >= 3.8 is preferred.
2. Create and activate a conda environment: `conda create -n ttmars` and `conda activate ttmars`.
3. Run `dowaload_files.sh` to download required files to `./ttmars_files`.
4. Run `download_asm.sh` to download assembly files of 10 samples from HGSVC.
5. Install the required packages: `conda install -c bioconda pysam`, `conda install -c anaconda numpy`, `conda install -c bioconda mappy`, `conda install -c conda-forge biopython`, `conda install -c bioconda pybedtools`.
6. `run_ttmars.sh` includes more instructions. After setup, users can run it to launch TT-Mars.
7. The main program: run `python ttmars.py -h` for help.
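
Before running the main program, it can help to confirm that the packages installed above are importable from the active environment. The following is a minimal sketch, not part of TT-Mars itself; the helper name `missing_modules` is hypothetical, and the only assumption is the standard import name for each package (for example, biopython is imported as `Bio`).

```python
import importlib.util

# Import names for the conda packages listed above.
# Note: the biopython package is imported as "Bio".
REQUIRED_MODULES = ["pysam", "numpy", "mappy", "Bio", "pybedtools"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names in `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Run it inside the `ttmars` environment; any name it prints can be installed with the matching `conda install` command from the list above.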

`python ttmars.py output_dir files_dir centro_file vcf_file reference asm_h1 asm_h2 tr_file num_X_chr`

- `output_dir`: Output directory.
- `files_dir`: Input files directory, `./ttmars_files/sample_name`. This is the directory where the required files are stored after running `dowaload_files.sh`.
- `centro_file`: Provided centromere file.
- `vcf_file`: Callset file, `callset.vcf(.gz)`.
- `reference`: Reference file, `reference_genome.fasta`.
- `asm_h1`: Assembly file `assembly1.fa`, downloaded by running `download_asm.sh`.
- `asm_h2`: Assembly file `assembly2.fa`, downloaded by running `download_asm.sh`.
- `tr_file`: Provided tandem repeats file.
- `num_X_chr`: 1 for a male sample, 2 for a female sample.

- `-n/--not_hg38`: Set if the reference is NOT hg38/chm13 (hg19).
- `-p/--passonly`: Consider PASS calls only.
- `-s/--seq_resolved`: Consider sequence-resolved calls.
- `-w/--wrong_len`: Count wrong-length calls as True.
- `-g/--gt_vali`: Conduct genotype validation.
- `-i/--gt_info`: Index with GT info (for phased callsets).
- `-d/--phased`: Take phased information (for phased callsets).
- `-v/--vcf_out`: Output results as VCF files: tp (true positive), fp (false positive) and na.
- `-f/--false_neg`: Output recall; must be used together with `-t/--truth_file`.
- `-t/--truth_file`: Input truth VCF file; must be used together with `-f/--false_neg`.
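
The positional order and the `-f`/`-t` pairing described above can be checked before launching a run. The sketch below assembles the command line as a list and rejects inconsistent combinations. It is illustrative only: `build_ttmars_command` is a hypothetical helper, the example paths `./ttmars_out`, `centromere.txt` and `tandem_repeats.bed` are placeholders, and it assumes that every option except `-t` is a plain switch that takes no value.

```python
# Positional arguments in the order given in the usage line.
POSITIONAL_ORDER = [
    "output_dir", "files_dir", "centro_file", "vcf_file", "reference",
    "asm_h1", "asm_h2", "tr_file", "num_X_chr",
]


def build_ttmars_command(args, flags=(), truth_file=None):
    """Return the ttmars.py command as a list of strings.

    args: dict mapping every name in POSITIONAL_ORDER to its value.
    flags: switches such as "-p" or "-g" (assumed to take no value).
    truth_file: path passed to -t (assumed to be the only option with a value).
    """
    missing = [name for name in POSITIONAL_ORDER if name not in args]
    if missing:
        raise ValueError("missing positional arguments: " + ", ".join(missing))
    if str(args["num_X_chr"]) not in ("1", "2"):
        raise ValueError("num_X_chr must be 1 (male) or 2 (female)")

    flags = list(flags)
    wants_recall = "-f" in flags or "--false_neg" in flags
    if wants_recall != (truth_file is not None):
        raise ValueError("-f/--false_neg and -t/--truth_file must be used together")

    cmd = ["python", "ttmars.py"]
    cmd += [str(args[name]) for name in POSITIONAL_ORDER]
    cmd += flags
    if truth_file is not None:
        cmd += ["-t", str(truth_file)]
    return cmd


# Example with placeholder paths (hypothetical file names, female sample).
example_args = {
    "output_dir": "./ttmars_out",
    "files_dir": "./ttmars_files/sample_name",
    "centro_file": "centromere.txt",
    "vcf_file": "callset.vcf.gz",
    "reference": "reference_genome.fasta",
    "asm_h1": "assembly1.fa",
    "asm_h2": "assembly2.fa",
    "tr_file": "tandem_repeats.bed",
    "num_X_chr": 2,
}
example_cmd = build_ttmars_command(example_args, flags=["-p", "-g"])
print(" ".join(example_cmd))
```

The returned list can be passed to `subprocess.run(example_cmd, check=True)` from the TT-Mars directory. Building a list rather than a single string avoids shell quoting problems with paths that contain spaces.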

ttmars_combined_res.txt:

| SV index | relative length | relative score | validation result | chr | start | end | Type | Genotype Match |
|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 3.48 | True | chr1 | 249912 | 249912 | INS | True |
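
A result file with the columns shown above can be loaded for downstream summaries. The sketch below is a rough illustration under stated assumptions: it assumes `ttmars_combined_res.txt` is tab-delimited with the nine columns in the order listed in the table, which should be checked against an actual output file. The names `read_results` and `true_fraction` are hypothetical and not part of TT-Mars.

```python
import csv
import io

# Column names follow the table above; the on-disk layout is an assumption.
COLUMNS = [
    "sv_index", "relative_length", "relative_score", "validation_result",
    "chr", "start", "end", "type", "genotype_match",
]


def read_results(handle, delimiter="\t"):
    """Parse result rows from an open text handle into a list of dicts."""
    rows = []
    for fields in csv.reader(handle, delimiter=delimiter):
        fields = [field.strip() for field in fields]
        # Skip blank lines, short lines, and any header line.
        if len(fields) != len(COLUMNS) or not fields[0].isdigit():
            continue
        row = dict(zip(COLUMNS, fields))
        row["sv_index"] = int(row["sv_index"])
        row["relative_length"] = float(row["relative_length"])
        row["relative_score"] = float(row["relative_score"])
        row["start"] = int(row["start"])
        row["end"] = int(row["end"])
        row["validation_result"] = row["validation_result"] == "True"
        row["genotype_match"] = row["genotype_match"] == "True"
        rows.append(row)
    return rows


def true_fraction(rows):
    """Fraction of parsed calls whose validation result is True."""
    if not rows:
        return 0.0
    return sum(row["validation_result"] for row in rows) / len(rows)


# The example row from the table, written as one tab-delimited line.
sample = "0\t1.0\t3.48\tTrue\tchr1\t249912\t249912\tINS\tTrue\n"
rows = read_results(io.StringIO(sample))
print(rows[0]["chr"], rows[0]["type"], true_fraction(rows))
```

For a real file, replace the `io.StringIO` handle with `open("ttmars_combined_res.txt", newline="")` inside a `with` block, and pass a different `delimiter` if the file turns out not to be tab-delimited.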