Hello,

I'm using SVJedi to genotype structural variants discovered de novo by Sniffles (I'm actually more interested in allele frequencies, since my organism is haploid). I have many samples that I'd like to genotype against the same input VCF file, so I was interested in SVJedi's option to supply multiple read files. I used the option -i file1.fasta file2.fasta.

By doing so, I was expecting to get one column per sample in the output VCF file, with the results organized according to the FORMAT column (GT:DP:AD:PL). Instead, I get only one SAMPLE column with results. Am I misunderstanding what multiple files should represent? Are the reads pooled together, no matter which file they come from? If so, being able to run multiple samples at the same time would be a nice feature, but I can always run SVJedi on each sample separately and parse the files afterwards.
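For reference, the per-sample fallback could be sketched roughly as below. This is only a minimal sketch under my own assumptions: each run produces a single-sample VCF whose last tab-separated column holds the GT:DP:AD:PL values, variants can be matched on (CHROM, POS, ID), and the function names and the "." placeholder for missing variants are hypothetical choices of mine, not part of SVJedi.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str, str]  # (CHROM, POS, ID), assumed to identify a variant


def sample_fields(vcf_lines: Iterable[str]) -> Dict[Key, str]:
    """Map each variant to the last column of its record (the sample field)."""
    fields: Dict[Key, str] = {}
    for line in vcf_lines:
        # Skip meta lines, the column header line, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        fields[(cols[0], cols[1], cols[2])] = cols[-1]
    return fields


def merge_samples(per_sample: Dict[str, Iterable[str]]) -> List[List[str]]:
    """Build a table with one row per variant and one column per sample.

    `per_sample` maps a sample name (hypothetical, chosen by the user) to the
    lines of that sample's single-sample VCF. Variants missing from a sample
    are filled with ".".
    """
    names = list(per_sample)
    tables = {name: sample_fields(lines) for name, lines in per_sample.items()}

    # Collect variant keys in first-seen order across all samples.
    keys: List[Key] = []
    seen = set()
    for name in names:
        for key in tables[name]:
            if key not in seen:
                seen.add(key)
                keys.append(key)

    rows = [["CHROM", "POS", "ID"] + names]
    for key in keys:
        rows.append(list(key) + [tables[name].get(key, ".") for name in names])
    return rows
```

Each value of `per_sample` can simply be an open file handle for one output VCF, and the resulting rows can be written out as a tab-separated table.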

Also, the last line of the header (the column header line) does not seem to match the columns in the rest of the file: it appears to be a copy of the column header line from the input VCF file, with FORMAT and SAMPLE appended at the end. For example, in your file Data/HG002_son/expected_genotype_results.vcf the column header line contains 12 fields (see below), but there are only 10 columns in the rest of the file.

CHROM POS ID REF ALT QUAL FILTER INFO FORMAT HG002 FORMAT SAMPLE
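A quick way to confirm the mismatch is to compare the number of fields in the column header line with the number of fields in the records. The snippet below is a minimal sketch, assuming a tab-separated VCF whose column header line starts with a single "#"; the function name `column_counts` is hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple


def column_counts(vcf_lines: Iterable[str]) -> Tuple[Optional[int], Set[int]]:
    """Return (fields in the column header line, set of field counts in records)."""
    header_n: Optional[int] = None
    data_counts: Set[int] = set()
    for line in vcf_lines:
        line = line.rstrip("\n")
        # Skip blank lines and "##" meta-information lines.
        if not line or line.startswith("##"):
            continue
        n = len(line.split("\t"))
        if line.startswith("#"):
            header_n = n  # the "#CHROM ..." column header line
        else:
            data_counts.add(n)
    return header_n, data_counts
```

Calling `column_counts(open(path))` on an output file returns the header field count and the set of record field counts; the file is consistent only if that set contains exactly the header count.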

Thank you for developing SVJedi and thank you for your help,

Hugo

How does svjedi work with multiple read files? #10