ryanlayer / samplot

Plot structural variant signals from many BAMs and CRAMs
MIT License

no plottable samples with matched alignment files #164

Open Duda5 opened 1 year ago

Duda5 commented 1 year ago

I generated a vcf file with sniffles (on PacBio HiFi reads aligned to hg38) and I am trying to visualize SVs using samplot vcf (conda installation). vcf file stats:

Len             Del     Dup   Inv   INS     TRA   UNK
0-50bp          3326    0     0     3605    162   0
50-100bp        3471    1     0     4094    0     0
100-1000bp      5493    4     14    8393    0     0
1000-10000bp    1180    16    28    1534    0     0
10000+bp        72      20    15    15      0     0

The samplot command I used:

samplot vcf --vcf HG033_sniffles_SVs.vcf -d . --sample_ids HG033_sniffles -b HG033_hg38_mnp2.bam --plot_all

But the output is only an empty index.html file.

When I include the --debug flag, two types of errors recur:

1) For INS: "INS type not supported". I have seen this error in other issues and, if I understand it correctly, samplot does not visualize SNPs.
2) For DEL (of any size): "no plottable samples with matched alignment files". What could be causing that?

Also, I do not understand why longer insertions and other types of SVs are not visualized.

mchowdh200 commented 1 year ago

Insertions are not currently supported by Samplot, but we are in the planning stages of implementing them. As for the deletions, can you try plotting some individual regions from the VCF with samplot plot?
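One way to try individual regions is to turn the DEL records of the VCF into separate samplot plot commands. The following is only a sketch: the function name vcf_to_samplot_cmds is made up for illustration, it assumes each record carries SVTYPE and END in the INFO column (as Sniffles output normally does), and the flags (-n, -b, -c, -s, -e, -t, -o) follow the samplot plot usage string, so check them against samplot plot -h for your version.

```shell
# Print one `samplot plot` command per DEL record of a VCF read on stdin.
# $1 = BAM path, $2 = sample name (both supplied by the caller).
# Hypothetical helper: it only prints commands, it does not run samplot.
vcf_to_samplot_cmds() {
  awk -F'\t' -v bam="$1" -v name="$2" '
    /^#/ { next }                      # skip header lines
    {
      svtype = ""; stop = ""
      n = split($8, info, ";")         # INFO is the 8th VCF column
      for (i = 1; i <= n; i++) {
        if (info[i] ~ /^SVTYPE=/) svtype = substr(info[i], 8)
        if (info[i] ~ /^END=/)    stop   = substr(info[i], 5)
      }
      if (svtype == "DEL" && stop != "")
        printf "samplot plot -n %s -b %s -c %s -s %s -e %s -t DEL -o %s_%s_%s.png\n",
               name, bam, $1, $2, stop, $1, $2, stop
    }'
}

# Example: show the first few commands, then run one by hand.
# vcf_to_samplot_cmds HG033_hg38_mnp2.bam HG033_sniffles < HG033_sniffles_SVs.vcf | head -n 5
```

Printing the commands instead of executing them makes it easy to pick a single deletion and see whether samplot plot succeeds where samplot vcf does not.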

mchowdh200 commented 1 year ago

Also, based on the error, check the sample name in the VCF and see if it matches the sample name in the BAM
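A quick way to compare the two is to list the sample names on each side. This is a rough sketch, not part of samplot: rg_samples and vcf_samples are hypothetical helper names, and it assumes the BAM sample name lives in the SM field of the @RG header lines and that samtools is installed for the usage lines.

```shell
# Print the unique SM (sample) values from @RG lines of a SAM header on stdin.
# Prints nothing if the header has no @RG line.
rg_samples() {
  awk -F'\t' '/^@RG/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SM:/) print substr($i, 4)
  }' | sort -u
}

# Print the sample column names (columns 10 and up of the #CHROM line)
# of a VCF read on stdin.
vcf_samples() {
  awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
}

# Usage (assumes samtools is on PATH):
# samtools view -H HG033_hg38_mnp2.bam | rg_samples
# vcf_samples < HG033_sniffles_SVs.vcf
```

If the first command prints nothing, the BAM has no read group at all, which is worth ruling out before comparing names.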

Duda5 commented 1 year ago

Thanks for the prompt reply, @mchowdh200! The sample name in the VCF file was correct, but I noticed that there was no @RG tag in my BAM file. I used samtools addreplacerg to add it back and it works now!
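For anyone hitting the same error, the read-group fix might look roughly like the sketch below. The wrapper name add_read_group and the choice to use the sample name for both ID and SM are assumptions for illustration; the exact command used in this thread was not posted.

```shell
# Add an @RG header line (and matching RG tags on the reads) to a BAM,
# then index the result.
# $1 = input BAM, $2 = output BAM, $3 = sample name (used for ID and SM).
# Hypothetical wrapper around samtools addreplacerg and samtools index.
add_read_group() {
  samtools addreplacerg -r "ID:$3" -r "SM:$3" -o "$2" "$1" &&
    samtools index "$2"
}

# Usage (file names are placeholders):
# add_read_group HG033_hg38_mnp2.bam HG033_hg38_mnp2.rg.bam HG033_sniffles
```

Writing to a new file keeps the original BAM untouched; the new file is then the one to pass to samplot vcf with -b.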

However, I have another issue. I prepared the .gff3 file and indexed it as recommended (but I used the Homo_sapiens.GRCh38.107.gff3.gz version, as I aligned my reads to hg38).

So, I have the annotation column in the index.html file: [image]

But none of the images actually contain the gene track. For example, this deletion has the overlap column value "gene": [image]

I supplied the sorted gff3 file via the --gff3 flag, so I am not sure why the track is not displayed below the coverage plot.

mchowdh200 commented 1 year ago

From the usage string, I saw this description for the --gff3 option

                      used when building HTML table and table filters
                      (default: None)

This leads me to believe that it's used for some sort of table filtering and not for plotting the track. In samplot plot we have the -T option for plotting the gff track underneath the samples, but it doesn't seem to be part of samplot vcf yet.

pontushojer commented 1 year ago

> This leads me to believe that it's used for some sort of table filtering and not for plotting the track. In samplot plot we have the -T option for plotting the gff track underneath the samples, but it doesn't seem to be part of samplot vcf yet.

It is actually possible to add additional samplot plot arguments to samplot vcf that will be passed on to each samplot plot call. For the command above it would be something like:

samplot vcf --vcf HG033_sniffles_SVs.vcf -d . --sample_ids HG033_sniffles -b HG033_hg38_mnp2.bam --plot_all --gff3 Homo_sapiens.GRCh38.107.gff3.gz -T Homo_sapiens.GRCh38.107.gff3.gz