igvteam / igv-reports

Python application to generate self-contained pages embedding IGV visualizations, with no dependency on original input files.
MIT License

Report fails to load when the bam file is big (200 MB) #54

Closed XuewenWangUGA closed 3 years ago

XuewenWangUGA commented 3 years ago

I found an issue: when the .bam alignment file is about 200 MB, the report fails to display. There is no error message other than the browser complaining that the page is loading slowly. After repeated testing, loading always fails. Is there a solution? Should I reduce the size of the sliced bam, or use a relatively small region?

I think the RAM of the computer used for testing is large enough (126 GB). As a control, I loaded the same .bam file in the IGV browser and it works well.

jrobinso commented 3 years ago

Do you mean the original bam file that is input to the report? I test with a 3.1 GB bam file. The size of the bam file is not relevant to the report, although the depth of coverage might be. It would help if you posted the command-line parameters you used to generate the report. Also, how large is the generated .html file?

XuewenWangUGA commented 3 years ago

The command: create_report examples/variants/variants_short.bed.txt hg38/hg38.fa --flanking 1000 --sequence 1 --begin 2 --end 3 --info-columns MarkerName Chromosome Start End --tracks examples/variants/m64011_210305_012434.hifi_reads.demux.GRCh38.mq255.chr5:150066324-150086375.bam --output CSF1PO.html

The bam "m64011_210305_012434.hifi_reads.demux.GRCh38.mq255.chr5:150066324-150086375.bam" is a subset of the full bam that covers only the locus CSF1PO (region chr5:150066324-150086375); it contains no aligned data for other locations. The output CSF1PO.html was generated successfully. The report displays fine for the other loci, where the bam is empty. But for the CSF1PO site, the bam track fails to display.

The same data can be viewed normally in the IGV browser. The reference is the same as for the example data. It seems that the region does not change to the expected range after clicking the CSF1PO row. Please see the image below.

(image)

My testing data:
html file: https://drive.google.com/file/d/1TVnvlsjfyvyVztLgyJCwDAOcD_So18vn/view?usp=sharing
bam file: https://drive.google.com/file/d/1laubyYxi8OC46q0bM6zVyFqUBtVrk9wj/view?usp=sharing
bam.bai: https://drive.google.com/file/d/1TcxqhoaY0Bkv2zdtgbHLAqYiGNJob-ch/view?usp=sharing

XuewenWangUGA commented 3 years ago

The reference is hg38.fa.

jrobinso commented 3 years ago

The problem here, I think, is the depth of coverage, which is more than 3,000X. I suggest you either downsample the file to a lower coverage, for example with the subsampling option of samtools view (-s), or reduce the flanking region around the variant. For flanking you could try 100 and, if that is successful, increase it until it fails.
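
For anyone landing here later, a minimal Python sketch of the two suggestions above (downsampling and a smaller flanking region). It only builds the command and the list of values to try; it does not run anything. The 3,000X depth comes from the comment above. The target depth of 100, the seed, the file names, and the doubling schedule are assumptions for illustration, not recommendations from the igv-reports authors. The `-s` option of `samtools view` takes a value of the form SEED.FRACTION, where the fractional part is the share of reads to keep.

```python
import shlex


def subsample_arg(current_depth, target_depth, seed=42):
    """Return a SEED.FRACTION string for `samtools view -s`.

    Returns None when no downsampling is needed (target >= current),
    or when the fraction rounds to 0 or 1 at four decimal places.
    """
    fraction = target_depth / current_depth
    frac_str = f"{fraction:.4f}"           # e.g. "0.0333"
    if fraction >= 1 or frac_str in ("0.0000", "1.0000"):
        return None
    return f"{seed}{frac_str[1:]}"         # e.g. "42.0333"


def downsample_cmd(in_bam, out_bam, current_depth, target_depth, seed=42):
    """Build (but do not run) a samtools command that keeps a random subset of reads."""
    arg = subsample_arg(current_depth, target_depth, seed)
    if arg is None:
        return None
    # -b writes BAM output, -s subsamples, -o names the output file
    return ["samtools", "view", "-b", "-s", arg, "-o", out_bam, in_bam]


def flanking_steps(start=100, limit=1000):
    """Yield flanking sizes to try, doubling from `start` up to `limit`."""
    value = start
    while value <= limit:
        yield value
        value *= 2


if __name__ == "__main__":
    # Hypothetical file names; substitute the real sliced bam.
    cmd = downsample_cmd("CSF1PO.bam", "CSF1PO.sub.bam",
                         current_depth=3000, target_depth=100)
    print(shlex.join(cmd))
    print(list(flanking_steps()))
```

The downsampled bam would then replace the original in `--tracks`, and each value from `flanking_steps()` would be passed as `--flanking` in the `create_report` command posted earlier, stopping at the largest value that still loads.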