igvteam / igv-reports

Python application to generate self-contained pages embedding IGV visualizations, with no dependency on original input files.
MIT License

Option to filter/hide duplicate reads in igv-reports? #68

Closed satyakam001 closed 2 years ago

satyakam001 commented 2 years ago

Hello! While using the tool, I get snapshots containing duplicate reads from the BAM file. Is there an option to hide or remove these duplicates? I saw that igv.js has resolved something like this. Is there something I can implement for igv-reports? Thank you

jrobinso commented 2 years ago

This should be possible. I can't give an ETA, but if you want to do this yourself the relevant code is in "igv_reports/bam.py", specifically the "splice" function. This uses pysam, which in turn uses samtools. I think adding the filter "-F 1024" to args should filter reads marked duplicate, but that would need to be confirmed with a test.
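For reference, the effect of "-F 1024" can be sketched in plain Python. This is only an illustration of the bitmask logic, not the code in "igv_reports/bam.py"; the function name and the example flag values are hypothetical. In the SAM specification, bit 0x400 (1024) marks a read as a PCR or optical duplicate, and samtools "-F" drops any read that has at least one of the given bits set.

```python
# SAM flag bit for "PCR or optical duplicate" (0x400 == 1024), per the SAM spec.
DUPLICATE = 0x400


def passes_exclude_filter(flag, exclude_flags):
    """Mimic `samtools view -F`: keep a read only if none of the excluded bits are set.

    Hypothetical helper for illustration; not part of igv-reports or pysam.
    """
    return (flag & exclude_flags) == 0


# Hypothetical flag values for illustration:
#   99   = paired + proper pair + mate reverse + first in pair
#   147  = paired + proper pair + read reverse + second in pair
#   1123 = 99 with the duplicate bit set (99 + 1024)
#   1171 = 147 with the duplicate bit set (147 + 1024)
flags = [99, 1123, 147, 1171]
kept = [f for f in flags if passes_exclude_filter(f, DUPLICATE)]
print(kept)
```

With an exclude value of 0 no bits are tested, so every read passes.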

satyakam001 commented 2 years ago

I have not installed the program because of Python and pip issues that I can't control, but I'm working with the Docker container found at https://bioconda.github.io/recipes/igv-reports/README.html. Do you think running the Docker container interactively, finding the bam.py code inside it, making the change, and running the container again would work? If I need to rebuild it before running, how would I do that? Or is it more complicated than that? (Please treat me like a beginner.)

jrobinso commented 2 years ago

@satyakam001 Sorry I really don't know anything about Docker but it should work.

You do not need to build if you run the Python script directly, for example from the root directory:

python igv_reports/report.py

To build a distribution

python setup.py sdist bdist_wheel

jrobinso commented 2 years ago

This should be fairly easy, but I need to test it. If you can supply a small BAM file with duplicates that would be helpful and maybe I can get to it in the next few days.

satyakam001 commented 2 years ago

Yes, if I were running it directly I would follow the above commands; I'll have to see how to do this in the Docker container. If you could edit the .py file, build a new image with the edited file, and push it, I could pull it and use it. Anyway, I tried uploading a small test BAM file with duplicates, but it says that the file type isn't supported. Can I not attach BAM files?

jrobinso commented 2 years ago

Try zipping it first. You should be able to upload a zip file.

satyakam001 commented 2 years ago

test_bam_igv_report.zip Please let me know if this doesn't work, I can send other files as well. Thank you!

jrobinso commented 2 years ago

That should work.

jrobinso commented 2 years ago

Hi, this should be fixed now in release 1.6.0. By default alignments marked duplicate will be filtered, but this can be controlled with the optional --exclude-flags parameter. For example:

--exclude-flags 1536   Alignments marked duplicate or vendor failed are filtered.
--exclude-flags 1024   Alignments marked duplicate are filtered.
--exclude-flags 0      No alignments are filtered.

jrobinso commented 2 years ago

Correction -- use release 1.6.1 to use the --exclude-flags option. Version 1.6.0 will filter duplicates but ignores --exclude-flags.
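To see which filters a given --exclude-flags value selects, the number can be decoded bit by bit. The sketch below is a rough illustration that assumes the value is interpreted as a standard SAM flag mask; the helper name and the label strings are hypothetical, and only the two bits discussed in this thread are listed (0x200 = 512, not passing quality controls / vendor failed; 0x400 = 1024, duplicate).

```python
# Bit values from the SAM specification; only the two bits discussed here.
SAM_FLAG_NAMES = {
    0x200: "QC/vendor failed",  # 512
    0x400: "duplicate",         # 1024
}


def describe_exclude_flags(value):
    """Return the names of the known flag bits that are set in `value`.

    Hypothetical helper for illustration; not part of igv-reports.
    """
    return [name for bit, name in sorted(SAM_FLAG_NAMES.items()) if value & bit]


for value in (1536, 1024, 0):
    print(value, describe_exclude_flags(value))
```

Since 1536 = 1024 + 512, that value selects both bits, while 1024 selects only the duplicate bit.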

satyakam001 commented 2 years ago

Thank you so much, I shall be sure to try and let you know if there are any issues. But thank you for this!