igvteam / igv

Integrative Genomics Viewer. Fast, efficient, scalable visualization tool for genomics data and annotations
https://igv.org
MIT License

Adding support for accessing remote Genozip-compressed BAM files - initial enquiry #1398

Closed divonlan closed 10 months ago

divonlan commented 1 year ago

Greetings,

I am the author of Genozip (www.genozip.com), compression software for BAM / FASTQ / VCF etc. We have users who need to open remote (i.e. by URL) BAM files compressed with Genozip (which have a .bam.genozip file name extension) in IGV.

As one possible solution, I am considering writing an IGV patch that will allow a user to specify a remote .bam.genozip file in the existing "Open from URL" UI. IGV would then call a Genozip library (written in C) via JNI to fetch and decompress the required BAM byte ranges. IGV would check whether the library is available on the system, and the functionality would be available only if the library is installed. No changes to the UI and no changes to using a normal BAI file for determining the ranges.

I haven't yet looked at the code very carefully, as I thought it would be prudent to first enquire whether such a patch would be welcome in principle before spending development time in this direction. What say you?

Thanks! Divon

jrobinso commented 1 year ago

Hi Divon, thanks for reaching out. I don't think we could accept a solution that involves JNI. For one thing, I don't know if this would affect our notarization with Apple, for which we use a hardened runtime, but I suspect it might. But more importantly, we can't run general unknown code from IGV. So I have another solution to suggest. If you could distribute a simple web service that listens on a port on the user's computer, we could communicate with that. It's very simple to implement such a service; in fact, IGV itself listens on a port. We could then pass on requests for ".genozip" files to this service as GET requests. Something like

http://localhost:<port>/?url=<url to the genozip file>

or, if you prefer a more RESTful style,

http://localhost:<port>/<url to the genozip file>

The range could be sent as either a byte range or a genomic range, if that is easier. Again, implementing an HTTP listener like this is easy, and there are a number of open source servers and libraries if you prefer.
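
To make the suggestion concrete, here is a minimal sketch of such a listener in Python. It assumes the first URL form above (`/?url=<url to the genozip file>`) and a byte range sent in a standard `Range: bytes=START-END` header. The names `parse_target_url`, `parse_byte_range`, `GenozipProxyHandler`, and `make_server`, and the port 8765, are all hypothetical, and `fetch_decompressed` is a placeholder: how the decompressed bytes would actually be obtained from Genozip is not specified in this thread.

```python
import re
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def parse_target_url(request_path):
    """Return the remote file URL from '/?url=<encoded url>', or None if absent."""
    values = parse_qs(urlparse(request_path).query).get("url")
    return values[0] if values else None


def parse_byte_range(header):
    """Parse 'bytes=START-END' into an inclusive (start, end) pair, else None."""
    if not header:
        return None
    match = re.fullmatch(r"bytes=(\d+)-(\d+)", header.strip())
    if not match:
        return None
    start, end = int(match.group(1)), int(match.group(2))
    return (start, end) if start <= end else None


def fetch_decompressed(url, start, end):
    """Hypothetical placeholder for the Genozip side of the service.

    A real service would return the decompressed BAM bytes start..end
    of the remote file; that interface is an assumption, not a known API.
    """
    raise NotImplementedError


class GenozipProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        url = parse_target_url(self.path)
        byte_range = parse_byte_range(self.headers.get("Range"))
        if url is None or byte_range is None:
            self.send_error(400, "Expected ?url=... and Range: bytes=START-END")
            return
        start, end = byte_range
        try:
            data = fetch_decompressed(url, start, end)
        except NotImplementedError:
            self.send_error(501, "Decompression backend not wired in")
            return
        if not data:
            self.send_error(416, "Requested range not satisfiable")
            return
        # 206 Partial Content; total length is unknown here, hence '*'.
        self.send_response(206)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Range", f"bytes {start}-{start + len(data) - 1}/*")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)


def make_server(port=0):
    # Bind to loopback only, so the service is not reachable from other machines.
    return ThreadingHTTPServer(("127.0.0.1", port), GenozipProxyHandler)


if __name__ == "__main__":
    make_server(8765).serve_forever()  # port number is arbitrary
```

With this sketch, the remote file's URL has to be percent-encoded when placed in the `url` query parameter, since `parse_qs` decodes it on the way in. A genomic-range variant would need extra query parameters instead of the `Range` header; that is left out here.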