Clinical-Genomics / scout

VCF visualization interface
https://clinical-genomics.github.io/scout
BSD 3-Clause "New" or "Revised" License

Integrate GraphAlignmentViewer #2453

Closed bjhall closed 1 year ago

bjhall commented 3 years ago

We've had requests from users to integrate GraphAlignmentViewer (https://github.com/Illumina/GraphAlignmentViewer) in Scout for visualizing STRs in a better way than just IGV.

The problem is that GraphAlignmentViewer generates a static image for every STR locus in the catalog, and I can't think of a clean way to load these into Scout. Does anyone have a good idea? :)

Someone in Lund can of course work on implementing this once we agree on a design.

northwestwitch commented 3 years ago

Creating a static image for each locus doesn't look like a very good solution, unless these images are created on the fly. Perhaps there is some other software that lets you view STRs in an IGV-like fashion?

northwestwitch commented 3 years ago

Looks like GraphAlignmentViewer has the option to output PDF files. What about merging those files into one main report and making it available in Scout?

bjhall commented 3 years ago

Feels a bit inconvenient to scroll through a huge PDF whenever you want to look at an STR locus.

But if we can't think of a better solution I guess that will have to do.

northwestwitch commented 3 years ago

You could create a PDF with bookmarks and somehow link to these bookmarks from the STR variants page? Not ideal, but perhaps better than having to deal with a large number of PNGs. Perhaps there's a better way, I don't know 🤔

dnil commented 3 years ago

Yes, I have been eyeing REViewer, which is a slightly newer creation from the same group. It sort of suffers from the same problem though: if I recall correctly, it produces SVGs, which I suppose could be grouped prior to display, but I haven't really played with it. It would be very nice to have something like a JS graph alignment viewer instead (development suggestion for any masters students / beginning PhDs out there), and just feed it the BAMlet from ExpansionHunter or the like.

northwestwitch commented 3 years ago

REViewer looks nice, BAM files are already a better input than pileups!

dnil commented 3 years ago

One way would then be to loop over the loci of interest, then use something like https://github.com/Climenty/svg-join to create an svg bundle, and link into it with bookmarks case_STRs_bundle#bean1 or such. The problem now is that we have gene panels for the STRs, and a joint bundle could inadvertently show an expansion you absolutely did not want to see. 🙄
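A minimal sketch of the bundling idea, with the gene panel filter applied before the bundle is built so that off-panel loci never reach the page. It wraps inline SVGs in plain HTML rather than using svg-join, and the function name, the `locus_svgs` mapping and the panel set are all hypothetical, not existing Scout code.

```python
from html import escape


def build_str_bundle(locus_svgs, panel_loci):
    """Return one HTML page with an anchored section per on-panel locus.

    locus_svgs: dict mapping locus name -> SVG markup (hypothetical input).
    panel_loci: set of locus names the user is allowed to see.
    """
    sections = []
    for locus in sorted(locus_svgs):
        if locus not in panel_loci:
            # Skip off-panel loci so the bundle cannot reveal them.
            continue
        anchor = escape(locus, quote=True)
        sections.append(
            f'<section id="{anchor}"><h2>{anchor}</h2>{locus_svgs[locus]}</section>'
        )
    return "<html><body>" + "".join(sections) + "</body></html>"
```

A link such as `case_STRs_bundle.html#HTT` would then jump to that locus (the locus name here is only an example). The catch is that one bundle is needed per panel combination, or the bundle has to be built on request.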

dnil commented 3 years ago

It is pretty though; like this: [Screenshot 2021-03-18 at 13 36 32]

ViktorHy commented 3 years ago

yes, that's prettier than GraphAlignmentViewer at least

bjhall commented 3 years ago

SVGs are just text. Would it be such a bad idea to just load all the SVGs into a mongo collection and just show them on request?

If they're not too large I guess.
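Roughly what that could look like, as a sketch only: one document per case and locus, with the SVG text stored inline. `collection` is assumed to be a pymongo `Collection`; the field names and the "file stem is the locus name" convention are made up for illustration. MongoDB caps a single document at 16 MB, so this only works while each SVG stays well below that.

```python
from pathlib import Path


def load_svgs(collection, case_id, svg_dir):
    """Upsert one document per SVG file; the file stem is taken as the locus name."""
    count = 0
    for path in sorted(Path(svg_dir).glob("*.svg")):
        doc = {
            "case_id": case_id,
            "locus": path.stem,
            "svg": path.read_text(encoding="utf-8"),
        }
        # replace_one with upsert=True makes reloading a case idempotent.
        collection.replace_one(
            {"case_id": case_id, "locus": path.stem}, doc, upsert=True
        )
        count += 1
    return count


def fetch_svg(collection, case_id, locus):
    """Return the SVG text for one locus, or None if it was never loaded."""
    doc = collection.find_one({"case_id": case_id, "locus": locus})
    return doc["svg"] if doc else None
```

Since the lookup is per locus, the gene panel check can happen before `fetch_svg` is called, and nothing off-panel is sent to the browser.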

ViktorHy commented 3 years ago

what if you create a PDF and then just extract certain pages from the PDF, matching the gene panel?

pdftk full-pdf.pdf cat 12-15 output outfile_p12-15.pdf, like so? or is this a no-go?
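A small sketch of how the page selection could be driven from Python, assuming some locus-to-page lookup exists (it does not today; the page list here is hypothetical). The function only builds the `pdftk` argument list, using the same `cat ... output ...` form as the command above.

```python
def pdftk_extract_args(full_pdf, pages, out_pdf):
    """Build the argument list for `pdftk <in> cat <pages...> output <out>`.

    pages: iterable of 1-based page numbers (hypothetical: looked up per panel locus).
    """
    page_args = [str(p) for p in sorted(set(pages))]
    if not page_args:
        raise ValueError("no pages selected")
    return ["pdftk", str(full_pdf), "cat", *page_args, "output", str(out_pdf)]
```

The list can be handed to `subprocess.run(args, check=True)` on a machine where pdftk is installed.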

dnil commented 3 years ago

Everything is just binary anyway! 😸 Both are fine; easiest is probably just to use the mechanism we have for Chromograph pictures (those are PNGs; loading and serving multiple files with a common prefix), but it would not be bad to eventually replace those too with what @bjhall suggested. They scale as a constant times N_cases, so it's quite fine to load them into the db.
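For reference, the common-prefix approach boils down to something like the sketch below. The directory layout and the `str_` prefix are assumptions for illustration, not the actual Scout config keys or code.

```python
from pathlib import Path


def images_with_prefix(prefix, suffix=".png"):
    """Map the variable part of each file name to its path.

    prefix: path prefix shared by all images, e.g. "/data/case1/str_"
            (hypothetical layout; the last path component is the file-name prefix).
    suffix: non-empty file extension to match.
    """
    prefix_path = Path(prefix)
    images = {}
    for path in sorted(prefix_path.parent.glob(prefix_path.name + "*" + suffix)):
        # Strip the shared prefix and the extension to get the locus/chromosome key.
        key = path.name[len(prefix_path.name):-len(suffix)]
        images[key] = path
    return images
```

With files `str_HTT.png` and `str_FMR1.png` in the case directory, the keys would be `HTT` and `FMR1`, which is enough to serve one image per locus on request.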

dnil commented 3 years ago

> what if you create a PDF and then just extract certain pages from the PDF, matching the gene panel?

The thing is it would have to be run on request; we have tried to stay away from waiting-for-cgi-calls-to-run. It is not impossible, and could be queued somewhere to be fetched later, but I feel that is more work?

dnil commented 3 years ago

> what if you create a PDF and then just extract certain pages from the PDF, matching the gene panel?

Or do you know of like a js tool that would do that, ideally client side?

ViktorHy commented 3 years ago

> > what if you create a PDF and then just extract certain pages from the PDF, matching the gene panel?

> Or do you know of like a js tool that would do that, ideally client side?

Not by heart. bjhall made the same point about waiting-for-cgi-calls-to-run on Slack. I guess his solution would be nicer.

dnil commented 3 years ago

The SVGs appear to be on the order of hundreds of kB, so not huge.

dnil commented 3 years ago

We could use GridFS with pymongo I suppose. That would be rather convenient to speed things up and not rely on the regular fs mounts, but we would still have to tell Scout where to find them using e.g. the case configs. Unless we treat that as the start of a local "data lake" or such and tag the files appropriately. That sort of depends on your setup; for us, we have more storage on a regular external fs than on the db volumes, so we wouldn't go full "lake" on things like CRAMs in the near term. But some images are fine.

Or we just add one more set of images with the same mechanism as Chromograph for now. That is straightforward, and we can generalise all of them in a second step. Both solutions have merit.
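To make the GridFS variant concrete, here is a sketch under a few assumptions: `fs` is a `gridfs.GridFS` instance (for example `gridfs.GridFS(pymongo.MongoClient()["scout"])`, where the database name is a guess), and the tag names `case_id`, `locus` and `kind` are invented for illustration. Extra keyword arguments to `put()` are stored as fields on the file document, which is what makes the "tag the files" idea work.

```python
def store_image(fs, data, case_id, locus, kind="reviewer_svg"):
    """Store one image in GridFS, tagged so it can be found without a file path.

    fs: a gridfs.GridFS instance (assumed; any object with put/find_one works here).
    data: the image content as bytes.
    """
    return fs.put(
        data,
        filename=f"{case_id}_{locus}.svg",
        case_id=case_id,
        locus=locus,
        kind=kind,
    )


def load_image(fs, case_id, locus, kind="reviewer_svg"):
    """Return the stored bytes, or None if no file carries these tags."""
    grid_out = fs.find_one({"case_id": case_id, "locus": locus, "kind": kind})
    return grid_out.read() if grid_out is not None else None
```

With tags like these, Scout would not need a path in the case config for such files; the same `kind` field could later separate Chromograph PNGs from STR images if the mechanisms are generalised.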

northwestwitch commented 3 years ago

What about this one: https://mgymrek.github.io/pybamview/index.html. Looks like it has not been touched for many years, but looks easy enough to integrate / emulate

dnil commented 3 years ago

Yes, something like that! Hm, but does it really have graph support? It looks more like a pileup style thing?

northwestwitch commented 3 years ago

Look, @moonso is a contributor on that repo!

dnil commented 3 years ago

Time flies! We'll have a summer student look at this, starting early June.

dnil commented 1 year ago

We now have integration of REViewer via the Scout-Reviewer-Service extension and microservice. Let's close this for now, and reopen when we have a case for a full graph aligner again.