broadinstitute / seqr

web-based analysis tool for rare disease genomics
GNU Affero General Public License v3.0

IGV gCNV viewer performance improvement #2188

Open ShifaSZ opened 2 years ago

ShifaSZ commented 2 years ago

Both TGG-Viewer and seqr integrate the JavaScript version of IGV, also known as IGV.js.

IGV.js downloads gCNV data very frequently and responds very slowly. Every time a new gCNV track is opened, or the gCNV tracks are zoomed in/out or resized, IGV needs to download all data rows for the current chromosome again.

For seqr, IGV.js is even slower because a different version of the JavaScript core is used.

See the more detailed analysis in this doc.

As analyzed in the doc, we could probably take the following measures to improve performance:

1) We should figure out how to improve the performance of the "parseFloat" function by upgrading the core-js library.

2) We may split the gCNV datasets into more clusters so that each cluster has fewer samples.

3) We may suggest IGV improvements for gCNV. One is to load the data once and reuse it when zooming in/out or changing location within the same chromosome. A further optimization is to load only the gCNV data within or close to the visible window, rather than all data for the chromosome.
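
As a rough illustration of the first idea in 3), the sketch below caches the rows for a chromosome after the first download and serves later zoom/pan requests from memory. It is not igv.js code: the class name ChromosomeFeatureCache and the fetchChromosome loader are hypothetical, and features are assumed to be plain objects with numeric start and end fields.

```javascript
// Sketch only: ChromosomeFeatureCache and fetchChromosome are hypothetical
// names used for illustration; they are not part of igv.js.
class ChromosomeFeatureCache {
  constructor(fetchChromosome) {
    // fetchChromosome(chr) is assumed to return a Promise of an array of
    // features shaped like { start, end, ... }.
    this.fetchChromosome = fetchChromosome;
    this.cache = new Map(); // chr -> Promise of the feature array
  }

  async getFeatures(chr, start, end) {
    if (!this.cache.has(chr)) {
      // Cache the promise itself so concurrent requests share one download.
      const pending = this.fetchChromosome(chr);
      this.cache.set(chr, pending);
      // Forget a failed download so a later request can retry it.
      pending.catch(() => this.cache.delete(chr));
    }
    const features = await this.cache.get(chr);
    // Return only the features that overlap the requested window.
    return features.filter((f) => f.end > start && f.start < end);
  }
}
```

With a cache like this, zooming or panning within the same chromosome triggers no further download; only switching to a chromosome that has not been seen yet does.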

ShifaSZ commented 2 years ago

I would like to add another issue I found earlier before I forget. The gCNV viewer in seqr always loads the data twice. This can be observed by checking the network activity in the browser devtools. It can be solved by setting the doResize parameter to false when calling the loadTrack function, like:

this.browser.loadTrack(track, false)

Also avoid passing the tracks parameter when calling igv.createBrowser. Instead, call loadTrack with false for doResize in the then clause.
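
The two steps above could look roughly like the sketch below. It assumes, as described in this comment, that loadTrack accepts a second doResize argument in the igv.js version seqr uses; the helper name createBrowserThenLoadTracks is made up for illustration, and igv is passed in as a parameter so the helper does not depend on how the library is imported.

```javascript
// Sketch only: createBrowserThenLoadTracks is a hypothetical helper.
// The doResize argument to loadTrack follows the description in this comment
// and may not exist in every igv.js version.
async function createBrowserThenLoadTracks(igv, container, options, tracks) {
  // Strip any `tracks` entry so createBrowser does not load tracks itself.
  const { tracks: _ignored, ...browserOptions } = options;
  const browser = await igv.createBrowser(container, browserOptions);

  for (const track of tracks) {
    // Pass false for doResize to avoid the second data load.
    await browser.loadTrack(track, false);
  }
  return browser;
}
```

Whether the data is still requested twice can be checked the same way as before, by watching the network tab while a gCNV track loads.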

ShifaSZ commented 2 years ago

Here is a quick patch for igv.js to fix the poor performance of the gCNV viewer. Replace the first statement in the getFeatures method of the GCNVTrack class with:

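        // Request only the visible window plus 1000 bp of padding on each side,
        // instead of every row on the chromosome.
        const chrFeatures = await this.featureSource.getFeatures({
          chr,
          start: Math.max(start - 1000, 0), // was: 0
          end: end + 1000, // was: Number.MAX_VALUE
          visibilityWindow: end - start,
        });

hanars commented 2 years ago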

Can you PR up the changes you describe in your comment for doResize and loadTrack? This seems like a real quick win for performance, and while we don't have time for the more intensive solutions in this ticket, that should be quick enough that it's worth doing now.

For the "patch" you propose to IGV, I strongly disapprove of using patched dependencies. At some point we can propose a PR to the IGV library itself, but in the meantime we will continue to use it as is.