plotly / dash-bio

Open-source bioinformatics components for Dash
https://dash-gallery.plotly.host/Portal/?search=Bioinformatics
MIT License

ManhattanPlot: Incorrect tooltip information for all variants below threshold #760

Closed shelpuk closed 1 year ago

shelpuk commented 1 year ago

ManhattanPlot displays incorrect SNP, GENE and annotation for all variants below the threshold.

For example, here is the visualization from the official manual, using the suggested CSV input data. Take a look at this variant on chromosome 8:

image

As you can see, SNP shows dbSNP id: rs12092772. Now, in the CSV input data, this dbSNP id corresponds to a completely different variant on chromosome 1:

image

There is no other occurrence of this dbSNP id anywhere in the input CSV file.

The SNP, GENE and annotation information for every variant below the threshold somehow comes from chromosome 1 only. Yet, the p-values are correct.

This bug is absent for the variants above the thresholds.

To Reproduce: Open the official ManhattanPlot documentation and pick any variant below the threshold. Then search the input CSV file for the dbSNP id shown in that variant's tooltip.

Expected behavior: The tooltip should display additional information from the same dataframe row as the p-value.

Python version: 3.9.13

Python environment (all installed packages in your current environment): I do not believe this is relevant since the issue is present even in the official documentation.

shelpuk commented 1 year ago

Here is the root cause of this bug.

_manhattan.py, line 562:

            for i in data[self.index].unique():
                tmp = data[data[self.index] == i]
                chromo = tmp[self.chrName].unique()  # Get chromosome name
                hover_text = _get_hover_text(
                    data,
                    snpname=self.snpName,
                    genename=self.geneName,
                    annotationname=self.annotationName
                )

_get_hover_text operates on a single row only, but the entire dataframe is passed. It seems that tmp, not data, should be passed to the function. With this change, the hover text works properly.
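
To illustrate why passing data instead of tmp would give exactly this symptom, here is a minimal sketch in plain Python. It is not the dash-bio code: get_hover_text, hover_per_chromosome and the toy rows are hypothetical stand-ins, and the sketch assumes that each per-chromosome trace matches hover labels to its points by position, so only the first len(tmp) labels are ever used.

```python
# Toy stand-in for the input dataframe: two variants on chromosome 1,
# two on chromosome 8 (made-up ids, not taken from the real CSV).
data = [
    {"CHR": 1, "SNP": "rs1", "P": 0.5},
    {"CHR": 1, "SNP": "rs2", "P": 0.4},
    {"CHR": 8, "SNP": "rs3", "P": 0.3},
    {"CHR": 8, "SNP": "rs4", "P": 0.2},
]


def get_hover_text(rows):
    # Hypothetical simplification of _get_hover_text:
    # one label per row of whatever it is given.
    return ["SNP: " + row["SNP"] for row in rows]


def hover_per_chromosome(rows, use_subset):
    labels = {}
    for chrom in sorted({row["CHR"] for row in rows}):
        tmp = [row for row in rows if row["CHR"] == chrom]
        source = tmp if use_subset else rows  # the fix vs. the bug
        text = get_hover_text(source)
        # Assumption: the trace has len(tmp) points and labels are
        # matched by position, so only the first len(tmp) are shown.
        labels[chrom] = text[:len(tmp)]
    return labels


buggy = hover_per_chromosome(data, use_subset=False)
fixed = hover_per_chromosome(data, use_subset=True)

print(buggy[8])  # labels taken from the chromosome 1 rows
print(fixed[8])  # labels taken from the chromosome 8 rows
```

In the buggy variant the labels for chromosome 8 are taken from the start of the full table, that is, from chromosome 1 rows, while the p-values still come from tmp. This matches the reported behaviour.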