Closed shelpuk closed 1 year ago
Here is the root cause of this bug.
_manhattan.py, line 562:
for i in data[self.index].unique():
    tmp = data[data[self.index] == i]
    chromo = tmp[self.chrName].unique()  # Get chromosome name
    hover_text = _get_hover_text(
        data,
        snpname=self.snpName,
        genename=self.geneName,
        annotationname=self.annotationName
    )
_get_hover_text operates on a single row only, but the entire dataframe is passed. It seems that tmp, not data, should be passed to the function. With this change, the hover text works properly.
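To illustrate the mismatch, here is a minimal sketch using toy data. The helper hover_text is a hypothetical stand-in for _get_hover_text, and the column names and SNP ids are made up for the example. It also assumes that the plotted trace matches labels to points by position, so a trace with len(tmp) points would only use the first len(tmp) labels.

```python
import pandas as pd

# Toy data; column names and values are illustrative assumptions.
data = pd.DataFrame({
    "CHR": [1, 1, 8, 8],
    "SNP": ["rs_a", "rs_b", "rs_c", "rs_d"],
    "P": [0.5, 0.2, 0.3, 0.9],
})


def hover_text(df, snpname):
    """Hypothetical stand-in for _get_hover_text: one label per row of df."""
    return ("SNP: " + df[snpname].astype(str)).tolist()


buggy, fixed = {}, {}
for chrom in data["CHR"].unique().tolist():
    tmp = data[data["CHR"] == chrom]
    # Bug pattern: labels are built from the full frame, so a trace with
    # len(tmp) points would pick up the first len(tmp) labels (chromosome 1).
    buggy[chrom] = hover_text(data, "SNP")[: len(tmp)]
    # Fix pattern: labels are built from the same subset that is plotted.
    fixed[chrom] = hover_text(tmp, "SNP")

print(buggy[8])
print(fixed[8])
```

Under these assumptions, the labels for chromosome 8 in the buggy variant come from the chromosome 1 rows, while the fixed variant uses the chromosome 8 rows.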
ManhattanPlot displays incorrect SNP, GENE and annotation for all variants below the threshold.
For example, here is the visualization from the official manual using the suggested CSV input data. Take a look at this variant on chromosome 8:
As you can see, SNP shows dbSNP id: rs12092772. Now, in the CSV input data, this dbSNP id corresponds to a completely different variant on chromosome 1:
There is no other occurrence of this dbSNP id anywhere in the input CSV file.
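A check like the one described above can be scripted. This is a minimal sketch on an in-memory toy CSV; the column names, the placeholder ids, and the helper find_snp are assumptions for illustration, not taken from the actual input file.

```python
import io

import pandas as pd

# Toy CSV standing in for the input data; columns and ids are assumptions.
csv_text = """CHR,BP,P,SNP
1,100,0.5,rs_example_1
8,200,0.3,rs_example_2
"""
df = pd.read_csv(io.StringIO(csv_text))


def find_snp(frame, snp_id, snp_col="SNP"):
    """Return every row whose SNP column equals snp_id (hypothetical helper)."""
    return frame[frame[snp_col] == snp_id]


matches = find_snp(df, "rs_example_1")
chromosomes = matches["CHR"].tolist()
print(len(matches), chromosomes)
```

If the id shown in a tooltip appears exactly once and on a different chromosome than the hovered point, the tooltip text does not belong to that point.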
All SNP, GENE and annotation information from all variants below the threshold somehow comes from chromosome 1 only. Yet, the p-values are correct.
This bug is absent for the variants above the thresholds.
To Reproduce: Open the official ManhattanPlot documentation, pick any variant below the threshold, and search for the dbSNP id shown in its tooltip in the input CSV file.
Expected behavior: The tooltip should display additional information from the same dataframe row as the p-value.
Python version: 3.9.13
Python environment (all installed packages in your current environment): I do not believe this is relevant since the issue is present even in the official documentation.