jmschrei / tangermeme

Biological sequence analysis for the modern age.
MIT License

Error in plot_logo - LogomakerError: type(floor) = <class 'numpy.float32'> must be a number #17

Open Al-Murphy opened 3 weeks ago

Al-Murphy commented 3 weeks ago

Running the code in the tutorial:

import torch
from tangermeme.utils import random_one_hot
from matplotlib import pyplot as plt
import seaborn; seaborn.set_style('whitegrid')
from tangermeme.plot import plot_logo

X = random_one_hot((1, 4, 2000), random_state=0).type(torch.float32)

plt.figure(figsize=(10, 2))
ax = plt.subplot(111)
plot_logo(X[0, :, 950:1050], ax=ax)

plt.xlabel("Genomic Position")
plt.ylabel("Value")

plt.tight_layout()
plt.show()

Results in:

LogomakerError: type(floor) = <class 'numpy.float32'> must be a number

Not sure if this is specific to my setup (tangermeme v0.2.3) or if it wasn't caught because I couldn't see any tests for plot_logo. However, I managed to resolve the issue by switching from .type(torch.float32) to .type(torch.float64), as follows:

X = random_one_hot((1, 4, 2000), random_state=0).type(torch.float64)

The error traces back to logomaker, which checks that a variable floor satisfies isinstance(floor, (float, int)). floor is calculated per index position p of the dataframe as:

vsep = 0.0
# get values at this position
vs = np.array(df_.loc[p, :])
# Set floor
floor = sum((vs - vsep) * (vs < 0)) + vsep/2.0

So I'm not totally sure why it fails with float32 but not float64?
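One possible explanation, offered as a sketch rather than a confirmed diagnosis: np.float64 subclasses Python's built-in float, while np.float32 does not, so a check of the form isinstance(floor, (float, int)) accepts the former and rejects the latter. The passes_number_check helper below is hypothetical and only mirrors the check described above; it is not logomaker's own code.

```python
import numpy as np


def passes_number_check(value):
    """Hypothetical stand-in for the isinstance check described above."""
    return isinstance(value, (float, int))


# np.float64 inherits from Python's float, so it passes.
print(passes_number_check(np.float64(0.0)))  # True

# np.float32 does not inherit from float, so it is rejected.
print(passes_number_check(np.float32(0.0)))  # False

# Converting to a built-in float makes any NumPy scalar pass.
print(passes_number_check(float(np.float32(0.0))))  # True
```

If this is what is happening, it would explain why switching the input to torch.float64 avoids the error.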

Alan.

jmschrei commented 3 weeks ago

That's weird. Your code works on my side, and I don't think I've changed anything about that code recently. What versions of logomaker and PyTorch are you using?

Al-Murphy commented 3 weeks ago

That is weird. I'm using logomaker v0.8, torch v2.4.0 and, since it seems relevant, numpy v2.0.1.
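The NumPy version may well be the relevant difference. As far as I can tell, NumPy 2 changed scalar promotion (NEP 50) so that combining a np.float32 scalar with a plain Python number keeps float32, whereas NumPy 1.x promoted the result to float64 by default. The sketch below applies the floor expression quoted earlier to a made-up float32 column (the values in vs are illustrative assumptions, not data from the tutorial) and prints the resulting type, which should depend on the installed NumPy.

```python
import numpy as np

vsep = 0.0
# Illustrative float32 values standing in for one position of the dataframe.
vs = np.array([0.0, -1.0, 0.5, 0.0], dtype=np.float32)

# Same expression as in the floor snippet quoted earlier in this thread.
floor = sum((vs - vsep) * (vs < 0)) + vsep / 2.0

# The reported type depends on the installed NumPy version.
print(np.__version__, type(floor))
print(isinstance(floor, (float, int)))

# With float64 input the result is np.float64, which subclasses float.
vs64 = vs.astype(np.float64)
floor64 = sum((vs64 - vsep) * (vs64 < 0)) + vsep / 2.0
print(isinstance(floor64, (float, int)))  # True
```

If the first isinstance line prints False on NumPy 2.x and True on NumPy 1.x, that would account for the code working on one machine and failing on the other. Until that is confirmed, casting the input to float64 before calling plot_logo remains the workaround.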