Closed (laserson closed this 2 years ago)
I haven't tried Numba (yet). I tried messing around with Cython (and failed, iirc), but I'll check out Numba; it would be interesting. I'd be all for trying to improve its performance. I remember it starting to chug once I got up to sequences in the hundreds of bp.
I love how your code makes heavy use of the latest in typing annotations, but strangely, I think this may be something that the Numba compiler is choking on (since that stuff is not yet implemented there). I would guess that if the parameters were switched to more "transparent" objects, it would probably compile very readily and give a huge speedup.
There are a few other things that probably need to get nixed for Numba, like replacing generators with list comprehensions.
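For what it's worth, here is a minimal sketch of the kind of rewrite I mean. Everything in it is hypothetical: the `min_energy_*` functions and the flat `energies` array are made up for illustration and are not taken from seqfold. As far as I know, Numba's nopython mode has generally not handled generator expressions, so the second version uses an explicit loop over a plain NumPy array. The fallback decorator is only there so the snippet still runs if Numba isn't installed.

```python
import numpy as np

try:
    from numba import njit  # optional: only needed for the compiled path
except ImportError:
    def njit(func):
        # Fallback so the sketch still runs (uncompiled) without Numba.
        return func


def min_energy_generator(energies, i, j):
    # Pure-Python style: a generator expression fed to min().
    return min(energies[k] for k in range(i, j))


@njit
def min_energy_loop(energies, i, j):
    # Numba-friendly style: plain array in, explicit loop, scalar out.
    best = energies[i]
    for k in range(i + 1, j):
        if energies[k] < best:
            best = energies[k]
    return best


energies = np.array([1.5, -2.0, 0.3, -0.7])  # made-up values
print(min_energy_generator(energies, 0, 4), min_energy_loop(energies, 0, 4))
```

Both versions should agree on the result; the point is only that the second one takes "transparent" inputs that a compiler can type.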
I tried with Numba and felt like I got further toward something working than I had with Cython, but no dice.
I did however try out pypy3 and noticed that it's significantly faster for larger sequences: ~2 sec vs ~15 sec for a 200 bp seq. I added some notes on this to the README: https://github.com/Lattice-Automation/seqfold
If you're not already using it, that's probably the way to go for a quick performance gain.
I'm still interested in Numba/Cython for what it's worth, but given how both handle lists, I'd have to re-write much of the code (probably w/ NumPy). I know now that I should've gone with NumPy from the start, but at this point it feels like too large a re-write.
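If anyone wants to reproduce the comparison on their own machine, a rough timing harness along these lines should work under both `python3` and `pypy3`. It's only a sketch: `time_call` and `toy_workload` are hypothetical helpers, and the toy workload is a stand-in with a quadratic loop, not the real folding code.

```python
import platform
import time


def time_call(func, *args, repeats=3):
    # Return the best wall-clock time (seconds) over a few repeats.
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best


def toy_workload(seq):
    # Hypothetical stand-in: counts mismatched position pairs, O(n^2).
    total = 0
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            if seq[i] != seq[j]:
                total += 1
    return total


seq = "ACGU" * 50  # 200 nt of made-up sequence
elapsed = time_call(toy_workload, seq)
print(f"{platform.python_implementation()}: {elapsed:.4f} s")
```

To time the real thing, swap `toy_workload` for `dg` imported from seqfold (assuming it takes the sequence string as its first argument, as the README shows) and run the same script once per interpreter.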
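To give a sense of what the NumPy version of a table-filling loop would look like, here is a small sketch. To be clear, this is not seqfold's energy model: it's a toy Nussinov-style recurrence that just maximizes the number of A-U / C-G pairs, with no minimum loop size. `BASE_CODES`, `encode` and `max_pairs` are hypothetical names. The idea it illustrates is keeping the table in a 2-D integer array and passing the sequence as an integer array, rather than using nested lists and strings.

```python
import numpy as np

try:
    from numba import njit  # optional: only needed for the compiled path
except ImportError:
    def njit(func):
        # Fallback so the sketch still runs (uncompiled) without Numba.
        return func

# Hypothetical encoding chosen so that only A+U and C+G sum to 3.
BASE_CODES = {"A": 0, "C": 1, "G": 2, "U": 3}


def encode(seq):
    # Turn the string into an int64 array outside the compiled function.
    return np.array([BASE_CODES[b] for b in seq], dtype=np.int64)


@njit
def max_pairs(codes):
    n = codes.shape[0]
    if n < 2:
        return 0
    # table[i, j] = max number of pairs within codes[i..j]; zero elsewhere.
    table = np.zeros((n, n), dtype=np.int64)
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = table[i + 1, j]  # case: position i stays unpaired
            for k in range(i + 1, j + 1):
                if codes[i] + codes[k] == 3:  # i pairs with k
                    outer = 0
                    if k < j:
                        outer = table[k + 1, j]
                    candidate = 1 + table[i + 1, k - 1] + outer
                    if candidate > best:
                        best = candidate
            table[i, j] = best
    return table[0, n - 1]


print(max_pairs(encode("GGGAAACCC")))
```

The real code tracks far more state per cell, which is why the port feels large, but the array-in, array-table shape is what both Numba and Cython tend to want.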
I tried to Numba jit the `dg` function as a longshot and it appears that it failed. Have y'all tried Numba compiling any parts of this to increase the performance?