Closed by jolespin 6 months ago
Hi @jolespin, thanks for the code to reproduce, I'm getting the same issue on my local system. What's strange is that the pickling of the cutoffs is tested (I'm using PF02826, and there it works), but indeed for PF09847 the cutoffs get messed up. I'll have to figure out what's going on :sweat:
The culprits:
In the last memcpy, there is a typo: the model compositions are actually copied into the cutoffs, overwriting the actual cutoffs. This probably wasn't an issue with PF02826 because it doesn't have a model composition set.
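To make the failure mode above concrete, here is a toy sketch in Python using `ctypes.memmove`. It is not the pyhmmer source: `ToyHMM`, `restore`, the `buggy` flag, and the array size are all hypothetical and only mimic the shape of the bug described, where the composition is copied to the wrong destination.

```python
import ctypes

N = 6  # hypothetical array length, chosen only for this illustration


class ToyHMM:
    """Hypothetical stand-in for a struct holding cutoffs and a composition."""

    def __init__(self):
        self.cutoffs = (ctypes.c_float * N)()
        self.compo = (ctypes.c_float * N)()


def restore(state, buggy):
    """Rebuild a ToyHMM from a state dict, optionally reproducing the typo."""
    hmm = ToyHMM()
    src = (ctypes.c_float * N)(*state["cutoffs"])
    ctypes.memmove(hmm.cutoffs, src, ctypes.sizeof(src))
    # The composition is optional; models without one skip this copy entirely,
    # which is why they are unaffected by the wrong destination.
    if state["compo"] is not None:
        src = (ctypes.c_float * N)(*state["compo"])
        dst = hmm.cutoffs if buggy else hmm.compo  # the typo: wrong destination
        ctypes.memmove(dst, src, ctypes.sizeof(src))
    return hmm


with_compo = {"cutoffs": [25.0, 25.0, 26.0, 26.0, 24.0, 24.0], "compo": [0.5] * N}
without_compo = {"cutoffs": [25.0, 25.0, 26.0, 26.0, 24.0, 24.0], "compo": None}
```

In this sketch, restoring `with_compo` with `buggy=True` leaves the composition values in `cutoffs`, while `without_compo` round-trips correctly either way, which matches the observation that only models with a composition set were affected.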
Fixed in v0.10.12.
Excellent! Thank you for handling this so quickly. I was sitting with it for a day or two trying to figure out if I was making an obvious mistake. Glad the deep dive wasn't for nothing and could help with your development.
Took a look at your PR. Why does the switch from t2pks to RREFam fix the pickling issue (IIRC the .h3m is one of the indexed HMM files)?
"with pyhmmer.plan7.HMMFile(\"data/hmms/bin/t2pks.h3m\") as hmms:\n",
"with pyhmmer.plan7.HMMFile(\"data/hmms/bin/RREFam.h3m\") as hmms:\n",
Ah no, the switch is for something different. I just wanted to reduce the size of the test data (mostly because I'm starting to hit the PyPI.org project size limit, given that the wheels are getting big and I have 30+ wheels per release), so I used smaller HMMs.
The fix is just in 7b1ffb8 :smile:
It took me a while to figure out what was happening and how to put together a minimal reproducible example, but I think I finally have one.
For some reason, when I'm pickling my HMMs the cutoffs change and I'm not sure exactly why or how this could happen.
Can you give it a try on your system?
I'm just using Pfam-A.hmm.gz from https://www.ebi.ac.uk/interpro/download/Pfam/

Let me know if I can provide any more context.
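For anyone wanting to check for this kind of problem themselves, below is a minimal, library-agnostic sketch of a pickle round-trip comparison. It is not the reproduction code from this issue; `pickle_roundtrip_diff` and the `fields` mapping are hypothetical names introduced here for illustration.

```python
import pickle


def pickle_roundtrip_diff(obj, fields):
    """Return {name: (before, after)} for every field that changes after pickling.

    `fields` maps a label to a callable extracting a comparable value from `obj`.
    """
    clone = pickle.loads(pickle.dumps(obj))
    diff = {}
    for name, getter in fields.items():
        before, after = getter(obj), getter(clone)
        if before != after:
            diff[name] = (before, after)
    return diff


# A plain dict round-trips unchanged, so no differences are reported.
record = {"gathering": (25.0, 25.0), "trusted": (26.0, 26.0)}
report = pickle_roundtrip_diff(record, {"gathering": lambda r: r["gathering"]})
```

With pyhmmer, one would presumably pass an HMM read through `pyhmmer.plan7.HMMFile` and getters for the cutoff values (for example something like `lambda h: h.cutoffs.gathering`, which is an assumption about the attribute names); an empty result would mean the cutoffs survived pickling.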