Closed jvhagey closed 5 years ago
Hi @jvhagey,
Apologies for the late reply. I just played with this a bit to turn that error into a warning for users who know what they're doing, but I realized that doing so now causes another problem, probably associated with the new release of HMMER. I'm using v3.2, for your reference.
When I have multiple HMM entries with the same NAME but different ACC properties, hmmpress gives me the following error:
Working... SSI index construction failed:
primary keys not unique: 'GENE_NAME' occurs more than once
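To see which names would trigger this, a minimal sketch along these lines could help. It assumes a plain-text HMMER3 file in which each model header carries a line whose first token is NAME; the function name `duplicate_hmm_names` is hypothetical and is not part of anvi'o or HMMER.

```python
from collections import Counter


def duplicate_hmm_names(lines):
    """Return {name: count} for NAME values that occur more than once.

    `lines` is any iterable of text lines from an HMMER3 profile file.
    Assumption: the model name is on a header line whose first
    whitespace-separated token is exactly "NAME".
    """
    counts = Counter()
    for line in lines:
        fields = line.split(maxsplit=1)
        if len(fields) == 2 and fields[0] == "NAME":
            counts[fields[1].strip()] += 1
    # Keep only the names that are repeated.
    return {name: n for name, n in counts.items() if n > 1}
```

Passing an open file handle for the HMM file (for example `duplicate_hmm_names(open("genes.hmm"))`, where the file name is only a placeholder) would list the offending names before hmmpress is run.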
Can you try this and tell me whether the same thing happens when you run hmmpress on your HMMs file? If you are not getting the same error, can you please share the HMMER version you're using?
Thanks,
I used HMMER v3.1b2 and didn't get that error when I ran hmmpress; it's good to know for the future, though. Maybe, since this shouldn't happen in the future given the newest HMMER version, there isn't a big reason to change this in anvi'o. To get around this for my purposes, I just wrote a short script that rewrites the HMM file with numbered names for duplicate entries.
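The script itself was not posted in the thread. A minimal sketch of the same idea, assuming a plain-text HMMER3 file with one NAME line per model header, might look like the following. The function name `number_duplicate_names` and the `_1`, `_2` suffix scheme are illustrative assumptions, not the original script.

```python
from collections import Counter


def _name_of(line):
    """Return the model name if `line` is a NAME header line, else None."""
    fields = line.split(maxsplit=1)
    if len(fields) == 2 and fields[0] == "NAME":
        return fields[1].strip()
    return None


def number_duplicate_names(lines):
    """Return the lines with repeated NAME values suffixed _1, _2, ...

    Names that occur only once are left untouched. Assumption: the
    suffixed names do not collide with names already in the file.
    """
    lines = list(lines)
    # First pass: count how often each name occurs.
    totals = Counter(n for n in map(_name_of, lines) if n is not None)

    # Second pass: rewrite only the NAME lines of repeated names.
    seen = Counter()
    out = []
    for line in lines:
        name = _name_of(line)
        if name is not None and totals[name] > 1:
            seen[name] += 1
            out.append(f"NAME  {name}_{seen[name]}\n")
        else:
            out.append(line)
    return out
```

The gene names in the accompanying genes.txt would presumably need the same renaming so that the two files stay consistent.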
Thank you, Jill. I think your solution is the most reasonable approach given the rare need for this. Alternatively, anvi'o could give a warning instead of an error and fail gracefully if hmmpress returns an error, depending on the user's HMMER version, but I think that would be overkill :)
I am closing this issue for now and will update the current error message so it is clear to people why we are not allowing that.
Thanks for your patience.
Best wishes,
Hi anvi'o team, I am running anvi-run-hmms on a custom set of HMMs from the FOAM database (https://academic.oup.com/nar/article/42/19/e145/2902479). Currently, anvi'o (v5.5, installed via conda in its own environment) gives a config error that correctly points out that there are entry names that appear more than once in the genes.txt file.
The hmm file from the FOAM data has duplicate names like this:
I had run this same custom HMM set with anvi'o 5.3 without this error, so it seems like a new guard rail. Since I did mean to do this, can we put in a --just-do-it argument, or is that going to screw things up horribly?