Closed wwood closed 9 years ago
Thanks Ben,
Thanks for using calculate_ani.py
, and I'm glad to hear that you (usually) find it useful.
The reason why the program just continues is because all the command line jobs are scheduled using an asynchronous pool in multiprocessing
. This makes it harder to get at subprocess exceptions than usual for two main reasons: the processes are not necessarily called in order, and there's extra work in rerouting exceptions through the subprocessing - they're not readily available through the multiprocessing
interface. It's not impossible, but I didn't implement it because it was as easy (for me) to inspect the output to identify which job failed, and rerun the MUMmer code to determine the error directly. I'll look into it further, though.
Why not create a third file with perc_id * perc_aln
? Mainly because I didn't see the value - it's not clear what the output means. If there's an entry of 0.75
in a cell of that table, what does it mean? 100% identity, and 75% coverage, or the reverse? 80% coverage, and 94% identity, or the reverse? and so on. I don't see that it's a very meaningful thing to calculate, and I don't want to encourage calculation of values that I don't see as useful ;)
Best wishes,
L.
Closed as thread went silent.
Thanks for the response tho.
Hi,
I sometimes use your script calculate_ani (thanks..) for comparing genomes that may or may not have any sequence overlap, and I got this error:
After a bunch of time spent digging the problem boiled down to having a genome where the names in the fasta file were too long, and this is known to cause problems - http://sourceforge.net/p/mummer/mailman/mummer-help/thread/4C9CDBBB.1060705@bcgsc.ca/
Would it be possible to provide information on which mummer run failed (why not just raise an error when exitstatus != 0 and give the cmdline?), rather than letting the program continue, it it fails later on anyway?
While I'm here, a third file output which is simply perc_id * perc_aln would be handy - of course this is easy to create myself in a spreadsheet after the fact, though, so I'm not requesting this very ardently...
Thanks, ben