merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

[BUG] 'anvi-profile' was killed but profile.db is still written #1666

Open bentpetersendk opened 3 years ago

bentpetersendk commented 3 years ago

I recently found this issue 'https://github.com/merenlab/anvio/issues/632' which is still valid, but for other reasons.

For some reason 4 of my anvi-profile runs were killed: "* Anvi'o profiler received SIGINT, terminating all processes..."

The problem is just that the rest of the process continued, and I ended up with a logfile saying "Happy", which it should not have ;-)

All the profile.db files were created, and all the following steps ran. But when I got to anvi-merge, I got this error:

Traceback (most recent call last):
  File "/home/bent/.conda/envs/anvio-7/bin/anvi-merge", line 51, in <module>
    merger.MultipleRuns(args).merge()
  File "/home/bent/.conda/envs/anvio-7/lib/python3.6/site-packages/anvio/merger.py", line 566, in merge
    self.gen_view_data_tables_from_atomic_data()
  File "/home/bent/.conda/envs/anvio-7/lib/python3.6/site-packages/anvio/merger.py", line 670, in gen_view_data_tables_from_atomic_data
    self.normalized_coverages[target][split_name][input_profile_db_path] = self.get_normalized_coverage_of_split(target, input_profile_db_path, split_name)
  File "/home/bent/.conda/envs/anvio-7/lib/python3.6/site-packages/anvio/merger.py", line 581, in get_normalized_coverage_of_split
    return self.atomic_data_for_each_run[target][input_profile_db_path][split_name]['mean_coverage_Q2Q3'] * self.normalization_multiplier[input_profile_db_path]
KeyError: 'ww_000000015438_split_00001'

I suspect that the reason for the error was the 4 profile runs which were killed, but still created a profile file for each.

Just to let you know that anvi'o still needs a mechanism that detects broken profiles.
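The KeyError above is consistent with one profile lacking a split that the others contain. As a rough illustration of what a pre-merge check could look like, here is a minimal sketch in plain Python. It assumes the per-profile data is available as a mapping from profile path to a dict keyed by split name, mirroring the `atomic_data_for_each_run[target][input_profile_db_path][split_name]` lookup in the traceback. The function `find_missing_splits`, the paths, and the split names are hypothetical and not part of anvi'o:

```python
def find_missing_splits(atomic_data_per_profile):
    """Return {profile_path: sorted list of split names absent from that profile}.

    The expected set of splits is assumed to be the union over all profiles.
    """
    all_splits = set()
    for splits in atomic_data_per_profile.values():
        all_splits.update(splits)

    missing = {}
    for path, splits in atomic_data_per_profile.items():
        absent = all_splits - set(splits)
        if absent:
            missing[path] = sorted(absent)
    return missing


# Toy data (hypothetical): the second profile was interrupted and lacks one split.
profiles = {
    "sample_01/PROFILE.db": {
        "c1_split_00001": {"mean_coverage_Q2Q3": 12.5},
        "c2_split_00001": {"mean_coverage_Q2Q3": 3.0},
    },
    "sample_02/PROFILE.db": {
        "c1_split_00001": {"mean_coverage_Q2Q3": 8.1},
    },
}

for path, absent in find_missing_splits(profiles).items():
    print(f"{path}: missing {len(absent)} split(s), e.g. {absent[0]}")
```

A check like this would turn the KeyError into a message naming the profile that is incomplete, so it is clear which run has to be repeated.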

I am using the newest version of anvi'o:

anvi-merge --version
Anvi'o .......................................: hope (v7)

Profile database .............................: 35
Contigs database .............................: 20
Pan database .................................: 14
Genome data storage ..........................: 7
Auxiliary data storage .......................: 2
Structure database ...........................: 2
Metabolic modules database ...................: 2
tRNA-seq database ............................: 1

meren commented 3 years ago

Thank you, @bentpetersendk.

I guess the best way to do it will be a db integrity check at the end of anvi-profile, right before 'Happy' :)

Best,
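One way such an end-of-run integrity check might look is sketched below with Python's standard `sqlite3` module. This is only a sketch under stated assumptions, not anvi'o's implementation: the function `check_profile_db`, the table `profiled_splits`, its column `split_name`, and the `interrupted` flag (imagined to be set by the SIGINT handler) are all hypothetical. Only `PRAGMA integrity_check` is a real SQLite feature.

```python
import os
import sqlite3
import tempfile


def check_profile_db(db_path, expected_splits, interrupted=False):
    """Return a list of problems; an empty list means the run may report success."""
    problems = []
    if interrupted:
        problems.append("run received SIGINT")

    conn = sqlite3.connect(db_path)
    try:
        # Built-in SQLite consistency check; a sound file yields the single row ('ok',).
        if conn.execute("PRAGMA integrity_check").fetchall() != [("ok",)]:
            problems.append("SQLite integrity check failed")

        # HYPOTHETICAL table and column names, for illustration only.
        rows = conn.execute("SELECT split_name FROM profiled_splits").fetchall()
        found = {name for (name,) in rows}
        missing = set(expected_splits) - found
        if missing:
            problems.append(f"{len(missing)} expected split(s) not in database")
    finally:
        conn.close()
    return problems


# Build a small demo database that contains only one of two expected splits.
path = os.path.join(tempfile.mkdtemp(), "PROFILE.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE profiled_splits (split_name TEXT)")
conn.execute("INSERT INTO profiled_splits VALUES ('c1_split_00001')")
conn.commit()
conn.close()

problems = check_profile_db(path, ["c1_split_00001", "c2_split_00001"], interrupted=True)
print("Happy" if not problems else "Not happy: " + "; ".join(problems))
```

The point of the design is that the success message is printed only when the list of problems is empty, so an interrupted or incomplete run cannot end on "Happy".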