karel-brinda / Phylign

Alignment against all pre-2019 bacteria on laptops within a few hours (former MOF-Search)
http://brinda.eu/mof

ProtectedOutputException – Write-protected output files for rule decompress_and_run_cobs #84

Closed karel-brinda closed 2 years ago

karel-brinda commented 2 years ago

After several processes stalled during benchmarking (see https://github.com/karel-brinda/mof-search/issues/83), I'm getting the following error when re-running the pipeline:

$ make match
snakemake --jobs all --rerun-incomplete --printshellcmds --keep-going --use-conda --resources max_decomp_jobs=4 max_download_jobs=20 -- match
Batches: ['acinetobacter_baumannii__01', 'acinetobacter_baumannii__02', 'acinetobacter_nosocomialis__01', 'acinetobacter_pittii__01', 'actinobacillus_pleuropneumoniae__01', 'aeromonas_hydrophila__01', 'aeromonas_salmonicida__01', 'aeromonas_veronii__01', 'bacillus_anthracis__01', 'bacillus_cereus__01', 'bacillus_subtilis__01', 'bacillus_thuringiensis__01', 'bacteroides_fragilis__01', 'bordetella_bronchiseptica__01', 'bordetella_pertussis__01', 'borreliella_burgdorferi__01', 'brucella_abortus__01', 'brucella_melitensis__01', 'brucella_suis__01', 'burkholderia_cenocepacia__01', 'burkholderia_cepacia__01', 'burkholderia_contaminans__01', 'burkholderia_gladioli__01', 'burkholderia_multivorans__01', 'burkholderia_pseudomallei__01', 'burkholderia_ubonensis__01', 'burkholderia_vietnamiensis__01', 'campylobacter_coli__01', 'campylobacter_coli__02', 'campylobacter_coli__03', 'campylobacter_fetus__01', 'campylobacter_helveticus__01', 'campylobacter_jejuni__01', 'campylobacter_jejuni__02', 'campylobacter_jejuni__03', 'campylobacter_jejuni__04', 'campylobacter_jejuni__05', 'campylobacter_jejuni__06', 'campylobacter_jejuni__07', 'campylobacter_jejuni__08', 'campylobacter_lari__01', 'caulobacter_vibrioides__01', 'chlamydia_pecorum__01', 'chlamydia_trachomatis__01', 'citrobacter_freundii__01', 'citrobacter_rodentium__01', 'clostridioides_difficile__01', 'clostridioides_difficile__02', 'clostridioides_difficile__03', 'clostridioides_difficile__04', 'clostridium_botulinum__01', 'clostridium_perfringens__01', 'corynebacterium_diphtheriae__01', 'cronobacter_sakazakii__01', 'cutibacterium_acnes__01', 'dichelobacter_nodosus__01', 'dustbin__01', 'dustbin__02', 'dustbin__03', 'dustbin__04', 'dustbin__05', 'dustbin__06', 'dustbin__07', 'dustbin__08', 'dustbin__09', 'dustbin__10', 'dustbin__11', 'dustbin__12', 'dustbin__13', 'dustbin__14', 'dustbin__15', 'dustbin__16', 'dustbin__17', 'dustbin__18', 'dustbin__19', 'dustbin__20', 'dustbin__21', 'dustbin__22', 'elizabethkingia_anophelis__01', 
'enterobacter_cloacae__01', 'enterobacter_hormaechei__01', 'enterococcus_faecalis__01', 'enterococcus_faecium__01', 'enterococcus_faecium__02', 'enterococcus_faecium__03', 'enterococcus_hirae__01', 'erysipelothrix_rhusiopathiae__01', 'escherichia_albertii__01', 'escherichia_coli__01', 'escherichia_coli__02', 'escherichia_coli__03', 'escherichia_coli__04', 'escherichia_coli__05', 'escherichia_coli__06', 'escherichia_coli__07', 'escherichia_coli__08', 'escherichia_coli__09', 'escherichia_coli__10', 'escherichia_coli__11', 'escherichia_coli__12', 'escherichia_coli__13', 'escherichia_coli__14', 'escherichia_coli__15', 'escherichia_coli__16', 'escherichia_coli__17', 'escherichia_coli__18', 'escherichia_coli__19', 'escherichia_coli__20', 'escherichia_coli__21', 'escherichia_coli__22', 'escherichia_coli__23', 'eubacterium_hallii__01', 'flavobacterium_johnsoniae__01', 'glaesserella_parasuis__01', 'haemophilus_influenzae__01', 'helicobacter_pylori__01', 'hungateiclostridium_thermocellum__01', 'klebsiella_aerogenes__01', 'klebsiella_oxytoca__01', 'klebsiella_pneumoniae__01', 'klebsiella_pneumoniae__02', 'klebsiella_pneumoniae__03', 'klebsiella_pneumoniae__04', 'klebsiella_quasipneumoniae__01', 'klebsiella_variicola__01', 'lactobacillus_casei__01', 'lactobacillus_plantarum__01', 'lactobacillus_rhamnosus__01', 'lactobacillus_salivarius__01', 'lactococcus_lactis__01', 'legionella_pneumophila__01', 'leptospira_interrogans__01', 'listeria_monocytogenes__01', 'listeria_monocytogenes__02', 'listeria_monocytogenes__03', 'listeria_monocytogenes__04', 'listeria_monocytogenes__05', 'listeria_monocytogenes__06', 'listeria_monocytogenes__07', 'mannheimia_haemolytica__01', 'mesorhizobium_ciceri__01', 'moraxella_catarrhalis__01', 'morganella_morganii__01', 'mycobacterium_avium__01', 'mycobacterium_bovis__01', 'mycobacterium_chimaera__01', 'mycobacterium_intracellulare__01', 'mycobacterium_kansasii__01', 'mycobacterium_tuberculosis__01', 'mycobacterium_tuberculosis__02', 
'mycobacterium_tuberculosis__03', 'mycobacterium_tuberculosis__04', 'mycobacterium_tuberculosis__05', 'mycobacterium_tuberculosis__06', 'mycobacterium_tuberculosis__07', 'mycobacterium_tuberculosis__08', 'mycobacterium_tuberculosis__09', 'mycobacterium_tuberculosis__10', 'mycobacterium_tuberculosis__11', 'mycobacterium_tuberculosis__12', 'mycobacterium_tuberculosis__13', 'mycobacterium_ulcerans__01', 'mycobacteroides_abscessus__01', 'mycobacteroides_chelonae__01', 'mycolicibacterium_smegmatis__01', 'mycoplasma_bovis__01', 'mycoplasma_hyopneumoniae__01', 'mycoplasma_pneumoniae__01', 'neisseria_gonorrhoeae__01', 'neisseria_gonorrhoeae__02', 'neisseria_gonorrhoeae__03', 'neisseria_lactamica__01', 'neisseria_meningitidis__01', 'neisseria_meningitidis__02', 'neisseria_meningitidis__03', 'neisseria_meningitidis__04', 'neisseria_meningitidis__05', 'neisseria_subflava__01', 'oenococcus_oeni__01', 'paenibacillus_larvae__01', 'pasteurella_multocida__01', 'porphyromonas_gingivalis__01', 'prochlorococcus_marinus__01', 'proteus_mirabilis__01', 'pseudomonas_aeruginosa__01', 'pseudomonas_aeruginosa__02', 'pseudomonas_fluorescens__01', 'pseudomonas_putida__01', 'pseudomonas_syringae__01', 'rhizobium_leguminosarum__01', 'roseburia_hominis__01', 'salmonella_bongori__01', 'salmonella_enterica__01', 'salmonella_enterica__02', 'salmonella_enterica__03', 'salmonella_enterica__04', 'salmonella_enterica__05', 'salmonella_enterica__06', 'salmonella_enterica__07', 'salmonella_enterica__08', 'salmonella_enterica__09', 'salmonella_enterica__10', 'salmonella_enterica__11', 'salmonella_enterica__12', 'salmonella_enterica__13', 'salmonella_enterica__14', 'salmonella_enterica__15', 'salmonella_enterica__16', 'salmonella_enterica__17', 'salmonella_enterica__18', 'salmonella_enterica__19', 'salmonella_enterica__20', 'salmonella_enterica__21', 'salmonella_enterica__22', 'salmonella_enterica__23', 'salmonella_enterica__24', 'salmonella_enterica__25', 'salmonella_enterica__26', 
'salmonella_enterica__27', 'salmonella_enterica__28', 'salmonella_enterica__29', 'salmonella_enterica__30', 'salmonella_enterica__31', 'salmonella_enterica__32', 'salmonella_enterica__33', 'salmonella_enterica__34', 'salmonella_enterica__35', 'salmonella_enterica__36', 'salmonella_enterica__37', 'salmonella_enterica__38', 'salmonella_enterica__39', 'salmonella_enterica__40', 'salmonella_enterica__41', 'salmonella_enterica__42', 'salmonella_enterica__43', 'salmonella_enterica__44', 'salmonella_enterica__45', 'salmonella_enterica__46', 'serratia_marcescens__01', 'shigella_dysenteriae__01', 'shigella_flexneri__01', 'sinorhizobium_meliloti__01', 'staphylococcus_agnetis__01', 'staphylococcus_argenteus__01', 'staphylococcus_aureus__01', 'staphylococcus_aureus__02', 'staphylococcus_aureus__03', 'staphylococcus_aureus__04', 'staphylococcus_aureus__05', 'staphylococcus_aureus__06', 'staphylococcus_aureus__07', 'staphylococcus_aureus__08', 'staphylococcus_aureus__09', 'staphylococcus_aureus__10', 'staphylococcus_aureus__11', 'staphylococcus_aureus__12', 'staphylococcus_aureus__13', 'staphylococcus_capitis__01', 'staphylococcus_epidermidis__01', 'staphylococcus_haemolyticus__01', 'staphylococcus_pseudintermedius__01', 'staphylococcus_sciuri__01', 'stenotrophomonas_maltophilia__01', 'streptococcus_agalactiae__01', 'streptococcus_agalactiae__02', 'streptococcus_agalactiae__03', 'streptococcus_dysgalactiae__01', 'streptococcus_equi__01', 'streptococcus_mitis__01', 'streptococcus_mutans__01', 'streptococcus_oralis__01', 'streptococcus_pneumoniae__01', 'streptococcus_pneumoniae__02', 'streptococcus_pneumoniae__03', 'streptococcus_pneumoniae__04', 'streptococcus_pneumoniae__05', 'streptococcus_pneumoniae__06', 'streptococcus_pneumoniae__07', 'streptococcus_pneumoniae__08', 'streptococcus_pneumoniae__09', 'streptococcus_pneumoniae__10', 'streptococcus_pneumoniae__11', 'streptococcus_pneumoniae__12', 'streptococcus_pneumoniae__13', 'streptococcus_pseudopneumoniae__01', 
'streptococcus_pyogenes__01', 'streptococcus_pyogenes__02', 'streptococcus_pyogenes__03', 'streptococcus_pyogenes__04', 'streptococcus_pyogenes__05', 'streptococcus_sp_group_b__01', 'streptococcus_suis__01', 'streptococcus_uberis__01', 'taylorella_equigenitalis__01', 'treponema_pallidum__01', 'vibrio_cholerae__01', 'vibrio_cholerae__02', 'vibrio_parahaemolyticus__01', 'vibrio_shilonii__01', 'vibrio_vulnificus__01', 'wolbachia_endosymbiont_of_drosophila_melanogaster__01', 'xanthomonas_oryzae__01', 'yersinia_enterocolitica__01', 'yersinia_pestis__01', 'yersinia_pseudotuberculosis__01']
Query files: ['queries/ARGannot_r3.fa']
Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 8
Rules claiming more threads will be scaled down.
Provided resources: max_decomp_jobs=4, max_download_jobs=20
Job stats:
job                        count    min threads    max threads
-----------------------  -------  -------------  -------------
decompress_and_run_cobs      305              1              1
match                          1              1              1
translate_matches              1              1              1
total                        307              1              1

Select jobs to execute...
ProtectedOutputException in line 249 of /Users/pseudokarel/github/my/mof-search/Snakefile:
Write-protected output files for rule decompress_and_run_cobs:
    output: intermediate/01_match/brucella_melitensis__01____ARGannot_r3.xz
    wildcards: batch=brucella_melitensis__01, qfile=ARGannot_r3
    affected files:
        intermediate/01_match/brucella_melitensis__01____ARGannot_r3.xz
make: *** [match] Error 1

leoisl commented 2 years ago

Yeah, this happens due to cancelling jobs and https://github.com/karel-brinda/mof-search/blob/25cf71e7374564a5e763f958563226085b0d73d3/Snakefile#L224
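
One way to get past the error without touching the Snakefile is to give the affected outputs their write permission back (or delete them) before re-running. A minimal sketch, assuming that the exception is triggered by existing output files that are not writable and that overwriting them is acceptable; the helper names and the `*.xz` pattern are illustrative and not part of the pipeline:

```python
import stat
from pathlib import Path


def find_write_protected(directory, pattern="*.xz"):
    """Return matching files in `directory` whose owner write bit is cleared."""
    return sorted(
        path
        for path in Path(directory).glob(pattern)
        if path.is_file() and not path.stat().st_mode & stat.S_IWUSR
    )


def restore_write_permission(paths):
    """Set the owner write bit on each path so the files can be overwritten."""
    for path in paths:
        path.chmod(path.stat().st_mode | stat.S_IWUSR)


if __name__ == "__main__":
    # Directory taken from the error message above.
    protected = find_write_protected("intermediate/01_match")
    restore_write_permission(protected)
    print(f"Restored write permission on {len(protected)} file(s)")
```

Listing the files first, before changing anything, makes it easy to check that only outputs of decompress_and_run_cobs are affected.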

karel-brinda commented 2 years ago

Then it could be solved by this? https://github.com/karel-brinda/mof-search/issues/23

leoisl commented 2 years ago

I don't quite understand this, though. The idea of the protected flag there is that once run_cobs has run and produced the matches, the output is protected against accidental deletion or overwriting. If some minimap2 job fails (e.g. https://github.com/karel-brinda/mof-search/issues/83; I have had some fail in the past), the run_cobs rules should not be rerun: snakemake sees that these jobs have already run and that the output is already there, and goes straight to the minimap2 rules. So I don't understand why these jobs are being rerun if the output files have already been generated. Have you changed some earlier rule or conda env? I have had this issue in the past: if we change an earlier rule or a conda env, snakemake rightfully wants to recreate these files and thus reruns these jobs.

https://github.com/karel-brinda/mof-search/issues/23 could solve the issue, but why not just remove the protected flag?
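
The behaviour described above can be summarised as a small decision model. This is a simplified sketch and not Snakemake's actual implementation: the class, field, and function names are hypothetical, and the rerun triggers are limited to the ones mentioned in this thread (missing or incomplete output, or a changed rule or conda env).

```python
from dataclasses import dataclass


class ProtectedOutputError(Exception):
    """Hypothetical stand-in for Snakemake's ProtectedOutputException."""


@dataclass
class OutputState:
    exists: bool        # the output file is present on disk
    complete: bool      # False if the producing job was cancelled or stalled
    writable: bool      # False once the file has been write-protected
    rule_changed: bool  # True if an earlier rule or the conda env changed


def needs_rerun(state: OutputState) -> bool:
    """A job is scheduled again if its output is missing, incomplete, or outdated."""
    return not state.exists or not state.complete or state.rule_changed


def plan(state: OutputState) -> str:
    """Return "skip" or "run"; raise if a rerun would overwrite a protected file."""
    if not needs_rerun(state):
        return "skip"
    if state.exists and not state.writable:
        raise ProtectedOutputError("write-protected output would be overwritten")
    return "run"


if __name__ == "__main__":
    # Finished, unchanged, protected output: nothing to do.
    print(plan(OutputState(exists=True, complete=True, writable=False, rule_changed=False)))
```

In this model, removing the protected flag corresponds to `writable` always staying true, so a rerun overwrites the file instead of raising.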

karel-brinda commented 2 years ago

> https://github.com/karel-brinda/mof-search/issues/23 could solve the issue, but why not just remove the protected flag?

ok