Closed samnooij closed 5 years ago
Hi,
I'm having trouble with this rule as well, probably because there are apparently no Archaea in my data. Could the quantify_output rule be failing for the same reason? Is there a relationship with the Concat_files rule? Concat_files.log is empty, just like draw_heatmaps.log and quantify_output.log.
This is my error (at the very last bit of the workflow):
[Fri Apr 19 11:04:23 2019]=============----------------------] 62.5% - Reading files [ 45 / 72 ]
Error in rule draw_heatmaps:
    jobid: 268
    output: results/heatmaps/Superkingdoms_heatmap.html, results/heatmaps/Virus_order_heatmap.html, results/heatmaps/Virus_family_heatmap.html, results/heatmaps/Virus_genus_heatmap.html, results/heatmaps/Virus_species_heatmap.html, results/heatmaps/Phage_order_heatmap.html, results/heatmaps/Phage_family_heatmap.html, results/heatmaps/Phage_genus_heatmap.html, results/heatmaps/Phage_species_heatmap.html, results/heatmaps/Bacteria_phylum_heatmap.html, results/heatmaps/Bacteria_class_heatmap.html, results/heatmaps/Bacteria_order_heatmap.html, results/heatmaps/Bacteria_family_heatmap.html, results/heatmaps/Bacteria_genus_heatmap.html, results/heatmaps/Bacteria_species_heatmap.html, results/Taxonomic_rank_statistics.tsv, results/Virus_rank_statistics.tsv, results/Phage_rank_statistics.tsv, results/Bacteria_rank_statistics.tsv
    log: logs/draw_heatmaps.log
    conda-env: /data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/conda/9d3d5b4d
ClusterJobException in line 687 of /data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/Snakefile:
Error executing rule draw_heatmaps on cluster (jobid: 268, external: 225321, jobscript: /data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/tmp.vw572phx/Jovian_draw_heatmaps.jobid268). For detailed error see the cluster log.
Job failed, going on with independent jobs.------------------] 70.8% - Reading files [ 51 / 72 ]
Done counting!===============================================] 100.0% - Reading files [ 72 / 72 ]
Traceback (most recent call last):
  File "/data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/scripts/tmpecjkhsji.quantify_profiles.py", line 487, in <module>
    main()
  File "/data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/scripts/tmpecjkhsji.quantify_profiles.py", line 421, in main
    "Eukaryota", "Viruses", "Unclassified" ]]
  File "/data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/conda/9d3d5b4d/lib/python3.7/site-packages/pandas/core/frame.py", line 2682, in __getitem__
    return self._getitem_array(key)
  File "/data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/conda/9d3d5b4d/lib/python3.7/site-packages/pandas/core/frame.py", line 2726, in _getitem_array
    indexer = self.loc._convert_to_indexer(key, axis=1)
  File "/data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/conda/9d3d5b4d/lib/python3.7/site-packages/pandas/core/indexing.py", line 1327, in _convert_to_indexer
    .format(mask=objarr[mask]))
KeyError: "['Archaea'] not in index"
[Fri Apr 19 11:04:39 2019]
Error in rule quantify_output:
    jobid: 269
    output: results/profile_read_counts.csv, results/profile_percentages.csv, results/Sample_composition_graph.html
    log: logs/quantify_output.log
    conda-env: /data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/conda/9d3d5b4d
RuleException:
CalledProcessError in line 681 of /data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/Snakefile:
Command 'source /mnt/miniconda/bin/activate '/data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/conda/9d3d5b4d'; set -euo pipefail; python /data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/scripts/tmpecjkhsji.quantify_profiles.py' returned non-zero exit status 1.
  File "/data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/Snakefile", line 681, in __rule_quantify_output
  File "/home/janssetk/.conda/envs/Jovian_master/lib/python3.6/concurrent/futures/thread.py", line 56, in run
Job failed, going on with independent jobs.
Exiting because a job execution failed. Look above for error message
Complete log: /data/BioGrid/ERVINGS/Runs_Respiratory_MiSEQ_RIVM/MiSeq_RUN_12APR2019/Jovian/.snakemake/log/2019-04-19T110018.225554.snakemake.log
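For anyone hitting the same KeyError: pandas raises it when a list of column labels passed to `df[[...]]` contains a label that is not in the frame. Below is a minimal sketch of one way to avoid that, assuming the script selects a fixed list of superkingdom columns. The list contents, the function name `select_superkingdoms` and the toy numbers are assumptions for illustration only; this is not the actual code of quantify_profiles.py or the fix shipped in v0.9.2.

```python
import pandas as pd

# Assumed column order; the traceback only shows the tail of the real list.
SUPERKINGDOMS = ["Archaea", "Bacteria", "Eukaryota", "Viruses", "Unclassified"]


def select_superkingdoms(counts: pd.DataFrame) -> pd.DataFrame:
    """Return the superkingdom columns, adding zero-filled columns for absent taxa.

    counts[SUPERKINGDOMS] raises a KeyError when a label is missing;
    reindex() creates the missing column instead and fills it with 0.
    """
    return counts.reindex(columns=SUPERKINGDOMS, fill_value=0)


# Toy read counts for a dataset without any Archaea (made-up numbers).
counts = pd.DataFrame(
    {
        "Bacteria": [120, 80],
        "Eukaryota": [30, 10],
        "Viruses": [5, 0],
        "Unclassified": [7, 3],
    },
    index=["sample_A", "sample_B"],
)

selected = select_superkingdoms(counts)
print(selected)
```

With this approach the downstream tables keep the same columns for every run, and a taxon that was not observed simply shows up with zero reads.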
This is also the cause of issue #26
Fixed in v0.9.2 which will be made available this afternoon. Closing.
The current heatmap script only works if the expected taxa (e.g. viruses) are present in the final taxonomic classifications. If there are none, the Python script cannot draw the heatmaps, and Snakemake will see that not all output of the rule "draw_heatmaps" has been generated. Therefore, we need some checks and a workaround for datasets that lack any of the expected taxa (e.g. have the Python script create empty files for unobserved taxa and write a short warning to the terminal or a log file).
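A rough sketch of the "empty files plus warning" idea, so that Snakemake still finds every declared output. Everything here is hypothetical: the function name `write_placeholders`, the placeholder HTML text, the logger name and the mapping from taxon to output paths are assumptions, not part of the existing script.

```python
import logging
import tempfile
from pathlib import Path

logger = logging.getLogger("draw_heatmaps")  # hypothetical logger name

PLACEHOLDER_HTML = "<html><body><p>No {taxon} found in this dataset.</p></body></html>\n"


def write_placeholders(expected_outputs, observed_taxa):
    """Write a stub file for every expected output whose taxon was not observed.

    expected_outputs maps a taxon name (e.g. "Phage") to the output paths
    that the rule declares for it. Returns the list of stub paths written.
    """
    written = []
    for taxon, paths in expected_outputs.items():
        if taxon in observed_taxa:
            continue  # the real heatmaps for this taxon are drawn elsewhere
        logger.warning("No %s in the classifications; writing placeholder files.", taxon)
        for path in map(Path, paths):
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(PLACEHOLDER_HTML.format(taxon=taxon))
            written.append(path)
    return written


# Example: only viruses were observed, so the phage heatmap becomes a stub.
outdir = Path(tempfile.mkdtemp())
expected = {
    "Virus": [outdir / "Virus_family_heatmap.html"],
    "Phage": [outdir / "Phage_family_heatmap.html"],
}
stubs = write_placeholders(expected, observed_taxa={"Virus"})
```

A placeholder with a short message is probably friendlier than a zero-byte file, since anyone opening the report can see why a heatmap is missing.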
My current goal to make this work better would be to:
Suggested solution
Remake the script and take into account:
(Bold = priority, other points = of secondary importance.)
Note: these solutions need changes in the Python script itself, the Snakefile, and possibly also the pipelineparameters.yaml file!