NBChub / bgcflow

Snakemake workflow for the analysis of biosynthetic gene clusters across large collections of genomes (pangenomes)
https://github.com/NBChub/bgcflow/wiki
MIT License

Error in Generating Markdown Report for eggNOG-Roary - Proposed Solution #360

Open andrekind17 opened 1 month ago

andrekind17 commented 1 month ago

Hi Matin, when I run bgcflow build report on the Lactobacillus_delbrueckii example dataset, I get the following error while the eggNOG-Roary markdown is being generated:

Error in rule mkdocs_py_report:
    jobid: 31
    input: data/processed/Lactobacillus_delbrueckii/metadata/dependency_versions.json, data/processed/Lactobacillus_delbrueckii/docs/eggnog-roary.ipynb
    output: data/processed/Lactobacillus_delbrueckii/docs/eggnog-roary.md
    log: logs/report/eggnog-roary-report-Lactobacillus_delbrueckii.log (check log file(s) for error details)
    conda-env: /mnt/data/studenti/agentile/bgcflow/.snakemake/conda/e9d7fcc98b88235172f4b2a161bfc170_
    shell:
        jupyter nbconvert --to markdown --execute data/processed/Lactobacillus_delbrueckii/docs/eggnog-roary.ipynb --no-input --output eggnog-roary.md 2>> logs/report/eggnog-roary-report-Lactobacillus_delbrueckii.log
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-09-26T090317.499563.snakemake.log
WorkflowError:
At least one job did not complete successfully.

I examined the log file (attached) and noticed that the notebook cell for the Rare pangene category fails to execute:

KeyError: 'Rare'

eggnog-roary-report-Lactobacillus_delbrueckii.log

I think this happens because no pangenes are classified as rare in the Lactobacillus_delbrueckii example dataset, so the cell cannot run when the markdown is generated.

Indeed, after I deleted that cell, I was able to generate and serve the report.

I think you should include a rule to skip cells when no pangenes have been classified in the corresponding category, so that reports can still be built and served correctly with the information that is available.
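To illustrate the kind of guard I mean, here is a minimal sketch in pandas. I have not checked how the notebook actually selects each category, so the table layout, the column name pangenome_class, the category labels other than Rare, and the helper names are all hypothetical. The idea is only that a lookup by label fails with KeyError when the category is absent, while filtering with a boolean mask returns an empty table that the cell can check before plotting or summarising.

```python
import pandas as pd

# Hypothetical table: one row per pangene with its assigned category.
# The column name and the "Core"/"Accessory" labels are assumptions.
genes = pd.DataFrame(
    {
        "gene": ["g1", "g2", "g3", "g4"],
        "pangenome_class": ["Core", "Core", "Accessory", "Accessory"],
    }
)


def genes_in_category(df, category, column="pangenome_class"):
    """Return the rows for `category`; empty if the category is absent.

    A boolean mask never raises KeyError, unlike a label lookup such as
    df.groupby(column).get_group(category).
    """
    return df[df[column] == category]


def report_category(df, category):
    """Summarise one category, or say that it is skipped when empty."""
    subset = genes_in_category(df, category)
    if subset.empty:
        return f"No pangenes classified as '{category}'; skipping this section."
    return f"{len(subset)} pangenes classified as '{category}'."


core_msg = report_category(genes, "Core")
rare_msg = report_category(genes, "Rare")  # category absent from the table
print(core_msg)
print(rare_msg)
```

With a check like this in each category cell, a dataset without rare pangenes would produce a short "skipping" note in the report instead of stopping nbconvert.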

matinnuhamunada commented 1 month ago

Thanks for the feedback and suggestion! Will follow up on this.

andrekind17 commented 1 month ago

Hi Matin, I actually get the same error on my own dataset as well! Maybe the problem is in the notebook cell itself?