Open SolayMane opened 1 month ago
Hi Solay,
I saw that error many times when some files are not found in the right place (i.e., where Snakemake expects them to be).
Where are your hifi_reads.fastq.gz and hic_reads.fastq.gz? Are these paths correctly set up in the config file? Are they in fastq.gz format?
Also, it would be useful to test the workflow before running it on real data, as explained here: https://github.com/LiaOb21/colora?tab=readme-ov-file#test-the-pipeline
Let me know! :blush:
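A quick way to answer the questions above is to check each input directory for reads with the expected extension. This is only a sketch: check_reads_dir is a hypothetical helper, not part of colora, and it assumes the workflow only picks up files ending in .fastq.gz that sit directly inside the directory.

```shell
# Hypothetical helper (not part of colora): succeed only if a directory
# contains at least one entry whose name ends in .fastq.gz.
check_reads_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "not a directory: $dir"
        return 1
    fi
    # Count matching entries directly inside the directory (no recursion).
    n=$(find "$dir" -maxdepth 1 -name '*.fastq.gz' | wc -l | tr -d ' ')
    if [ "$n" -eq 0 ]; then
        echo "no .fastq.gz files in $dir"
        return 1
    fi
    echo "$n .fastq.gz file(s) in $dir"
}
```

You would call it once per path in the config file, for example check_reads_dir followed by the value of hifi_path, and again with the value of hic_path.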
here is my config file:
# config.yaml for real data
# Set memory and threads for high demanding rules
high:
  mem_mb: 409600 # memory in MB
  t: 50 # number of threads
# Set memory and threads for medium demanding rules
medium:
  mem_mb: 204800 # memory in MB
  t: 20 # number of threads
# Set memory and threads for low demanding rules
low:
  mem_mb: 51200 # memory in MB
  t: 8 # number of threads
# Path to hifi reads
hifi_path: "/sanhome2/Argania_assembly/ChAssembly24/rawdata/PB/"
# Path to hic reads
hic_path: "/sanhome2/Argania_assembly/ChAssembly24/rawdata/Hic/BMK240627-CC766-ZX01-0101/BMK_DATA_20240913145637_1/Data/"
# Customisable parameters for kmc
kmc:
  k: 27 # kmer size, it will be the same used for genomescope2
  ci: 1 # exclude k-mers occurring less than <value> times (default: 2)
  cs: 1000000 # maximal value of a counter (default: 255)
# Customisable parameters for kmc_tools transform
kmc_tools:
  cx: 1000000 # exclude k-mers occurring more than <value> times
# Customisable parameters for genomescope2
genomescope2:
  optional_params:
    "-p": "2"
    "-l": ""
# Customisable parameters for oatk
oatk:
  k: 1001 # kmer size [1001]
  c: 150 # minimum kmer coverage [3]
  m: "resources/oatkDB/embryophyta_mito.fam" # mitochondria gene annotation HMM profile database [NULL]
  optional_params:
    "-p": "resources/oatkDB/embryophyta_pltd.fam" # to use for species that have a plastid db
# Customisable parameters for fastp
fastp:
  optional_params:
    "--cut_front": False # set to True for Arima Hi-C library prep kit generated data
    "--cut_front_window_size": "" # set to 5 for Arima Hi-C library prep kit generated data
# Customisable parameters for hifiasm
hifiasm:
  phased_assembly: False # set to true if you want to obtain a phased assembly
  optional_params:
    "-f": "" # used for small datasets
    "-l": "" # purge level. 0: no purging; 1: light; 2/3: aggressive [0 for trio; 3 for unzip]
    "--ul": "" # use this if you have also ont data you want to integrate in your assembly
# Set this to False if you want to skip the fcsgx step:
include_fcsgx: False # include this rule only if you have previously downloaded the database (recommended to run fcsgx only on a HPC. It requires around 500 GB of space on your disk and a large RAM)
# Customisable parameters for fcsgx
#fcsgx:
#  ncbi_tax_id: 4513
#  path_to_gx_db: "path/to/fcsgx/gxdb"
# Set this to False if you want to skip purge_dups steps:
include_purge_dups: True
# Customisable parameters for arima mapping pipeline:
arima:
  MAPQ_FILTER: 10
# Customisable parameters for yahs
yahs:
  optional_params:
    "-e": "A/AGCTT" # you can specify the restriction enzyme(s) used by the Hi-C experiment
# Customisable parameters for quast
quast:
  optional_params:
    "--fragmented": ""
    "--large": ""
#    "-r": "resources/reference_genomes/yeast/Saccharomyces_cerevisiae.R64-1-1.dna.toplevel.fa" #reference genome (fasta)
#    "-g": "resources/reference_genomes/yeast/Saccharomyces_cerevisiae.R64-1-1.101.gff3" # reference features (gff)
# Customisable parameters for busco
busco:
  lineage: "resources/busco_db/embryophyta_odb10.2024-01-08.tar.gz" # lineage to be used for busco analysis
  optional_params:
    "--metaeuk": "" # this can be set to True if needed. The default is miniprot
Here are my Hi-C and PB files:
PB: merged_cells.fq.gz
Hi-C: Unknown_CC766-004H0001_good_1.fq.gz, Unknown_CC766-004H0001_good_2.fq.gz
Do I need to rename them?
Yes, please use .fastq.gz rather than .fq.gz.
Also, for -e in the YaHS parameters, I think you should use a comma (,) and not a slash (/) to list the enzymes.
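For the renaming, something like the following could work. It is a minimal sketch: rename_fq_to_fastq is a hypothetical helper name, and it renames files in place, so try it on a copy or use symlinks if you need to keep the original names.

```shell
# Hypothetical helper (not part of colora): rename every *.fq.gz file in a
# directory so that it ends in .fastq.gz instead.
rename_fq_to_fastq() {
    dir="$1"
    for f in "$dir"/*.fq.gz; do
        # If nothing matched, the pattern is left unexpanded: skip it.
        [ -e "$f" ] || continue
        # ${f%.fq.gz} strips the old suffix before the new one is appended.
        mv -- "$f" "${f%.fq.gz}.fastq.gz"
    done
}
```

Files that already end in .fastq.gz are not touched, so it is safe to run more than once on the same directory.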
Here is the output of running snakemake --cores all --use-conda --configfile config_argania.yaml:
Config file config/config.yaml is extended by additional config specified via the command line.
Assuming unrestricted shared filesystem usage.
host: inra
Building DAG of jobs...
/bin/bash: conda: command not found
Traceback (most recent call last):
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake/cli.py", line 2095, in args_to_api
dag_api.execute_workflow(
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake/api.py", line 595, in execute_workflow
workflow.execute(
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake/workflow.py", line 1164, in execute
self.dag.create_conda_envs()
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake/dag.py", line 454, in create_conda_envs
env.create(self.workflow.dryrun)
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake/deployment/conda.py", line 384, in create
if self.pin_file:
^^^^^^^^^^^^^
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake_interface_common/utils.py", line 33, in __get__
value = self.method(instance)
^^^^^^^^^^^^^^^^^^^^^
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake/deployment/conda.py", line 102, in pin_file
f".{self.conda.platform}.pin.txt"
^^^^^^^^^^
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake_interface_common/utils.py", line 33, in __get__
value = self.method(instance)
^^^^^^^^^^^^^^^^^^^^^
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake/deployment/conda.py", line 95, in conda
return Conda(
^^^^^^
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake/deployment/conda.py", line 654, in __init__
shell.check_output(self._get_cmd("conda info --json"), text=True)
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/site-packages/snakemake/shell.py", line 64, in check_output
return sp.check_output(cmd, shell=True, executable=executable, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/subprocess.py", line 466, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/inra/miniconda3/envs/snakemake/lib/python3.12/subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command 'conda info --json' returned non-zero exit status 127.
@SolayMane you should install mamba (preferably) or conda and Snakemake first, as all the environments are created through conda.
/bin/bash: conda: command not found
suggests to me that you are not in a conda environment. If you installed mamba or conda, maybe you didn't run init.
Please read the Usage section carefully: https://github.com/LiaOb21/colora?tab=readme-ov-file#usage
And, again, if you can, you should test the pipeline first.
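Exit status 127 in the traceback is the shell's way of saying the command was not found, so it is worth confirming what is actually on your PATH before launching the workflow. A small sketch, where check_tool is a hypothetical helper and not part of colora or Snakemake:

```shell
# Hypothetical helper: report whether a command can be found on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found on PATH"
        return 1
    fi
}
```

Running check_tool conda and check_tool snakemake in the same shell you use to start the workflow should both report "found"; if conda is missing there, Snakemake will not find it either.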
Here is the log of the most recent error:
host: inra
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /bin/bash
Provided cores: 56
Rules claiming more threads will be scaled down.
Job stats:
job                      count
---------------------  -------
all                          1
bandage_pltd                 1
busco                        1
bwa_index                    1
bwa_mem                      1
fiter_five_end               1
gfastats_pltd                1
hifiasm                      1
nanoplot                     1
oatk_pltd                    1
picard                       1
purge_dups                   1
purge_dups_alt               1
quast                        1
two_read_bam_combiner        1
yahs                         1
total                       16
Select jobs to execute...
Execute 1 jobs...
[Wed Oct 9 09:18:20 2024]
localrule hifiasm:
input: results/reads/hifi/hifi.fastq.gz
output: results/hifiasm/asm.primary.gfa, results/hifiasm/asm.alternate.gfa, results/hifiasm/asm.primary.fa, results/hifiasm/asm.alternate.fa, results/assemblies/asm_primary.fa
log: logs/hifiasm.log
jobid: 6
reason: Missing output files: results/assemblies/asm_primary.fa, results/hifiasm/asm.alternate.fa, results/hifiasm/asm.primary.fa
threads: 50
resources: tmpdir=/tmp, mem_mb=409600, mem_mib=390625
Activating conda environment: .snakemake/conda/ae9a949f10685e275e7788b0e2db316a_
[Thu Oct 10 05:52:11 2024]
Error in rule hifiasm:
jobid: 6
input: results/reads/hifi/hifi.fastq.gz
output: results/hifiasm/asm.primary.gfa, results/hifiasm/asm.alternate.gfa, results/hifiasm/asm.primary.fa, results/hifiasm/asm.alternate.fa, results/assemblies/asm_primary.fa
log: logs/hifiasm.log (check log file(s) for error details)
conda-env: /sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/ae9a949f10685e275e7788b0e2db316a_
shell:
hifiasm results/reads/hifi/hifi.fastq.gz -t 50 -o results/hifiasm/asm --primary >> logs/hifiasm.log 2>&1
mv results/hifiasm/asm.p_ctg.gfa results/hifiasm/asm.primary.gfa
mv results/hifiasm/asm.a_ctg.gfa results/hifiasm/asm.alternate.gfa
awk -f scripts/gfa_to_fasta.awk < results/hifiasm/asm.primary.gfa > results/hifiasm/asm.primary.fa
awk -f scripts/gfa_to_fasta.awk < results/hifiasm/asm.alternate.gfa > results/hifiasm/asm.alternate.fa
# all the assemblies produced by the workflow will be symlinked to results/assemblies
ln -srn results/hifiasm/asm.primary.fa results/assemblies/asm_primary.fa
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job hifiasm since they might be corrupted:
results/hifiasm/asm.primary.gfa, results/hifiasm/asm.alternate.gfa, results/hifiasm/asm.primary.fa
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-09T091814.115222.snakemake.log
WorkflowError:
At least one job did not complete successfully.
Hi @SolayMane,
Can you please check the hifiasm.log file? It should be in the logs directory.
here is the tail -n 100 of the file :
[M::ha_pt_gen::] counting in normal mode
[M::yak_count] collected 2077326591 minimizers
[M::ha_pt_gen::66872.637*45.27] ==> indexed 2074465458 positions, counted 23694655 distinct minimizer k-mers
[M::ha_assemble::70078.091*45.48@145.364GB] ==> found overlaps for the final round
[M::ha_print_ovlp_stat] # overlaps: 675197376
[M::ha_print_ovlp_stat] # strong overlaps: 486200620
[M::ha_print_ovlp_stat] # weak overlaps: 188996756
[M::ha_print_ovlp_stat] # exact overlaps: 660989699
[M::ha_print_ovlp_stat] # inexact overlaps: 14207677
[M::ha_print_ovlp_stat] # overlaps without large indels: 674209617
[M::ha_print_ovlp_stat] # reverse overlaps: 393805665
[M::ha_opt_update_cov_min] updated max_n_chain to 645
Writing reads to disk...
Reads has been written.
Writing ma_hit_ts to disk...
ma_hit_ts has been written.
Writing ma_hit_ts to disk...
ma_hit_ts has been written.
bin files have been written.
[M::purge_dups] homozygous read coverage threshold: 128
[M::purge_dups] purge duplication coverage threshold: 161
[M::ug_ext_gfa::] # tips::338
Writing raw unitig GFA to disk...
[M::ug_ext_gfa::] # tips::1
Writing processed unitig GFA to disk...
[M::purge_dups] homozygous read coverage threshold: 128
[M::purge_dups] purge duplication coverage threshold: 161
[M::mc_solve:: # edges: 1094]
[M::mc_solve_core_adv::0.194] ==> Partition
[M::adjust_utg_by_primary] primary contig coverage range: [108, infinity]
Writing primary contig GFA to disk...
Writing alternate contig GFA to disk...
Inconsistency threshold for low-quality regions in BED files: 70%
[M::main] Version: 0.19.9-r616
[M::main] CMD: hifiasm -t 50 -o results/hifiasm/asm --primary results/reads/hifi/hifi.fastq.gz
[M::main] Real time: 74019.555 sec; CPU: 3194245.652 sec; Peak RSS: 145.364 GB
That's strange, from the log file it seems that hifiasm completed successfully. Can I see your config file? Did you try the test workflow? Did it complete successfully?
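For reference, this is roughly how I read the log. The sketch assumes that the "[M::main] Real time:" line, as seen at the end of the log above, is only written once hifiasm has finished; hifiasm_log_finished is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if a hifiasm log contains the final timing
# summary line, taken here as a sign that hifiasm itself ran to the end.
hifiasm_log_finished() {
    # -F: match the text literally, -q: no output, only the exit status.
    grep -F -q '[M::main] Real time:' "$1"
}
```

If hifiasm_log_finished logs/hifiasm.log succeeds but the rule still fails, the problem is more likely in one of the later commands of the rule (the mv, awk or ln steps) than in hifiasm.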
I didn't try the test. In the log file there is a line awk -f scripts/gfa_to_fasta.awk. Where should the scripts folder be?
here is the config: it is the same config.yaml that I posted above, unchanged.
Your config.yaml looks okay, apart from the restriction enzymes in YaHS, as mentioned previously.
The scripts directory is colora/scripts. Do you see that directory?
If it can help you, there is a tutorial available on YouTube now: https://youtu.be/-xWgvj_PmZo?si=tGMy0ZyNOJRSQmVs
If you could test the workflow, we can understand if it is a pipeline-related issue or if there is something else.
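To check the scripts directory quickly, you could use something like this. It is a sketch only: check_workflow_scripts is a hypothetical helper, and the relative path scripts/gfa_to_fasta.awk is taken from the hifiasm rule's shell command shown in the error above.

```shell
# Hypothetical helper: check that the awk script used by the hifiasm rule
# exists relative to the directory the workflow is started from.
check_workflow_scripts() {
    root="${1:-.}"   # default to the current directory
    if [ -f "$root/scripts/gfa_to_fasta.awk" ]; then
        echo "found: $root/scripts/gfa_to_fasta.awk"
    else
        echo "missing: $root/scripts/gfa_to_fasta.awk"
        return 1
    fi
}
```

Run it from the directory where you launch snakemake. A "missing" result would explain why the awk step fails even though hifiasm itself finished.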
Okay, that's why then. Can you please go into the directory where you downloaded colora and run the command tree . there?
I ran the test and got several errors. Currently I'm stuck on this one:
host: inra
Building DAG of jobs...
Your conda installation is not configured to use strict channel priorities. This is however crucial for having robust and correct environments (for details, see https://conda-forge.org/docs/user/tipsandtricks.html). Please consider to configure strict priorities by executing 'conda config --set channel_priority strict'.
Using shell: /bin/bash
Provided cores: 56
Rules claiming more threads will be scaled down.
Job stats:
job         count
--------  -------
all             1
busco           1
nanoplot        1
quast           1
yahs            2
total           6
Select jobs to execute...
Execute 3 jobs...
[Mon Oct 14 10:21:17 2024]
localrule yahs:
input: results/bwa_index_hap2/asm.fa, results/arima_mapping_pipeline_hap2/REP_DIR/paired_mark_dups_final.bam
output: results/yahs_hap2/asm_yahs_scaffolds_final.fa, results/assemblies/yahs_hap2.fa
log: logs/yahs_hap2.log
jobid: 14
reason: Missing output files: results/assemblies/yahs_hap2.fa, results/yahs_hap2/asm_yahs_scaffolds_final.fa
wildcards: hap=hap2
resources: tmpdir=/tmp, mem_mb=8000, mem_mib=7630
Activating conda environment: .snakemake/conda/87af61d5b13247599818da2121007351_
[Mon Oct 14 10:21:17 2024]
localrule yahs:
input: results/bwa_index_hap1/asm.fa, results/arima_mapping_pipeline_hap1/REP_DIR/paired_mark_dups_final.bam
output: results/yahs_hap1/asm_yahs_scaffolds_final.fa, results/assemblies/yahs_hap1.fa
log: logs/yahs_hap1.log
jobid: 8
reason: Missing output files: results/assemblies/yahs_hap1.fa, results/yahs_hap1/asm_yahs_scaffolds_final.fa
wildcards: hap=hap1
resources: tmpdir=/tmp, mem_mb=8000, mem_mib=7630
Activating conda environment: .snakemake/conda/87af61d5b13247599818da2121007351_
[Mon Oct 14 10:21:17 2024]
localrule nanoplot:
input: results/reads/hifi/hifi.fastq.gz
output: results/nanoplot/NanoPlot-report.html
log: logs/nanoplot.log
jobid: 1
reason: Missing output files: results/nanoplot/NanoPlot-report.html
threads: 4
resources: tmpdir=/tmp, mem_mb=8000, mem_mib=7630
Activating conda environment: .snakemake/conda/5d695729f424d26c27968862726d6580_
[Mon Oct 14 10:21:18 2024]
Finished job 14.
1 of 6 steps (17%) done
[Mon Oct 14 10:21:18 2024]
Finished job 8.
2 of 6 steps (33%) done
Select jobs to execute...
Execute 2 jobs...
[Mon Oct 14 10:21:18 2024]
localrule busco:
input: results/assemblies/asm_hap1.fa, results/assemblies/yahs_hap1.fa, results/assemblies/asm_hap2.fa, results/assemblies/yahs_hap2.fa
output: results/busco
log: logs/busco.log
jobid: 20
reason: Missing output files: results/busco; Input files updated by another job: results/assemblies/yahs_hap1.fa, results/assemblies/yahs_hap2.fa
threads: 4
resources: tmpdir=/tmp, mem_mb=32000, mem_mib=30518
Activating conda environment: .snakemake/conda/76be1be47196d1e717e47c9f575fe602_
[Mon Oct 14 10:21:18 2024]
localrule quast:
input: results/assemblies/asm_hap1.fa, results/assemblies/yahs_hap1.fa, results/assemblies/asm_hap2.fa, results/assemblies/yahs_hap2.fa
output: results/quast
log: logs/quast.log
jobid: 5
reason: Missing output files: results/quast; Input files updated by another job: results/assemblies/yahs_hap1.fa, results/assemblies/yahs_hap2.fa
threads: 4
resources: tmpdir=/tmp, mem_mb=8000, mem_mib=7630
Activating conda environment: .snakemake/conda/3cb2bb5b7d165a8674c17e132df58bb9_
[Mon Oct 14 10:21:27 2024]
Finished job 5.
3 of 6 steps (50%) done
[Mon Oct 14 10:21:39 2024]
Error in rule nanoplot:
jobid: 1
input: results/reads/hifi/hifi.fastq.gz
output: results/nanoplot/NanoPlot-report.html
log: logs/nanoplot.log (check log file(s) for error details)
conda-env: /sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_
shell:
NanoPlot -t 4 --fastq results/reads/hifi/hifi.fastq.gz --loglength -o results/nanoplot --plots dot --verbose >> logs/nanoplot.log 2>&1
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Removing output files of failed job nanoplot since they might be corrupted:
results/nanoplot/NanoPlot-report.html
[Mon Oct 14 10:26:05 2024]
Finished job 20.
4 of 6 steps (67%) done
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: .snakemake/log/2024-10-14T102110.537983.snakemake.log
WorkflowError:
At least one job did not complete successfully.
The command NanoPlot -t 4 --fastq results/reads/hifi/hifi.fastq.gz --loglength -o results/nanoplot --plots dot --verbose >> logs/nanoplot.log 2>&1 works fine, but the error is still raised.
So, one thing you could try first is to run conda config --set channel_priority strict from the conda environment. I have never had an error with NanoPlot on any of the machines where I ran the workflow, so there might be an issue with how that specific environment was created.
Can you please go to /sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/ and run the command tree . so I can check the directory structure?
One more thing: could you check the NanoPlot log in the logs directory to see exactly what the error is?
However, if only NanoPlot is giving errors, you should have your assembly completed. NanoPlot is used to QC long reads.
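To confirm whether the channel priority setting mentioned above has taken effect, you could inspect conda's current configuration. This sketch assumes that conda config --show channel_priority prints a single line of the form "channel_priority: strict"; is_strict_priority is a hypothetical helper that reads that output from standard input.

```shell
# Hypothetical helper: read the output of
#   conda config --show channel_priority
# on standard input and succeed only if the priority is set to strict.
is_strict_priority() {
    grep -q '^channel_priority: strict$'
}
```

For example, conda config --show channel_priority | is_strict_priority && echo "strict priority is set" prints the message only once the setting is in place.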
here is the log of nanoplot
2024-10-14 10:44:37,453 NanoPlot 1.43.0 started with arguments Namespace(threads=20, verbose=True, store=False, raw=False, huge=False, outdir='results/nanoplot', no_static=False, prefix='', tsv_stats=False, only_report=False, info_in_report=False, maxlength=None, minlength=None, drop_outliers=False, downsample=None, loglength=True, percentqual=False, alength=False, minqual=None, runtime_until=None, readtype='1D', barcoded=False, no_supplementary=False, color='#4CB391', colormap='Greens', format=['png'], plots=['dot'], legacy=None, listcolors=False, listcolormaps=False, no_N50=False, N50=False, title=None, font_scale=1, dpi=100, hide_stats=False, fastq=['results/reads/hifi/hifi.fastq.gz'], fasta=None, fastq_rich=None, fastq_minimal=None, summary=None, bam=None, ubam=None, cram=None, pickle=None, feather=None, path='results/nanoplot/')
2024-10-14 10:44:37,453 Python version is: 3.12.5 | packaged by conda-forge | (main, Aug 8 2024, 18:36:51) [GCC 12.4.0]
2024-10-14 10:44:37,467 Nanoget: Starting to collect statistics from plain fastq file.
2024-10-14 10:44:37,467 Nanoget: Decompressing gzipped fastq results/reads/hifi/hifi.fastq.gz
2024-10-14 12:09:30,768 Reduced DataFrame memory usage from 72.1270637512207Mb to 72.1270637512207Mb
2024-10-14 12:09:31,070 Nanoget: Gathered all metrics of 4726911 reads
2024-10-14 12:09:35,157 Calculated statistics
2024-10-14 12:09:35,161 Using sequenced read lengths for plotting.
2024-10-14 12:09:35,399 Using log10 scaled read lengths.
2024-10-14 12:09:35,937 NanoPlot: Valid color #4CB391.
2024-10-14 12:09:35,937 NanoPlot: Valid colormap Greens.
2024-10-14 12:09:36,278 NanoPlot: Creating length plots for Read length.
2024-10-14 12:09:36,283 NanoPlot: Using 4726911 reads maximum of 68733bp.
2024-10-14 12:09:37,561 Saved results/nanoplot/WeightedHistogramReadlength as png (or png for --legacy)
2024-10-14 12:09:39,475 Saved results/nanoplot/WeightedLogTransformed_HistogramReadlength as png (or png for --legacy)
2024-10-14 12:09:40,736 Saved results/nanoplot/Non_weightedHistogramReadlength as png (or png for --legacy)
2024-10-14 12:09:42,438 Saved results/nanoplot/Non_weightedLogTransformed_HistogramReadlength as png (or png for --legacy)
2024-10-14 12:09:42,670 A global iterator flag was passed as a per-operand flag to the iterator constructor
Traceback (most recent call last):
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/nanoplot/NanoPlot.py", line 110, in main
plots = make_plots(datadf, settings)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/nanoplot/NanoPlot.py", line 166, in make_plots
nanoplotter.length_plots(
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/nanoplotter/nanoplotter_main.py", line 510, in length_plots
yield_by_minimal_length_plot(
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/nanoplotter/nanoplotter_main.py", line 559, in yield_by_minimal_length_plot
df["cumyield_gb"] = df["lengths"].cumsum() / 10**9
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/ops/common.py", line 76, in new_method
return method(self, other)
^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/arraylike.py", line 210, in __truediv__
return self._arith_method(other, operator.truediv)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/series.py", line 6135, in _arith_method
return base.IndexOpsMixin._arith_method(self, other, op)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/base.py", line 1382, in _arith_method
result = ops.arithmetic_op(lvalues, rvalues, op)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/ops/array_ops.py", line 283, in arithmetic_op
res_values = _na_arithmetic_op(left, right, op) # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/ops/array_ops.py", line 218, in _na_arithmetic_op
result = func(left, right)
^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/computation/expressions.py", line 242, in evaluate
return _evaluate(op, op_str, a, b) # type: ignore[misc]
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/computation/expressions.py", line 73, in _evaluate_standard
return op(a, b)
^^^^^^^^
ValueError: A global iterator flag was passed as a per-operand flag to the iterator constructor
If you read this then NanoPlot 1.43.0 has crashed :-(
Please try updating NanoPlot and see if that helps...
If not, please report this issue at https://github.com/wdecoster/NanoPlot/issues
If you could include the log file that would be really helpful.
Thanks!
Traceback (most recent call last):
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/bin/NanoPlot", line 10, in <module>
sys.exit(main())
^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/nanoplot/NanoPlot.py", line 110, in main
plots = make_plots(datadf, settings)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/nanoplot/NanoPlot.py", line 166, in make_plots
nanoplotter.length_plots(
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/nanoplotter/nanoplotter_main.py", line 510, in length_plots
yield_by_minimal_length_plot(
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/nanoplotter/nanoplotter_main.py", line 559, in yield_by_minimal_length_plot
df["cumyield_gb"] = df["lengths"].cumsum() / 10**9
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/ops/common.py", line 76, in new_method
return method(self, other)
^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/arraylike.py", line 210, in __truediv__
return self._arith_method(other, operator.truediv)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/series.py", line 6135, in _arith_method
return base.IndexOpsMixin._arith_method(self, other, op)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/base.py", line 1382, in _arith_method
result = ops.arithmetic_op(lvalues, rvalues, op)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/ops/array_ops.py", line 283, in arithmetic_op
res_values = _na_arithmetic_op(left, right, op) # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/ops/array_ops.py", line 218, in _na_arithmetic_op
result = func(left, right)
^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/computation/expressions.py", line 242, in evaluate
return _evaluate(op, op_str, a, b) # type: ignore[misc]
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_/lib/python3.12/site-packages/pandas/core/computation/expressions.py", line 73, in _evaluate_standard
return op(a, b)
^^^^^^^^
ValueError: A global iterator flag was passed as a per-operand flag to the iterator constructor
Yeah, it could be a problem with the environment!
You can try this:
rm -r /sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/.snakemake/conda/5d695729f424d26c27968862726d6580_*
conda config --set channel_priority strict
Then, to resume the workflow, go back to /sanhome2/Argania_assembly/ChAssembly24/Assembly_Colora/ and use the same snakemake command that you used to start the workflow originally. It should download NanoPlot again and create a new environment for that package. Let me know if you manage to complete it this way.
I set up my config file and ran:
snakemake --cores all --use-conda --configfile config_argania.yaml
Here is the error:
Config file config/config.yaml is extended by additional config specified via the command line.
ValueError in file https://raw.githubusercontent.com/LiaOb21/colora/Colora_v1.1.0/workflow/Snakefile, line 11:
not enough values to unpack (expected 1, got 0)
File "https://raw.githubusercontent.com/LiaOb21/colora/Colora_v1.1.0/workflow/Snakefile", line 11, in