A-J-F-Mackintosh closed this issue 3 years ago
Hi Alex,
please try removing all dots (.) from the name of the assembly file (brenthis_ino.SP_BI_364.v1_1.contigs.fasta). The dots are used as a separator in the hidden .anno and .data files and may confuse the workflow.
I will see if I can fix the workflow so it works with dots in the FASTA file names.
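Until then, the renaming can be scripted. The following is only a minimal sketch and not part of DENTIST: the helpers `dot_free_name` and `link_without_dots` are hypothetical names, and it assumes that replacing every dot with an underscore (including the one before the extension) and pointing the workflow at a symlink is acceptable in your setup.

```python
import tempfile
from pathlib import Path


def dot_free_name(filename: str, replacement: str = "_") -> str:
    """Return `filename` with every dot replaced (hypothetical helper)."""
    return filename.replace(".", replacement)


def link_without_dots(source: Path, target_dir: Path) -> Path:
    """Create a symlink to `source` inside `target_dir` under a dot-free name."""
    link = target_dir / dot_free_name(source.name)
    # Replace a stale link from an earlier attempt, if there is one.
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(source.resolve())
    return link


if __name__ == "__main__":
    # Demonstration on a throwaway directory; adapt the paths to your data.
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        fasta = workdir / "brenthis_ino.SP_BI_364.v1_1.contigs.fasta"
        fasta.write_text(">contig1\nACGT\n")
        link = link_without_dots(fasta, workdir)
        print(link.name)
```

The resulting link name can then be used as the `reference` entry in snakemake.yml.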
Should you have more issues, you may try running the small example before continuing with your real data. If you need more help, please do not hesitate to ask here.
-- Arne
Hi Arne,
Many thanks for the speedy reply.
I ran the example and it finished without any problems.
I then changed the paths (which are symlinks) in the snakemake.yml file so that they do not contain any dots.
inputs:
    # The reference assembly where gaps should be closed
    reference: brenthis_ino_assembly
    # The set of long reads used for gap closing
    reads: brenthis_ino_reads
    # Type of reads. Use `PACBIO_SMRT` or `OXFORD_NANOPORE`. See README for
    # more details on the subject.
    reads_type: PACBIO_SMRT
outputs:
    # The gap-closed reference assembly
    output_assembly: brenthis_ino_dentist_assembly
This produced a new error message from snakemake.
[Mon Mar 8 21:33:04 2021]
Error in rule reference2dam:
jobid: 2
output: /scratch/amackintosh/DENTIST_02/brenthis_ino_assembly.dam,
/scratch/amackintosh/DENTIST_02/.brenthis_ino_assembly.bps, /scratch/amackintosh/DENTIST_02/.brenthis_ino_assembly.hdr,
/scratch/amackintosh/DENTIST_02/.brenthis_ino_assembly.idx
shell:
fasta2DAM /scratch/amackintosh/DENTIST_02/brenthis_ino_assembly.dam brenthis_ino_assembly && DBsplit -x1000 -a
-s200 /scratch/amackintosh/DENTIST_02/brenthis_ino_assembly.dam
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
[Mon Mar 8 21:33:04 2021]
Error in rule reads2db:
jobid: 8
output: /scratch/amackintosh/DENTIST_02/brenthis_ino_reads.db,
/scratch/amackintosh/DENTIST_02/.brenthis_ino_reads.bps, /scratch/amackintosh/DENTIST_02/.brenthis_ino_reads.idx
shell:
fasta2DB /scratch/amackintosh/DENTIST_02/brenthis_ino_reads.db brenthis_ino_reads && DBsplit -x1000 -a
-s200 /scratch/amackintosh/DENTIST_02/brenthis_ino_reads.db
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /scratch/amackintosh/DENTIST_02/.snakemake/log/2021-03-08T213253.788128.snakemake.log
As I said before, I do not understand snakemake very well, so am not sure exactly what the error message means. A problem with fasta2DAM and fasta2DB?
Best,
Alex
Hmm, there is no specific error message in the log. Probably it was issued a bit earlier.
The easiest way of fixing things will probably be to rm -rf workdir. At this point we don't lose anything substantial.
Hi,
I managed to fix the above issues. One problem was that the sequences in the assembly.fasta must be multi-line rather than single-line. The other problem was that the gzipped reads cannot be read without .gz in the file name, but the .gz causes issues because of the extra dot, so I had to unzip them.
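In case it helps others, both fixes can be sketched in a few lines of Python. This is an illustration under assumptions rather than the exact commands I ran: `wrap_fasta` and `gunzip_to` are hypothetical helper names, and the line width of 80 is just a common convention, not a value taken from the dentist documentation.

```python
import gzip
import shutil
from pathlib import Path


def wrap_fasta(src: Path, dst: Path, width: int = 80) -> None:
    """Copy `src` to `dst`, splitting sequence lines longer than `width`."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            line = line.rstrip("\r\n")
            if not line:
                continue  # drop blank lines
            if line.startswith(">"):
                fout.write(line + "\n")  # header lines are kept as-is
            else:
                for start in range(0, len(line), width):
                    fout.write(line[start:start + width] + "\n")


def gunzip_to(src_gz: Path, dst: Path) -> None:
    """Decompress `src_gz` into `dst`, e.g. a file name without dots."""
    with gzip.open(src_gz, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
```

For example, `wrap_fasta(Path("assembly.fasta"), Path("brenthis_ino_assembly"))` would write a wrapped copy under a dot-free name; the file names here are placeholders.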
I have now managed to run dentist successfully with a small subset of the reads (<1%).
I then tried to run dentist with the whole read set; however, this causes damapper to error. This error persists when using either docker://aludi/dentist:v1.0.1 or docker://aludi/dentist:stable.
Error in rule ref_vs_reads_alignment_block:
jobid: 415
output: /scratch/amackintosh/DENTIST_02/brenthis_ino_assembly_wrapped.brenthis_ino_reads.114.las,
/scratch/amackintosh/DENTIST_02/brenthis_ino_reads.114.brenthis_ino_assembly_wrapped.las
log: /scratch/amackintosh/DENTIST_02/ref-vs-reads-alignment.114.log (check log file(s) for error message)
shell:
{
cd /scratch/amackintosh/DENTIST_02/
damapper -C '-T32' -e0.7 -mdust -mdentist-self -mtan brenthis_ino_assembly_wrapped brenthis_ino_reads.114
LAcheck -v brenthis_ino_assembly_wrapped brenthis_ino_reads
brenthis_ino_assembly_wrapped.brenthis_ino_reads.114.las || { echo 'Check failed. Possible solutions:
Duplicate LAs: can be fixed by LAsort from 2020-03-22 or later.
In order to ignore checks entirely you may use the environment variable SKIP_LACHECK=1. Use only if you are positive the
files are in fact OK!'; (( ${SKIP_LACHECK:-0} != 0 )); }
LAcheck -v brenthis_ino_reads brenthis_ino_assembly_wrapped brenthis_ino_reads.114.brenthis_ino_assembly_wrapped.las ||
{ echo 'Check failed. Possible solutions:
Duplicate LAs: can be fixed by LAsort from 2020-03-22 or later.
In order to ignore checks entirely you may use the environment variable SKIP_LACHECK=1. Use only if you are positive the
files are in fact OK!'; (( ${SKIP_LACHECK:-0} != 0 )); }
} &> /scratch/amackintosh/DENTIST_02/ref-vs-reads-alignment.114.log
(one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
It looks like I could either change my damapper/LAsort version (not sure how), or pass the environment variable SKIP_LACHECK=1 (through the snakemake.yml?). What would you recommend?
Best,
Alex
I am glad that you could solve the issues. I will definitely try to build some checks and better handling for gzipped input files.
Regarding your last error: please try with SKIP_LACHECK=1 by passing it like this:
SKIP_LACHECK=1 snakemake --configfile=snakemake.yml --use-singularity --cores=32
It is most likely the source of the error, even though I cannot tell for sure because the log does not contain the error message itself but just the command that would issue the message. I will also try to remove the error message from the shell command so as to avoid confusion.
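If you launch the workflow from a script, the same idea can be sketched in Python. This is a rough illustration, not something shipped with DENTIST: `snakemake_invocation`, `run_workflow` and `lacheck_gate` are hypothetical names. The gate function only models how I read the shell fragment in your error message (`LAcheck ... || { echo ...; (( ${SKIP_LACHECK:-0} != 0 )); }`) for integer values of the variable.

```python
import os
import subprocess


def snakemake_invocation(configfile="snakemake.yml", cores=32, skip_lacheck=True):
    """Build the command line and environment without running anything."""
    cmd = [
        "snakemake",
        f"--configfile={configfile}",
        "--use-singularity",
        f"--cores={cores}",
    ]
    env = dict(os.environ)
    if skip_lacheck:
        env["SKIP_LACHECK"] = "1"
    return cmd, env


def run_workflow(**kwargs):
    """Launch snakemake with the environment built above."""
    cmd, env = snakemake_invocation(**kwargs)
    subprocess.run(cmd, env=env, check=True)


def lacheck_gate(check_passed: bool, environ: dict) -> bool:
    """Model of the shell gate: the step succeeds if the check passes
    or if SKIP_LACHECK is set to a non-zero integer."""
    skip = int(environ.get("SKIP_LACHECK") or 0)
    return check_passed or skip != 0
```

With the variable unset, a failed check fails the job; with `SKIP_LACHECK=1`, the failure is reported but the job carries on.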
Hi,
SKIP_LACHECK=1 allowed the analysis to complete without any problems, many thanks.
I will now start playing around with parameters to see how dentist can improve my assembly!
Thanks again for all your help,
Alex
Hi,
I am trying to use dentist for the first time but am having some trouble getting started. I am running dentist using singularity and have snakemake version 6.0.0 installed.
I downloaded the dentist.json and snakemake.yml files and edited them to include the relevant paths and also some options mentioned in the README (see below).
I first tried to validate the config files using the recommended command.
snakemake --configfile=snakemake.yml --use-singularity --cores=32 -f -- validate_dentist_config
All seemed to work fine, so I then tried to run it.
snakemake --configfile=snakemake.yml --use-singularity --cores=32
I am not used to using snakemake, but I assume the missing input files are because a preceding process could not be executed. Is it possible that the problem lies in how I filled out the json and yaml files? The part of the json I edited the most looks like this (below); could any of these options be causing problems?
Any help would be really appreciated.
Best,
Alex