Open davidecarlson opened 3 years ago
It looks like either an earlier step failed before the assembly step, or the assembly itself failed. Could you post your config.json file?
Thanks for the response. Here is my config.json file:
{
"draft_genome": {
"fa": "/datahome/oenothera/assembly/bionano_results_axel/cur_results_1297259/canu_bionano_scaffolds_and_contigs.fasta"
},
"raw_reads": [
{
"left": "/datahome/oenothera/genomic/Illumina_PE/elata/HI.0553.002.Index_7.johst_DNA_R1.fastq",
"right": "/datahome/oenothera/genomic/Illumina_PE/elata/HI.0553.002.Index_7.johst_DNA_R2.fastq"
},
{
"left": "/datahome/oenothera/genomic/Illumina_MP-NEW/elata_MP_nxtrim_R1.mp.fastq",
"right": "/datahome/oenothera/genomic/Illumina_MP-NEW/elata_MP_nxtrim_R2.mp.fastq"
}
],
"alignments": [
{
"bam": "/datahome/oenothera/assembly/bionano_results_axel/cur_results_1297259/gappadder/processed/elataMP.sorted.markdup.bam",
"is": "8178",
"std": "853"
},
{
"bam": "/datahome/oenothera/assembly/bionano_results_axel/cur_results_1297259/gappadder/processed/elataPE.sorted.markdup.bam",
"is": "282",
"std": "19"
}
],
"software_path": {
"bwa": "bwa",
"samtools": "samtools",
"velvet": "/home/progs/velvet",
"kmc": "kmc",
"TERefiner": "/home/progs/GAPPadder/TERefiner_1",
"ContigsMerger": "/home/progs/GAPPadder/ContigsMerger"
},
"parameters": {
"working_folder": "/datahome/oenothera/assembly/bionano_results_axel/cur_results_1297259/gappadder/results",
"min_gap_size": "50",
"flank_length": "300",
"nthreads": "40",
"verbose": "1"
},
"kmer_length": [{
"k": 30,
"k_velvet": [{
"k": 29
},
{
"k": 27
}]
},
{
"k": 40,
"k_velvet": [{
"k": 39
},
{
"k": 37
}]
},
{
"k": 50,
"k_velvet": [{
"k": 49
},
{
"k": 47
}]
}]
}
Let me know if you need any additional info. Thanks! Dave
The config looks good to me. Could you try changing velvet and kmc to the absolute path of the containing folder? For example:
"velvet": "/gpfs/scratchfs1/chc12015/tools/velvet-master/",
"kmc": "/gpfs/scratchfs1/chc12015/tools/kmc2.3/",
Thanks, Simon. I changed the kmc path in the config file to the absolute path of the folder that contains the kmc binary (the velvet path in the config was already the absolute path to the directory containing the velvet binaries). I then reran the Preprocess and Collect steps, which once again finished without producing any error messages.
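A quick way to confirm that every "software_path" entry resolves is sketched below. It is a minimal Python 3 check, not part of GAPPadder: the function name `check_software_paths` and the status strings are made up for illustration, and it assumes that a bare name such as "bwa" is meant to be found on PATH while an absolute value is either a folder or an executable file.

```python
import os
import shutil


def check_software_paths(config):
    """Return {tool_name: status} for each entry under "software_path".

    Hypothetical helper for illustration; GAPPadder itself does not ship it.
    """
    report = {}
    for name, value in config.get("software_path", {}).items():
        if os.path.isabs(value):
            if os.path.isdir(value):
                report[name] = "absolute directory"
            elif os.path.isfile(value) and os.access(value, os.X_OK):
                report[name] = "absolute executable"
            else:
                report[name] = "missing"
        elif shutil.which(value):
            # Bare command names are looked up on PATH.
            report[name] = "found on PATH"
        else:
            report[name] = "not found"
    return report
```

To use it, load the file with `json.load(open("config.json"))` and pass the resulting dictionary to the function; any entry reported as "missing" or "not found" is worth fixing before rerunning the steps.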
However, when I start the Assembly step it once again fails with the same error:
First round assembly and merger...
Start merging...
Traceback (most recent call last):
File "./main.py", line 283, in <module>
main_func(scommand,sfconfig)
File "./main.py", line 274, in main_func
gap_assembler.assemble_pipeline()
File "/home/progs/GAPPadder/assemble_gaps.py", line 339, in assemble_pipeline
id_remain=self.pick_already_constructed(contigs_select, fa_list, sf_picked)
File "/home/progs/GAPPadder/assemble_gaps.py", line 321, in pick_already_constructed
m_picked=contigs_select.get_already_picked(sf_picked)
File "/home/progs/GAPPadder/pick_contigs.py", line 576, in get_already_picked
with open(sf_picked) as fin_picked:
IOError: [Errno 2] No such file or directory: u'/datahome/oenothera/assembly/bionano_results_axel/cur_results_1297259/gappadder/results/merged/../picked_seqs.fa'
I should note that the "merged" directory in my results contains nothing but empty subdirectories:
ls -l merged
total 0
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 10:12 both_unmapped
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 13:40 empty_dir
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 10:12 gap_reads
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 13:32 gap_reads_alignment
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 10:12 gap_reads_for_alignment
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 10:12 gap_reads_high_quality
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 13:40 kmc_temp
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 13:40 kmers
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 13:40 temp
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 10:12 unmapped_reads
drwxrwxr-x. 1 davecarlson davecarlson 0 Dec 16 13:40 velvet_temp
Any other suggestions for things I should be changing? Thanks, Dave
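The path in the IOError, merged/../picked_seqs.fa, resolves to picked_seqs.fa in the parent of the merged directory, and the listing shows that every subdirectory is empty. A small Python 3 sketch for spotting this before starting the Assembly step follows; `empty_subdirs` is a hypothetical helper, not a GAPPadder function, and which directories are supposed to contain files after the Collect step is not stated in this thread.

```python
import os


def empty_subdirs(folder):
    """Return the sorted names of immediate subdirectories with no entries."""
    result = []
    for entry in sorted(os.listdir(folder)):
        path = os.path.join(folder, entry)
        if os.path.isdir(path) and not os.listdir(path):
            result.append(entry)
    return result
```

Running it on the merged directory after the Collect step would show at a glance whether any reads were collected, instead of finding out from a missing-file error in the next step.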
Could you check whether /home/progs/GAPPadder/ContigsMerger and /home/progs/GAPPadder/TERefiner_1 run properly? Did you compile them yourself, or are you using the bundled binaries directly? On some machines they need to be recompiled.
Hi Simon,
I used the versions bundled with GAPPadder. It's a little hard to say if they're working properly. Here is the output for ContigsMerger:
Arrange error! 0 6
The output for TERefiner_1:
Please check parameters setting!
Are these the expected outputs when the programs are run with no input?
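Whether a bundled binary can start at all on a given machine can be checked separately from what it prints. The Python 3 sketch below is an assumption-laden illustration: `probe_binary` is a made-up name, and the thread does not say which exit code or message ContigsMerger or TERefiner_1 should produce without arguments, so it only reports what happened.

```python
import os
import subprocess


def probe_binary(path, args=(), timeout=30):
    """Try to launch a binary and report whether it started.

    Hypothetical helper; it does not judge whether the output is correct.
    """
    info = {
        "exists": os.path.isfile(path),
        "executable": False,
        "returncode": None,
        "output": "",
    }
    if not info["exists"]:
        return info
    info["executable"] = os.access(path, os.X_OK)
    if not info["executable"]:
        return info
    try:
        proc = subprocess.run(
            [path, *args],
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
            universal_newlines=True,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # OSError covers cases such as a binary built for another platform.
        info["output"] = "could not run: {}".format(exc)
        return info
    info["returncode"] = proc.returncode
    info["output"] = proc.stdout.strip()
    return info
```

A binary that prints a usage or parameter message has at least loaded and run; one that fails with an OSError or a missing shared library message is a stronger hint that it needs to be recompiled.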
Hi Simon,
I have tried to use GAPPadder and I am getting exactly the same issues (program not finishing and outputting empty directories) as brought up by Dave earlier in this ticket.
Below is the information needed to trace the problem:
#!/bin/bash
#SBATCH --mail-type=end,fail
#SBATCH --job-name="gap"
#SBATCH --nodes=1
#SBATCH --cpus-per-task=12
#SBATCH --time=12:00:00
#SBATCH --mem=32G
#SBATCH --partition=pall
#SBATCH --output=gap_%j.out
#SBATCH --error=gap_%j.err
module add UHTS/Aligner/bwa/0.7.17
module add UHTS/Analysis/samtools/1.10
module add UHTS/Assembler/velvet/1.2.10
# Preprocess the draft genome to get the gap positions and flank regions
python main.py -c Preprocess -g configuration.json
# Collect reads for each gap
python main.py -c Collect -g configuration.json
# Construct the gap sequence and pick the best one:
python main.py -c Assembly -g configuration.json
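The batch script above runs the three steps back to back without checking their exit status, so Assembly still starts when an earlier step has already failed. A minimal Python 3 sketch of a driver that stops at the first failing step is shown below; `run_steps` and `GAPPADDER_STEPS` are hypothetical names, and the commands are copied from the script above.

```python
import subprocess

# The three commands from the batch script, in order.
GAPPADDER_STEPS = [
    ("Preprocess", ["python", "main.py", "-c", "Preprocess", "-g", "configuration.json"]),
    ("Collect", ["python", "main.py", "-c", "Collect", "-g", "configuration.json"]),
    ("Assembly", ["python", "main.py", "-c", "Assembly", "-g", "configuration.json"]),
]


def run_steps(steps):
    """Run (name, argv) pairs in order.

    Returns the name of the first step that exits with a non-zero status,
    or None if every step succeeded. Later steps are not started.
    """
    for name, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            return name
    return None
```

Calling `run_steps(GAPPADDER_STEPS)` from the GAPPadder folder would then name the step that failed first, which makes it easier to tell a Collect problem from an Assembly problem.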
samtools view /path/2/align.bam "draft_name" | python collect_reads_for_gaps.py /path/2/gap_positions.txt 30 /path/2/1_is300/ 300 50 250 -
samtools view /path/2/align.bam "draft_name" | python collect_discordant_low_mapq_reads.py /path/2/1_is300/ -
First round assembly and merger...
Start merging...
Traceback (most recent call last):
File "main.py", line 283, in <module>
main_func(scommand,sfconfig)
File "main.py", line 257, in main_func
drc.merge_dispatch_reads_for_gaps_v2(left_reads, right_reads)
File "/path/2/run_multi_threads_discordant.py", line 213, in merge_dispatch_reads_for_gaps_v2
temp_field=id_fields[0].split("/")
IndexError: list index out of range
Traceback (most recent call last):
File "main.py", line 283, in <module>
main_func(scommand,sfconfig)
File "main.py", line 274, in main_func
gap_assembler.assemble_pipeline()
File "/path/2/assemble_gaps.py", line 339, in assemble_pipeline
id_remain=self.pick_already_constructed(contigs_select, fa_list, sf_picked)
File "/path/2/assemble_gaps.py", line 321, in pick_already_constructed
m_picked=contigs_select.get_already_picked(sf_picked)
File "/path/2/pick_contigs.py", line 576, in get_already_picked
with open(sf_picked) as fin_picked:
IOError: [Errno 2] No such file or directory: u'/path/2/merged/../picked_seqs.fa'
{
"draft_genome": {
"fa": "/path/2/draft.fasta"
},
"raw_reads": [
{
"left": "/path/2/reads_1.fastq.gz",
"right": "/path/2/reads_2.fastq.gz"
}
],
"alignments": [
{
"bam": "/path/2/align.bam",
"is": "300",
"std": "50"
}
],
"software_path": {
"bwa": "bwa",
"samtools": "samtools",
"velvet": "velvet",
"kmc": "/path/2/KMC/bin/",
"TERefiner": "/path/2/TERefiner_1",
"ContigsMerger": "/path/2/ContigsMerger"
},
"parameters": {
"working_folder": "/path/2/dir",
"min_gap_size": "2",
"flank_length": "300",
"nthreads": "12",
"verbose": "1"
},
"kmer_length": [{
"k": 30,
"k_velvet": [{
"k": 29
},
{
"k": 27
}]
},
{
"k": 40,
"k_velvet": [{
"k": 39
},
{
"k": 37
}]
},
{
"k": 50,
"k_velvet": [{
"k": 49
},
{
"k": 47
}]
}]
}
Would you have an idea of what is happening?
Best
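The IndexError is raised at `id_fields[0]`, which means `id_fields` was an empty list at that point. In Python, splitting an empty or whitespace-only line with `split()` returns an empty list, so one possible cause is an empty line where a read ID was expected. How GAPPadder actually builds `id_fields`, and whether it accepts gzip-compressed FASTQ such as reads_1.fastq.gz, is not shown in this thread, so the Python 3 sketch below is only a diagnostic under those assumptions. The names `is_gzipped`, `open_fastq`, and `suspicious_headers` are made up for illustration.

```python
import gzip


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == b"\x1f\x8b"


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    if is_gzipped(path):
        return gzip.open(path, "rt")
    return open(path, "r")


def suspicious_headers(path, limit=10):
    """Return line numbers of header lines that are empty or lack '@'.

    Assumes the standard four-line FASTQ record layout, so headers are
    expected on lines 1, 5, 9, and so on.
    """
    bad = []
    with open_fastq(path) as fh:
        for lineno, line in enumerate(fh, start=1):
            if lineno % 4 != 1:
                continue
            fields = line.split()
            if not fields or not fields[0].startswith("@"):
                bad.append(lineno)
                if len(bad) >= limit:
                    break
    return bad
```

An empty result does not prove the reads are fine for GAPPadder, but a non-empty one points at concrete lines to inspect before rerunning the Collect step.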
Hello Simon,
I have tried to use GAPPadder as well and I have the same issues (program not finishing and output directories empty) as mentioned above. I have tried to recompile ContigsMerger and TERefiner_1, but it didn't change anything.
Do you have any idea what is going wrong?
Thanks, Anne
I ran the Preprocess and Collect steps according to the README with no apparent errors. However, it seems that some expected output was not produced, because when I run the Assembly step I get the following error:
Here are the commands that I ran:
Any ideas what could be going wrong? Thanks! Dave