Closed: gaworj closed 4 weeks ago
Hi Jan!
Hard to tell from looking at this. It could be an error with the script get_pbp_genes_from_contigs.py not generating the bed file in the first place, which Nextflow tends to ignore.
Do you still have the work directory? If you do, could you print the content of the file work/51/92c5c7*/.command.log?
Best, Vicky
Hi,
Here is the output from another problematic file (exactly the same issue - no bed file created):
(nextflow) jang@jang-MS-7B18:~/data_SSD2/genome_analysis/Weronika/GBS-Typer-sanger-nf/GBS-Typer-sanger-nf$ cat work/d1/15055fa3b53d92d5aaf81422a894e4/.command.log
Building a new DB, current time: 08/24/2022 15:59:26
New DB name: /mnt/SSD2/genome_analysis/Weronika/GBS-Typer-sanger-nf/GBS-Typer-sanger-nf/work/d1/15055fa3b53d92d5aaf81422a894e4/NIL12_contig_blastdb
New DB title: NIL12.fasta
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 105 sequences in 0.0262592 seconds.
mv: cannot stat 'NIL12*bed': No such file or directory
Bests, Jan
It's likely because no PBP genes were detected in your contigs. The expected behaviour of the pipeline in that case is to produce no output, without errors, and then skip the next stage: get_pbp_alleles.
I included a clean-up stage to remove some intermediate files in the last release, but didn't test it fully in get_pbp_genes. It will be a quick fix.
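The failure mode described here (a clean-up step that errors out when no bed file exists) can be illustrated with a minimal Python sketch. This is not the pipeline's actual fix; `keep_only` is a hypothetical helper, and the choice to leave hidden files and symlinks alone is an assumption made for the example. The point is that a glob with no matches yields an empty list instead of a non-zero exit status, which is what `mv` produces.

```python
import glob
import os
import tempfile


def keep_only(patterns, workdir="."):
    """Delete regular files in workdir except those matching any pattern.

    Returns the sorted names of the files kept. A pattern that matches
    nothing (for example, no bed file was produced) is not an error.
    """
    keep = set()
    for pattern in patterns:
        # glob.glob returns [] when nothing matches, unlike `mv`, which fails.
        for path in glob.glob(os.path.join(workdir, pattern)):
            keep.add(os.path.basename(path))
    for name in os.listdir(workdir):
        path = os.path.join(workdir, name)
        # Assumption for this sketch: hidden files, symlinks and
        # directories are left untouched.
        if name in keep or name.startswith("."):
            continue
        if os.path.islink(path) or not os.path.isfile(path):
            continue
        os.remove(path)
    return sorted(keep)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        # Simulate a task directory in which no bed file was created.
        for name in ("NIL12.fasta", "NIL12_blast_blactam.out"):
            open(os.path.join(d, name), "w").close()
        print(keep_only(["NIL12_*bed", "NIL12.fasta"], d))
```

With no bed file present, the call simply keeps the FASTA file and removes the intermediate BLAST output, so the task can finish with exit status 0.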
This should be fixed now. Please do a git fetch and then a git pull.
If you get another error, let me know. I will keep this issue open until you run it successfully.
Looks like the problem is partially solved. Could you add a message to the pipeline output reporting that no genes were found? I have to go through my dataset and check for PBPs.
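A report of this kind could be derived from the BLAST tabular output, sketched below under stated assumptions. The column order is the default for `-outfmt 6`. How get_pbp_genes_from_contigs.py actually applies its two thresholds is not shown in this thread, so the definitions here (identity as `pident / 100`, alignment fraction as alignment length over query length) are assumptions, and `passing_hits`, `report`, and the gene names and lengths in the demo are hypothetical.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
                 "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def passing_hits(blast_handle, query_lengths,
                 min_frac_identity=0.5, min_frac_align_len=0.5):
    """Return rows that meet both thresholds (assumed definitions)."""
    hits = []
    reader = csv.DictReader(blast_handle, fieldnames=BLAST_COLUMNS, delimiter="\t")
    for row in reader:
        frac_identity = float(row["pident"]) / 100.0
        frac_align_len = int(row["length"]) / query_lengths[row["qseqid"]]
        if frac_identity >= min_frac_identity and frac_align_len >= min_frac_align_len:
            hits.append(row)
    return hits


def report(sample, hits):
    """Build a one-line summary suitable for a run log."""
    if not hits:
        return f"{sample}: no PBP genes found in contigs; skipping allele typing"
    genes = sorted({hit["qseqid"] for hit in hits})
    return f"{sample}: PBP hits found for {', '.join(genes)}"


if __name__ == "__main__":
    # Hypothetical query lengths; real values would come from the query FASTA.
    lengths = {"geneA": 1000}
    empty_blast_output = io.StringIO("")
    print(report("NIL12", passing_hits(empty_blast_output, lengths)))
```

An empty BLAST output file then produces an explicit "no PBP genes found" line instead of a silent skip.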
Sure. I will look into this when I'm back from leave. I guess with Strep pneumo that might be more typical than with Group B Strep. (Note that the PBP genes used are from a GBS reference database, but probably the same as what you're looking for anyway https://github.com/BenJamesMetcalf/GBS_Scripts_Reference/tree/master/GBS_Reference_DB)
Hi,
I recently ran another test with a sample that has already been published and that we are sure contains PBP genes:
nextflow run main.nf --reads 'data/*_{trim_R1,trim_R2}.fastq.gz' --output 'ERR4991741_pbp' --run_pbptyper --contigs 'data/ERR4991741_assembly.fa'
N E X T F L O W ~ version 21.04.1
Launching main.nf [magical_noyce] - revision: 90e74631b4
executor > local (5)
[70/a2f270] process > serotyping (1) [ 0%] 0 of 1
executor > local (5)
[- ] process > serotyping (1) -
[57/3cf914] process > GBS_RES:split_target_RES_sequences [100%] 1 of 1 ✔
[b3/5ae14e] process > GBS_RES:srst2_for_res_typing (1) [100%] 1 of 1, failed: 1 ✘
[- ] process > GBS_RES:split_target_RES_seq_from_sam_file -
[- ] process > GBS_RES:freebayes -
[- ] process > OTHER_RES:srst2_for_res_typing (1) -
[- ] process > res_typer -
[- ] process > finalise_sero_res_results -
[3d/ec134e] process > get_pbp_genes (1) [100%] 1 of 1 ✔
[- ] process > PBP1A:get_pbp_alleles -
[- ] process > PBP1A:finalise_pbp_existing_allele_results -
[- ] process > PBP2B:get_pbp_alleles -
[- ] process > PBP2B:finalise_pbp_existing_allele_results -
[- ] process > PBP2X:get_pbp_alleles -
[- ] process > PBP2X:finalise_pbp_existing_allele_results -
Error executing process > 'GBS_RES:srst2_for_res_typing (1)'
Caused by:
Process GBS_RES:srst2_for_res_typing (1) terminated with an error exit status (1)
Command executed:
srst2 --samtools_args '-A' --input_pe ERR4991741_trim_R1.fastq.gz ERR4991741_trim_R2.fastq.gz --output ERR4991741 --log --save_scores --min_coverage 99.9 --max_divergence 5 --gene_db GBS_Res_Gene-DB_Final.fasta
touch ERR4991741fullgenesGBS_Res_Gene-DB_Final__results.txt
mkdir output
mv ERR4991741*.bam output
mv ERR4991741fullgenesGBS_Res_Gene-DB_Final__results.txt output
find . -maxdepth 1 -type f -delete
unlink ERR4991741_trim_R1.fastq.gz
unlink ERR4991741_trim_R2.fastq.gz
unlink GBS_Res_Gene-DB_Final.fasta
mv output/ERR4991741*.bam .
mv output/ERR4991741fullgenesGBS_Res_Gene-DB_Final__results.txt .
rm -d output
Command exit status: 1
Command output: (empty)
Command error: mv: cannot stat 'ERR4991741*.bam': No such file or directory
Work dir: /mnt/SSD2/genome_analysis/Weronika/GBS-Typer-sanger-nf/work/b3/5ae14e6223c19868b8cc94e49a2703
Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out
Bests, Jan
Actually, the get_pbp_genes part of the pipeline succeeded; this time srst2_for_res_typing failed (which is unrelated to the PBP-specific workflow).
Could you share the output of /mnt/SSD2/genome_analysis/Weronika/GBS-Typer-sanger-nf/work/b3/5ae14e6223c19868b8cc94e49a2703/.command.out please?
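For collecting this kind of information in one go, a small Python sketch can print whichever of Nextflow's per-task files are present in a work directory. The file names `.command.sh`, `.command.out`, `.command.err`, `.command.log` and `.exitcode` are the ones Nextflow writes into each task directory; `read_task_logs` itself is a hypothetical helper, not part of the pipeline.

```python
import sys
from pathlib import Path

TASK_FILES = (".command.sh", ".command.out", ".command.err",
              ".command.log", ".exitcode")


def read_task_logs(work_dir, names=TASK_FILES):
    """Return {file name: text} for the task files that exist in work_dir."""
    logs = {}
    for name in names:
        path = Path(work_dir) / name
        if path.is_file():
            # errors="replace" avoids crashing on odd bytes in tool output.
            logs[name] = path.read_text(errors="replace")
    return logs


if __name__ == "__main__":
    # Pass one or more task work directories on the command line.
    for work_dir in sys.argv[1:]:
        for name, text in read_task_logs(work_dir).items():
            print(f"===== {work_dir}/{name} =====")
            print(text)
```

Saved under a name of your choice and run with the failing task's work directory as its argument, it prints the script that was executed together with its stdout, stderr and exit code.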
Hello,
I have successfully installed the pipeline but ran into problems when analysing my own samples. Fortunately, I was able to run the analysis on the sample data located at /GBS-Typer-sanger-nf/tree/main/tests/regression_test_data/input_data/
Are there any specific recommendations regarding sample input, such as file extension or multi-FASTA format?
I have tried changing the FASTA extension to .fas or .fa. My S. pneumoniae genomes were assembled using SPAdes.
Here is the terminal output for one of my samples:
(nextflow) jang@jang-MS-7B18:~/data_SSD2/genome_analysis/GBS-Typer-sanger-nf/GBS-Typer-sanger-nf$ nextflow run main.nf --output 'NIL12' --run_sero_res false --run_pbptyper --contigs 'good_final/NIL12.fasta'
N E X T F L O W ~ version 22.04.3
Launching main.nf [suspicious_fourier] DSL2 - revision: 90e74631b4
executor > local (1)
[51/92c5c7] process > get_pbp_genes (1) [100%] 1 of 1, failed: 1 ✘
[- ] process > PBP1A:get_pbp_alleles -
[- ] process > PBP1A:finalise_pbp_existing_allele_results -
[- ] process > PBP2B:get_pbp_alleles -
[- ] process > PBP2B:finalise_pbp_existing_allele_results -
[- ] process > PBP2X:get_pbp_alleles -
[- ] process > PBP2X:finalise_pbp_existing_allele_results -
Error executing process > 'get_pbp_genes (1)'
Caused by:
Process get_pbp_genes (1) terminated with an error exit status (1)
Command executed:
Build a blast reference database from the assmeblies
makeblastdb -in NIL12.fasta -dbtype nucl -out NIL12_contig_blast_db
Blast the blactam database against the blast reference database
blastn -db NIL12_contig_blast_db -query GBS_bLactam_Ref.fasta -outfmt 6 -word_size 7 -out NIL12_blast_blactam.out
Get BED file of PBP fragments
get_pbp_genes_from_contigs.py --blast_out_file NIL12_blast_blactam.out --query_fasta GBS_bLactam_Ref.fasta --frac_align_len_threshold 0.5 --frac_identity_threshold 0.5 --outputprefix NIL12
Clean directory
mkdir output
mv NIL12_*bed output
mv NIL12.fasta output
find . -maxdepth 1 -type f -delete
unlink GBS_bLactam_Ref.fasta
mv output/NIL12_*bed .
mv output/NIL12.fasta .
rm -d output
Command exit status: 1
Command output:
Building a new DB, current time: 08/23/2022 13:36:40
New DB name: NIL12_contig_blast_db
New DB title: NIL12.fasta
Sequence type: Nucleotide
Keep MBits: T
Maximum file size: 1000000000B
Adding sequences from FASTA; added 105 sequences in 0.018683 seconds.
Command error: mv: cannot stat 'NIL12_*bed': No such file or directory
Work dir: /mnt/SSD2/genome_analysis/GBS-Typer-sanger-nf/GBS-Typer-sanger-nf/work/51/92c5c7826698a69fe6d7ecdf252785
Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line
It looks like the bed file is not created. Does it mean that my sample does not contain PBP and the pipeline crashes?
Any hints?
Bests, Jan