Closed: hkunerth closed this issue 7 months ago
@hkunerth can you check the contents of 230029460_WB.filtered.report to ensure that it isn't empty? Also, check that the reference FASTA files during DOWNLOAD_ASSEMBLY are being made.
I don't believe it's an issue with Top Hits but the way in which the fastq files are being registered in the overall pipeline.
The 230029460_WB.filtered.report exists and looks normal. DOWNLOAD_ASSEMBLY successfully downloaded for the other sample, 230029461, but never initiated for 230029460.
I can send you the nextflow report if it would be helpful at all. Happy to keep digging to try to solve this.
Sure, can you pass the report privately if possible? I'm curious whether there is some underlying issue with how the system might be parsing some of the taxa downstream.
Hi @Merritt-Brian, I've come back to this and I think the issue is simpler than what I'd initially thought. I believe it's centered on the Samplesheet format. I had been using an older format, with the following header row:
sample,single_end,from,platform,barcode,fastq_1,fastq_2,sequencing_summary,trim
I tested some Illumina samples using the example format:
sample,platform,fastq_1,fastq_2,sequencing_summary,trim
and no longer ran into the issue. I think this is due to the older fields being put into the META variable during the samplesheet_check process and passed along to later processes which have issues with too many or too few arguments.
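A quick way to compare a samplesheet header against the example format is sketched below. This is a standalone pre-check, not the pipeline's own samplesheet_check process; the function name `check_header` and the in-memory sheet are illustrative assumptions, and the expected columns are simply the example header quoted above.

```python
import csv
import io

# Expected columns, taken from the example header quoted in this thread.
EXPECTED = ["sample", "platform", "fastq_1", "fastq_2", "sequencing_summary", "trim"]


def check_header(handle):
    """Return (missing, unexpected) column names for a samplesheet file handle."""
    header = next(csv.reader(handle), [])
    header = [name.strip() for name in header]
    missing = [name for name in EXPECTED if name not in header]
    unexpected = [name for name in header if name not in EXPECTED]
    return missing, unexpected


# Hypothetical in-memory samplesheet using the older header from this thread.
old_sheet = io.StringIO(
    "sample,single_end,from,platform,barcode,fastq_1,fastq_2,sequencing_summary,trim\n"
)
missing, unexpected = check_header(old_sheet)
print("missing:", missing)
print("unexpected:", unexpected)
```

Run against the older header, this reports no missing columns and flags single_end, from, and barcode as unexpected.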
That said, I still hit a snag when running ONT data using the updated samplesheet format. I kept the format the same, except for changing platform to OXFORD and not having a fastq_2. This generated the old error:
Nov-29 09:38:20.942 [Actor Thread 9] DEBUG nextflow.Session - Session aborted -- Cause: No signature of method: Script_b7c7d140$_runScript_closure1$_closure3$_closure24.call() is applicable for argument types: (ArrayList) values: [[[id:Specimen-5-NP-RNA, single_end:true, platform:OXFORD, fastq_1:/home/mdh/shared/taxtriage/231129_test/samples/Specimen-5-NP-RNA.dehosted.fastq.gz, ...], ...]] Possible solutions: any(), any(), any(groovy.lang.Closure), each(groovy.lang.Closure), tap(groovy.lang.Closure), any(groovy.lang.Closure)
Nov-29 09:38:20.970 [Actor Thread 9] DEBUG nextflow.Session - The following nodes are still active:
It looks to me like there should be more arguments in that array. The successful Illumina run populates it with
[id:230028305_CSF, single_end:false, platform:ILLUMINA, fastq_1:/home/mdh/shared/taxtriage/231129_test/samples_illu/230028305.Illumina.kraken.dehosted.1.fastq.gz, fastq_2:/home/mdh/shared/taxtriage/231129_test/samples_illu/230028305.Illumina.kraken.dehosted.2.fastq.gz, trim:true, directory:false, sequencing_summary:null]
but for some reason it breaks for ONT after fastq_1.
My sample sheet just leaves the fastq_2 field blank, as it is in the ONT example here: https://github.com/jhuapl-bio/taxtriage/blob/main/examples/Samplesheet.csv but I'm wondering if this might be causing issues with the META field.
Thoughts?
Never mind, I ran some newly generated Illumina data and ran into the same issue. It looks to be the same as this issue, https://github.com/jhuapl-bio/taxtriage/issues/45, raised by @erinyoung.
Disregard my above message, but any help with this would be much appreciated. Thanks!
@hkunerth can you provide your .nextflow.log file as well as the execution report (html) here for the issue reported in #45?
Here's a log from a failed run with this issue. .nextflow.log
I'm wondering if it is possible that the database that this run is using might be the cause of it. I changed a number of things in more recent runs but one was making sure it is pointed at the standard database (k2_standard_20230605) and I haven't run into this issue since then.
Thanks for the help.
Ah, found the syntax error in your command:
--top_per_taxa = '"10239:20:S' '2:20:S"'
should be: --top_per_taxa "10239:20:S 2:20:S"
That is, no single quotes and no equals sign. You can also see that on startup the value for top_per_taxa is reported as "=" rather than "10239:20:S 2:20:S".
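To see why the value ends up as "=", it can help to look at how a POSIX-style shell splits the two spellings into arguments. The sketch below uses Python's `shlex.split`, which mimics that tokenization; it only shows the word splitting and makes no claim about how Nextflow then interprets the resulting arguments.

```python
import shlex

# The two spellings of the option quoted above, as they would be typed in a shell.
broken = """--top_per_taxa = '"10239:20:S' '2:20:S"'"""
fixed = '--top_per_taxa "10239:20:S 2:20:S"'

broken_args = shlex.split(broken)
fixed_args = shlex.split(fixed)

# With the stray equals sign and single quotes, the token right after the
# flag is a bare "=", and the taxa spec is split into two separate words
# that still carry literal double-quote characters.
print(broken_args)

# With plain double quotes, the whole spec arrives as one argument.
print(fixed_args)
```

In the broken form the word immediately following the flag is "=", which is consistent with the startup value noted above.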
Description of the bug
It seems like some recent changes to the top hits report generation may have introduced some sort of mis-specified array:
Here's my error:
ERROR nextflow.extension.OperatorImpl - @unknown groovy.lang.MissingMethodException: No signature of method: Script_861716d0$_runScript_closure1$_closure2$_closure22.call() is applicable for argument types: (ArrayList) values: [[[id:230029461_WB, single_end:false, platform:ILLUMINA, fastq_1:/home/mdh/shared/taxtriage/231005_test/samples/230029461.Illumina.kraken.dehosted.1.fastq.gz, ...], ...]]
The log file truncates it but grabbing it from the slurm output:
ERROR ~ Invalid method invocation call with arguments: [[id:230029461_WB, single_end:false, platform:ILLUMINA, fastq_1:/home/mdh/shared/taxtriage/231005_test/samples/230029461.Illumina.kraken.dehosted.1.fastq.gz, fastq_2:/home/mdh/shared/taxtriage/231005_test/samples/230029461.Illumina.kraken.dehosted.2.fastq.gz, trim:false, directory:false, sequencing_summary:null], /panfs/jay/groups/32/mdh/shared/taxtriage/231005_test/work/66/89eb4288f373a6ad78f6d9d8079efd/230029461_WB.top_report.tsv, [/panfs/jay/groups/32/mdh/shared/taxtriage/231005_test/work/a0/587a499c14235a2d369c0e418fca23/230029461_WB.classified_1.fastq.gz, /panfs/jay/groups/32/mdh/shared/taxtriage/231005_test/work/a0/587a499c14235a2d369c0e418fca23/230029461_WB.classified_2.fastq.gz], /panfs/jay/groups/32/mdh/shared/taxtriage/231005_test/work/4e/923481f65a775d1ffa8f53afbea9bb/230029461_WB.output.references.fasta] (java.util.ArrayList) on _closure22 type
I haven't had a chance to do much digging, but the addition of the $2 variable in the top_hits.nf module might be breaking things?
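For readers unfamiliar with this class of error: one common way to get "No signature of method ... call()" in Groovy is for a closure declared with a fixed number of parameters to receive a list with a different number of elements. The sketch below is only a rough Python analogy of that arity mismatch, not the pipeline's Groovy code; the function `handle`, its three parameters, and the shortened file names are hypothetical.

```python
# Stand-in for a closure declared with three parameters (hypothetical names).
def handle(meta, report, reads):
    return meta["id"]


# Stand-in for a channel item carrying four elements, like the list in the error:
# a meta map, a report, a pair of read files, and a reference FASTA.
item = [
    {"id": "230029461_WB"},
    "top_report.tsv",
    ["classified_1.fastq.gz", "classified_2.fastq.gz"],
    "references.fasta",
]

try:
    # Spreading four elements into a three-parameter function fails.
    handle(*item)
    outcome = "ok"
except TypeError as err:
    outcome = f"arity mismatch: {err}"

print(outcome)
```

Whether a mismatch of this kind is what actually happened in this run is not established by the log excerpt alone.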
Command used and terminal output
Relevant files
Here's the command.sh from the work directory where this breaks:
#!/bin/bash -euo pipefail
echo 230029460_WB "-----------------META variable------------------"
get_top_hits.py \
    -i "230029460_WB.filtered.report" \
    -o 230029460_WB.top_report.tsv \
    -t 50

awk -F '\t' -v id=230029460_WB \
    'BEGIN{OFS="\t"} { if (NR==1){ print "SampleTaxid", $2, $1, $4, $6} else { $5 = id""$5; print $5, $2, $1, $4, $6 }}' 230029460_WB.top_report.tsv > 230029460_WB.krakenreport_mqc.tsv

cat <<-END_VERSIONS > versions.yml
"NFCORE_TAXTRIAGE:TAXTRIAGE:TOP_HITS":
    python: $(python --version | sed 's/Python //g')
END_VERSIONS
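The awk step in this script is dense, so here is a minimal Python sketch of what it appears to do: on the header line it emits "SampleTaxid" followed by columns 2, 1, 4, and 6; on every other line it prefixes column 5 with the sample id and emits columns 5, 2, 1, 4, and 6. The function names and the two-line demo input are assumptions for illustration; the real column names of top_report.tsv are not shown in this thread.

```python
def reformat_row(fields, sample_id, is_header):
    """Mirror the awk program: reorder columns and prefix column 5 with the sample id."""
    # Pad short rows so missing columns become empty strings, as awk would treat them.
    cols = list(fields) + [""] * max(0, 6 - len(fields))
    if is_header:
        return ["SampleTaxid", cols[1], cols[0], cols[3], cols[5]]
    # awk's `$5 = id""$5` concatenates with no separator.
    return [sample_id + cols[4], cols[1], cols[0], cols[3], cols[5]]


def reformat_report(lines, sample_id):
    """Apply reformat_row to every tab-separated line; line 1 is the header."""
    out = []
    for number, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        out.append("\t".join(reformat_row(fields, sample_id, number == 1)))
    return out


# Hypothetical two-line report with placeholder column values.
demo = ["c1\tc2\tc3\tc4\tc5\tc6\n", "a\tb\tc\td\t9606\tf\n"]
for row in reformat_report(demo, "230029460_WB"):
    print(row)
```

Note that the sample id and the original column 5 value are joined directly, so the first output column on data rows is a single combined string.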
nextflow.log
System information
Nextflow version: 24.04.2
Hardware: HPC, Desktop, Cloud
Executor: slurm
Container engine: Singularity
OS: CentOS Linux
Version of nf-core/taxtriage: 1.2.0