epi2me-labs / wf-alignment

Process `pipeline:plotStats (1)` terminated with an error exit status (1) #8

Closed matteo1313 closed 1 year ago

matteo1313 commented 2 years ago

Hello,

I am having issues at the very end of the wf-alignment workflow. The entire workflow runs and completes except pipeline:plotStats(1) which fails. I have attached the process list completion status and the output that I get for the error.

How do I correct this error? Would this error be the reason why my final_merged.csv is missing a read_n50 value?

I am happy to provide more details to help with debugging!

Thank you!

executor > local (21)

[13/3a4896] process > start_ping:pingMessage (1) [100%] 1 of 1 ✔

[8d/d1d53a] process > handleSingleFile (1) [100%] 1 of 1 ✔

[f2/7d7001] process > pipeline:nameFastq (1) [100%] 1 of 1 ✔

[9d/6cdbc1] process > pipeline:combineReferences [100%] 1 of 1 ✔

[88/ee7ee1] process > pipeline:alignReads (1) [100%] 1 of 1 ✔

[37/e2383a] process > pipeline:mergeBAM (1) [100%] 1 of 1 ✔

[b3/98ae31] process > pipeline:indexBam (1) [100%] 1 of 1 ✔

[4c/9b1096] process > pipeline:refLengths [100%] 1 of 1 ✔

[a5/392231] process > pipeline:addStepsColumn [100%] 1 of 1 ✔

[e7/9baa0b] process > pipeline:readDepthPerRef (1) [100%] 1 of 1 ✔

[4f/f42a8d] process > pipeline:getParams [100%] 1 of 1 ✔

[c1/fac89a] process > pipeline:getVersions [100%] 1 of 1 ✔

[7b/9e7710] process > pipeline:getRefNames (1) [100%] 1 of 1 ✔

[88/de5182] process > pipeline:gatherStats (1) [100%] 1 of 1 ✔

[59/babcf3] process > pipeline:mergeCSV [100%] 1 of 1 ✔

[b4/a9825b] process > pipeline:plotStats (1) [100%] 1 of 1, failed: 1 ✘

[1e/bdf7df] process > output (2) [100%] 4 of 4 ✔

[a6/70835b] process > end_ping:pingMessage [100%] 1 of 1 ✔

Error executing process > 'pipeline:plotStats (1)'

Caused by: Process pipeline:plotStats (1) terminated with an error exit status (1)

Command executed:

report.py merged_mapula_json/* --report_name 'wf-alignment-Alignment_BSUB_06242022' --references GCF_000009045.1_ASM904v1_genomic.txt --unmapped_stats unmapped_stats/* --params params.json --versions versions --sample_names sample_ids.csv

Command exit status: 1

Command output: (empty)

Command error:

Traceback (most recent call last):

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 949, in <module>

main()

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 931, in main

stats_panel = PlotMappingStats(

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 51, in __init__

self.report = self.build_report(

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 90, in build_report

summary_tab = self.build_summary_tab(data)

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 217, in build_summary_tab

[self.plot_base_pairs(data)],

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 442, in plot_base_pairs

data['color'] = Category20c[len(base_pairs)]

KeyError: 2

Work dir: /home/matteo/work/b4/a9825bdeaf1a33db856109c0f58f4d

Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out

ruth-hanna commented 2 years ago

Hello! I am encountering the same error. This is the workflow log that I see when trying to run wf-alignment. I'm happy to provide any additional detail that would be helpful to debug! I tried installing Graphviz but it did not resolve the issue (although it is entirely possible that I installed it in the wrong place, I'm not very familiar with docker containers).

I'd very much appreciate any suggestions! :)

Checking epi2me-labs/wf-alignment ...

done - revision: 388906ba1b [v0.1.6]

N E X T F L O W ~ version 22.04.0

NOTE: Your local project version looks outdated - a different revision is available in the remote repository [dd6fa98e52]

Launching `https://github.com/epi2me-labs/wf-alignment` [confident_hoover] DSL2 - revision: 388906ba1b [v0.1.6]

Core Nextflow options

revision : v0.1.6

runName : confident_hoover

containerEngine: docker

launchDir : /Users/ruth/epi2melabs-data/nextflow

workDir : /Users/ruth/epi2melabs-data/nextflow/instances/2022-06-27-13-37_wf-alignment_KtWULpNXcnAQy8hctFLdU9/work

projectDir : /Users/ruth/.nextflow/assets/epi2me-labs/wf-alignment

userName : ruth

profile : standard

configFiles : /Users/ruth/.nextflow/assets/epi2me-labs/wf-alignment/nextflow.config

Input/output options

out_dir : /Users/ruth/epi2melabs-data/nextflow/instances/2022-06-27-13-37_wf-alignment_KtWULpNXcnAQy8hctFLdU9/output

fastq : /Users/ruth/Desktop/fastq

Reference genome options

references : /Users/ruth/Desktop/references

!! Only displaying parameters that differ from the pipeline defaults !!

------------------------------------------------------

If you use epi2me-labs/wf-alignment for your analysis please cite:

* The nf-core framework

https://doi.org/10.1038/s41587-020-0439-x

Checking fastq input.

Single directory input detected.

[41/6ce7cc] Submitted process > pipeline:getVersions

[bd/da2147] Submitted process > pipeline:getParams

[7f/4cd1d6] Submitted process > pipeline:getRefNames (1)

[2f/bc2bcb] Submitted process > pipeline:combineReferences

[ad/bd94c3] Submitted process > start_ping:pingMessage (1)

[c5/8733b6] Submitted process > pipeline:refLengths

[7b/4d0c86] Submitted process > end_ping:pingMessage

[f9/164e70] Submitted process > pipeline:addStepsColumn

[0b/5391c3] Submitted process > pipeline:fastcatUncompress (1)

[05/605560] Submitted process > pipeline:alignReads (1)

[51/862902] Submitted process > pipeline:mergeBAM (1)

[b3/740d28] Submitted process > pipeline:gatherStats (1)

[b6/b70b41] Submitted process > output (1)

[ec/ec0bf3] Submitted process > pipeline:indexBam (1)

[8c/7792a4] Submitted process > output (2)

[0a/23c07a] Submitted process > pipeline:readDepthPerRef (1)

[be/bb5981] Submitted process > pipeline:mergeCSV

[91/896954] Submitted process > pipeline:plotStats (1)

[2b/c13a10] Submitted process > output (3)

[b6/970490] Submitted process > output (4)

Error executing process > 'pipeline:plotStats (1)'

Caused by:

Process `pipeline:plotStats (1)` terminated with an error exit status (1)

Command executed:

report.py merged_mapula_json/* --report_name 'wf-alignment-report' --references Rp_Portsmouth_genome.txt --unmapped_stats unmapped_stats/* --params params.json --versions versions --sample_names sample_ids.csv

Command exit status:

1

Command output:

(empty)

Command error:

Traceback (most recent call last):

File "/Users/ruth/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 949, in <module>

main()

File "/Users/ruth/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 931, in main

stats_panel = PlotMappingStats(

File "/Users/ruth/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 51, in __init__

self.report = self.build_report(

File "/Users/ruth/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 90, in build_report

summary_tab = self.build_summary_tab(data)

File "/Users/ruth/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 217, in build_summary_tab

[self.plot_base_pairs(data)],

File "/Users/ruth/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 442, in plot_base_pairs

data['color'] = Category20c[len(base_pairs)]

KeyError: 2

Work dir:

/Users/ruth/epi2melabs-data/nextflow/instances/2022-06-27-13-37_wf-alignment_KtWULpNXcnAQy8hctFLdU9/work/91/896954ba97e38d9baa7eb8bbfca346

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named `.command.sh`

WARN: Killing running tasks (2)

WARN: To render the execution DAG in the required format it is required to install Graphviz -- See http://www.graphviz.org for more info.

matteo1313 commented 2 years ago

@ruth-hanna I am relieved to hear that I am not alone!! In your final_merged.csv are you getting a value for read_n50? Mine is missing, and I am curious to know if the plotStats error is causing this value to be missing.

ruth-hanna commented 2 years ago

Perhaps I'm looking in the wrong place but I don't see a .csv file -- only merged.sorted.aligned.bam and merged.sorted.aligned.bam.bai files (see screenshot).

[Screenshot attached: Screen Shot 2022-06-27 at 1 58 11 PM]

matteo1313 commented 2 years ago

@ruth-hanna I believe you are in the right directory. I have posted a snippet of the directory of my most recent run of this workflow.

As you can see I have all the files you have plus a .csv file that gives me information regarding the run. Although inside this file there is a variable called read_n50 which is missing. I have posted a snippet of the csv file to show you.

[Screenshot attached: Screenshot from 2022-06-27 11-03-31]

[Image attached]

ruth-hanna commented 2 years ago

Hmm perhaps my workflow is terminating earlier than yours.

I tried downloading the test-data from this repository and the workflow ran successfully, so I think it must be something about our input files. What kind of input parameters are you using? I just have a single .fastq.gz file as input, and a .fa file as the reference, each in their own directory.

I'm going to look into this further and will report back if I figure it out!

matteo1313 commented 2 years ago

@ruth-hanna When you ran the test-data, did you also run the plotStats portion of the pipeline? Did you get a .csv file with your output, and was there a read_n50 value?

THANK YOU FOR WORKING WITH ME THROUGH THIS!!

Here is the command I used to run the workflow (pasted below). I have one input .fastq file and a .fna file as the reference. I also added other parameters such as out_dir, concat_fastq, and report_name.

nextflow run epi2me-labs/wf-alignment --out_dir /home/matteo/Documents/testing_outputs/06242022/ --fastq /home/matteo/Downloads/input_reads.fastq --references /home/matteo/Documents/GCF_000009045.1_ASM904v1_genomic.fna --concat_fastq false --report_name Alignment_BSUB_06242022

ruth-hanna commented 2 years ago

With the test-data, I did get the full output including a .csv file with a read_n50 value.

I am running the workflow from the EPI2ME-Labs application, with the following parameters:

`{

"out_dir": "/Users/ruth/epi2melabs-data/nextflow/instances/2022-06-27-14-22_wf-alignment_BZSz4NXHCjP5rBZ5oQvtfC/output",

"counts": "NO_COUNTS",

"depth_coverage": true,

"threads": 4,

"report_name": "report",

"wfversion": "v0.1.6",

"fastq": "/Users/ruth/Desktop/fastq",

"references": "/Users/ruth/Desktop/references"

}`

Thank you for posting this issue! It's good to know I'm not the only one running into this problem :)

matteo1313 commented 2 years ago

@ruth-hanna Gotcha. Thank you for letting me know. I am running the workflow through the command line and am not using the EPI2ME-Labs application. Ideally I will want to use this workflow in the cloud so I can not use the EPI2ME-Labs application.

ruth-hanna commented 2 years ago

Okay, this is puzzling, but for some reason it runs successfully when I move the input FASTQ file from the test dataset ("test.fastq") into the same directory as my FASTQ file. I do not understand why this works, but it does make the program run to completion, and my output file shows alignments for both input FASTQ files.

@matteo1313 wanna give this a try and see if it works for you?

matteo1313 commented 2 years ago

@ruth-hanna Okay! Thank you! It takes me ~2 hours to complete the entire workflow and hit the error message, so I want to make sure I understand before I run it. I should download the test data, take the 'test.fastq' file from it, and put that FASTQ file in the same directory as my input file (in the case of the command I posted above, my Downloads directory).

For your input parameter are you just putting the path to the directory that the fastq is in? Or are you giving the path to the fastq file? I am giving the path to the file. Should I change it to the directory?

Also this is through the command line?

Thank you!

ruth-hanna commented 2 years ago

I downloaded the test_data, which has 3 folders: counts, fastq, and references. Then I moved my own FASTQ file in the fastq folder (my FASTQ file is the one called "PAM54351_pass_barcode10_5f265d2b_0.fastq") and my own reference file into the references folder (my file is called "Rp_Portsmouth_genome.fasta"). I then deleted the reference files that were originally included in the test_data but kept the FASTQ file that was there. The final file structure looks like this:

[Screenshot attached: Screen Shot 2022-06-27 at 7 46 33 PM]

I'm not at all sure why the extra FASTQ file makes it run, and not 100% sure that it doesn't mess with the results in any way so be forewarned...but I think since the test file has a distinct barcode from my files, it's easy enough to separate out the "test" reads.

I did run it on the command line, with the path to the directory:

./nextflow run epi2me-labs/wf-alignment --fastq Downloads/wf-alignment-master/test_data_copy2/fastq/ --references Downloads/wf-alignment-master/test_data_copy2/references --out_dir Desktop/output3

matteo1313 commented 2 years ago

Interesting! Okay! I am running it right now. I will keep you posted on what will come.(:

One thing I noticed is that there is a "counts" folder. My error points me to a working directory that contains a NO_COUNTS. It makes me wonder if the directory needs counts to make the workflow run? I wonder if this directory does not contain something that is needed to run the plotStats...

I also do not have any barcodes for my input. I have a fastq from my sequencer and a reference genome from NCBI.

Here is my input: nextflow run epi2me-labs/wf-alignment --fastq /home/matteo/Documents/fastq --out_dir /home/matteo/Documents/testing_ouputs --references /home/matteo/Documents/references --concat_fastq false

sarahjeeeze commented 2 years ago

Hi, thanks for pointing this out. You are right, it is broken for fewer than 2 samples, because the colour palette we are using only has keys 3-20. We will get a fix in as soon as possible and will let you know when it's ready.
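For anyone reading the traceback later: `Category20c[len(base_pairs)]` is a dictionary lookup keyed by palette size, so a size with no key (here 2) raises `KeyError`. Below is a minimal sketch of the failure and one possible guard. `PALETTE` is a stand-in built in the snippet so it runs without Bokeh installed, and `pick_colors` is a hypothetical helper; this is not necessarily the fix the maintainers shipped.

```python
# Stand-in for a palette dict keyed by size (3..20), mimicking the shape
# described in the thread. The colour values are made up for illustration.
BASE = ["#%02x%02x%02x" % (i * 12, 255 - i * 12, 128) for i in range(20)]
PALETTE = {n: BASE[:n] for n in range(3, 21)}


def pick_colors(palette, n):
    """Return n colours, tolerating sizes the palette has no key for."""
    if n <= 0:
        return []
    if n in palette:
        return list(palette[n])
    smallest, largest = min(palette), max(palette)
    if n < smallest:
        # e.g. n == 2: take the first n colours of the smallest palette
        return list(palette[smallest][:n])
    # n is larger than any key: cycle through the largest palette
    big = palette[largest]
    return [big[i % len(big)] for i in range(n)]


# PALETTE[2] would raise KeyError: 2, as in the report.py traceback,
# whereas the guarded lookup still returns two colours.
two_colours = pick_colors(PALETTE, 2)
```

Slicing the smallest palette keeps the colours consistent with larger runs; other choices (a fixed fallback list, for example) would work just as well.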

matteo1313 commented 2 years ago

@sarahjeeeze THANK YOU SO MUCH!!

@ruth-hanna THANK YOU!!

ruth-hanna commented 2 years ago

@sarahjeeeze Thank you!!! Mystery solved :)

@matteo1313 Thank you for working through this with me yesterday!!

sarahjeeeze commented 2 years ago

Hi again, the most recent tag v0.1.7 should fix the issue. Thanks again for pointing it out!

matteo1313 commented 2 years ago

@sarahjeeeze amazing thank you! I am running the workflow currently.

Is there anything specific we must do to run the most recent tag?

sarahjeeeze commented 2 years ago

Shouldn't be, but if it still seems to be running v0.1.6 you can try `nextflow pull epi2me-labs/wf-alignment` and run with the parameter `-revision master`.

matteo1313 commented 2 years ago

@sarahjeeeze Hello Sarah,

For some reason I am still getting the same error message. I have attached the input parameters and the output to this thread.

Is there any way to use the `-resume` option that is mentioned below? If not, I can just rerun the entire workflow with the commands that you listed above.

N E X T F L O W ~ version 22.04.4

NOTE: Your local project version looks outdated - a different revision is available in the remote repository [29f01e06dd]

Launching https://github.com/epi2me-labs/wf-alignment [small_pare] DSL2 - revision: 6c4c39442f [master]

Core Nextflow options

revision : master

runName : small_pare

containerEngine: docker

launchDir : /home/matteo

workDir : /home/matteo/work

projectDir : /home/matteo/.nextflow/assets/epi2me-labs/wf-alignment

userName : matteo

profile : standard

configFiles : /home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/nextflow.config

Input/output options

out_dir : /home/matteo/Documents/testing_outputs/

fastq : /home/matteo/Downloads/wf-alignment-master/test_data/fastq/input_reads.fastq

Reference genome options

references : /home/matteo/Downloads/wf-alignment-master/test_data/references/ref.fasta

Advanced options

concat_fastq : false

!! Only displaying parameters that differ from the pipeline defaults !!

If you use epi2me-labs/wf-alignment for your analysis please cite:

executor > local (21)

[c9/b57680] process > start_ping:pingMessage (1) [100%] 1 of 1 ✔

[47/de3f8c] process > handleSingleFile (1) [100%] 1 of 1 ✔

[ca/254954] process > pipeline:nameFastq (1) [100%] 1 of 1 ✔

[64/eb590a] process > pipeline:combineReferences [100%] 1 of 1 ✔

[6f/cd99ac] process > pipeline:alignReads (1) [100%] 1 of 1 ✔

[fd/05ce60] process > pipeline:mergeBAM (1) [100%] 1 of 1 ✔

[7d/789874] process > pipeline:indexBam (1) [100%] 1 of 1 ✔

[a2/de682d] process > pipeline:refLengths [100%] 1 of 1 ✔

[25/0b0f79] process > pipeline:addStepsColumn [100%] 1 of 1 ✔

[9f/086f4c] process > pipeline:readDepthPerRef (1) [100%] 1 of 1 ✔

[d2/087797] process > pipeline:getParams [100%] 1 of 1 ✔

[d2/9728c1] process > pipeline:getVersions [100%] 1 of 1 ✔

[89/1c20d9] process > pipeline:getRefNames (1) [100%] 1 of 1 ✔

[81/1fdb71] process > pipeline:gatherStats (1) [100%] 1 of 1 ✔

[e7/a574ed] process > pipeline:mergeCSV [100%] 1 of 1 ✔

[0e/f47f67] process > pipeline:plotStats (1) [100%] 1 of 1, failed: 1 ✘

[2d/843e76] process > output (4) [100%] 1 of 1

[a4/d10751] process > end_ping:pingMessage [100%] 1 of 1 ✔

Error executing process > 'pipeline:plotStats (1)'

Caused by: Process pipeline:plotStats (1) terminated with an error exit status (1)

Command executed:

report.py merged_mapula_json/* --report_name 'wf-alignment-report' --references ref.txt --unmapped_stats unmapped_stats/* --params params.json --versions versions --sample_names sample_ids.csv

Command exit status: 1

Command output: (empty)

Command error:

Traceback (most recent call last):

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 949, in <module>

main()

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 931, in main

stats_panel = PlotMappingStats(

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 51, in __init__

self.report = self.build_report(

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 90, in build_report

summary_tab = self.build_summary_tab(data)

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 217, in build_summary_tab

[self.plot_base_pairs(data)],

File "/home/matteo/.nextflow/assets/epi2me-labs/wf-alignment/bin/report.py", line 442, in plot_base_pairs

data['color'] = Category20c[len(base_pairs)]

KeyError: 2

Work dir: /home/matteo/work/0e/f47f67386db2fe0317399f391bec29

Tip: when you have fixed the problem you can continue the execution adding the option -resume to the run command line

matteo1313 commented 2 years ago

IT WORKS!

THANK YOU! @sarahjeeeze (:

obr22 commented 2 years ago

Hi @sarahjeeeze , I had a very similar issue and came across this thread. I was able to download v0.1.7 and when I input nextflow info epi2me-labs/wf-alignment it confirms that's the version running, but the Epi2me Labs application still shows v0.1.6 and when I try to run the workflow the workflow log still says v0.1.6 and that I am using an outdated version. The run also still failed with the same error as before. Workflow log attached. Any advice would be appreciated. Thanks!

nextflow.log

sarahjeeeze commented 2 years ago

Thanks! Sorry for that. I have just updated it in EPI2MELabs, it is v0.1.8 now.

obr22 commented 2 years ago

Thanks for the response. I updated the application and workflow but still got the error "Error executing process > 'pipeline:alignReads (1)'" .

Log is attached. Any fixes or ideas of what I might be doing wrong would be appreciated. Thanks again!

UpdateError.txt

sarahjeeeze commented 2 years ago

The error you are getting is what I would expect to see if there were no alignments. Could you check that your input FASTQ files and reference file contain data and are in the correct format? I will try to improve the error message for the case where there are no alignments.
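One way to run the check suggested above before launching a long workflow is to count the records in each input first. The sketch below assumes the common 4-lines-per-record FASTQ layout and plain or gzip-compressed files; the function names are hypothetical and not part of wf-alignment.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, "rt")
    return open(path, "rt")


def count_fastq_records(path):
    """Count records, assuming 4 lines per record (header, seq, '+', qual)."""
    n_lines = 0
    with open_text(path) as handle:
        for line in handle:
            if not line.strip():
                continue  # ignore blank lines, e.g. at the end of the file
            position = n_lines % 4
            if position == 0 and not line.startswith("@"):
                raise ValueError(f"{path}: expected '@' header, got {line[:20]!r}")
            if position == 2 and not line.startswith("+"):
                raise ValueError(f"{path}: expected '+' separator, got {line[:20]!r}")
            n_lines += 1
    if n_lines % 4:
        raise ValueError(f"{path}: truncated record at end of file")
    return n_lines // 4


def count_fasta_sequences(path):
    """Count sequences in a FASTA file by counting '>' header lines."""
    with open_text(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))
```

A count of zero from either function, or a `ValueError`, would point to an empty or malformed input rather than a problem in the workflow itself.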

julibeg commented 1 year ago

closing this for now; feel free to re-open if the problem persists