jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

Problem with step 10, and step 14 #730

Closed jonaskohh closed 8 months ago

jonaskohh commented 9 months ago

Hello,

I ran SqueezeMeta, and when I checked the log I found that step 10 produced this message:

samtools view: writing to standard output failed: Broken pipe
samtools view: error closing standard output: -1

In addition, I am not able to bin using metabat2: "wc: /gpfs1/scratch/jonas/coassembly/plastic_LCK_coassembly/intermediate/binners/metabat2/: Is a directory"

I restarted the run and the same messages came up. I would be grateful for any help!

syslog.pdf squeezemeta_trial_2.pdf squeezemeta_trial_1.pdf SqueezeMeta_conf.pdf

fpusan commented 9 months ago

The first syslog contains completed runs. Has SqueezeMeta worked for you on your system before? Is this error restricted to particular samples?

jonaskohh commented 9 months ago

The outputs in squeezemeta_trial_1 and squeezemeta_trial_2 are from the same run as the syslog. Although all of them said the run completed, I realised that metabat2 did not run, and I found those messages as well.

This is also my first time running SqueezeMeta.

fpusan commented 9 months ago

Also, can you try our dev version? You can install it into an environment called SqueezeMeta_dev by running:

mamba create -n SqueezeMeta_dev -c conda-forge -c bioconda -c anaconda -c fpusan squeezemeta-dev=1.6.3.beta1 --no-channel-priority

You will have to re-link the databases you previously downloaded using configure_nodb.pl, as shown in the ReadMe. This dev version should hopefully fix the problem with metabat2. It also fixes some problems with samtools (though this is the first time I have seen the one you reported).

jonaskohh commented 9 months ago

Sure, I will try it and update!! Thank you!

fpusan commented 9 months ago

I see those were your own samples. Is the error also happening when you run it with the test samples we provide?

jonaskohh commented 9 months ago

I will have to check on that. I’ll update on that as well :)

fpusan commented 9 months ago

This should not affect the main pipeline. Did it work with the test dataset?

On 27/09/2023 10:04:58, jonaskohh wrote:

Hello, I have rerun it, and I just realised that when I was doing the test run to see if everything was in place, I got the message "mothur NOT OK ERROR: Error running mothur -h". Is there a way to fix this? Thank you!

jonaskohh commented 9 months ago

I have tested one of the samples, and while it showed the same error warning at step 10 (see squeezemeta_trial.pdf, the output given by the server), I do see mapping percentages reported in 10.metasqueeze_test.mappingstat in the results. I think it still worked?

syslog.pdf squeezemeta_trial.pdf

fpusan commented 9 months ago

I think it failed at counting the ORF abundances. Can you post the first lines of the /gpfs1/scratch/jonas/metasqueeze_test/intermediate/10.metasqueeze_test.mapcount file here?

jonaskohh commented 9 months ago

Created by /home/jonas/miniconda3/envs/SqueezeMeta_dev/SqueezeMeta/scripts/10.mapsamples.pl from /gpfs1/scratch/jonas/metasqueeze_test/results/03.metasqueeze_test.gff, Mon Oct 9 21:40:20 2023. SORTED TABLE

Gen                Length  Reads  Bases  RPKM   Coverage  TPM    Sample
megahit_1_90-323   234                   0.000  0.000     0.000  SRR1929485
megahit_1_90-323   234     13     965    1.872  4.124     1.641  SRR1927149
megahit_2_2-76     75                    0.000  0.000     0.000  SRR1929485
megahit_2_2-76     75      4      185    1.798  2.467     1.575  SRR1927149
megahit_2_262-345  84                    0.000  0.000     0.000  SRR1929485
megahit_2_262-345  84      4      178    1.605  2.119     1.407  SRR1927149
megahit_3_1-144    144     1      86     0.732  0.597     0.642  SRR1929485
megahit_3_1-144    144     4      278    0.936  1.931     0.820  SRR1927149
megahit_4_1-168    168                   0.000  0.000     0.000  SRR1927149
megahit_4_1-168    168     7      363    4.391  2.161     3.849  SRR1929485
megahit_5_1-396    396     10     794    0.851  2.005     0.746  SRR1927149
megahit_5_1-396    396     1      83     0.266  0.210     0.233  SRR1929485
megahit_6_1-393    393                   0.000  0.000     0.000  SRR1929485
megahit_6_1-393    393     18     1694   1.544  4.310     1.353  SRR1927149
megahit_7_2-427    426     14     1201   1.108  2.819     0.971  SRR1927149
megahit_7_2-427    426     9      805    2.226  1.890     1.952  SRR1929485
megahit_8_1-309    309     11     895    1.200  2.896     1.051  SRR1927149
megahit_8_1-309    309     3      249    1.023  0.806     0.897  SRR1929485
megahit_9_3-302    300     13     990    1.461  3.300     1.280  SRR1927149
megahit_9_3-302    300     4      338    1.405  1.127     1.232  SRR1929485

fpusan commented 9 months ago

It would seem that it worked. I would also check the end of the file to make sure there are results there. Still, I am suspicious, and I would advise repeating that step using only 1 thread.
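
For anyone who wants to check the whole mapcount file rather than just its first and last lines, here is a minimal Python sketch. It is not part of SqueezeMeta. It assumes the file is tab-separated, that the column header is the line whose first field is "Gen" (as in the excerpt posted above), and that a missing value shows up as an empty field; the function name `check_mapcount` and the sample rows in the demo are illustrative only.

```python
import io


def check_mapcount(handle):
    """Scan a mapcount-style table and report rows with missing values.

    Assumptions (not confirmed by the SqueezeMeta docs): the table is
    tab-separated and the column header is the first line whose first
    field is "Gen". Anything before that line is ignored.
    Returns (number_of_data_rows, [(line_number, gene_id), ...]).
    """
    header = None
    n_rows = 0
    incomplete = []
    for line_no, line in enumerate(handle, start=1):
        line = line.rstrip("\n")
        if not line:
            continue
        fields = line.split("\t")
        if header is None:
            # Skip the "Created by ..." preamble until the header appears.
            if fields[0] == "Gen":
                header = fields
            continue
        n_rows += 1
        # Flag rows with the wrong number of columns or an empty cell.
        if len(fields) != len(header) or any(f == "" for f in fields):
            incomplete.append((line_no, fields[0]))
    return n_rows, incomplete


# Illustrative in-memory sample; the empty Reads/Bases cells are an assumption
# about how missing counts would appear in the real file.
sample = (
    "# Created by 10.mapsamples.pl (illustrative preamble)\n"
    "Gen\tLength\tReads\tBases\tRPKM\tCoverage\tTPM\tSample\n"
    "megahit_1_90-323\t234\t\t\t0.000\t0.000\t0.000\tSRR1929485\n"
    "megahit_1_90-323\t234\t13\t965\t1.872\t4.124\t1.641\tSRR1927149\n"
)
n_rows, incomplete = check_mapcount(io.StringIO(sample))
print(n_rows, incomplete)
```

To run it on a real file, pass an open file handle, for example `check_mapcount(open(path))`. A non-empty second value means some rows lack counts, which would be a reason to rerun the step.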

jonaskohh commented 8 months ago

Thank you! It all worked properly without any errors or warning messages. My current problem is that when I ran "sqm2zip.py" on the HPC and loaded the exported file in RStudio, this message popped up:

Error in data.table::fread(file, sep = "\t") :
  File '/var/folders/rk/ybfl2pgs1kzfmg8vcpn1bz9r0000gn/T//Rtmpm6juoE/metasqueeze_plastics.orf.sequences.tsv' does not exist or is non-readable. getwd()=='/Users/jonaskoh/Documents/Limchukang/LCK'
In addition: Warning messages:
1: In loadSQM("/Users/jonaskoh/Desktop/metasqueeze_plastics.zip") :
  Your project was created with SqueezeMeta v1.6.2, while this is SQMtools v1.6.3. You can ignore this message if things are working fine for you, but if you experience any issue consider using the right version of SQMtools for this project
2: In (function (input = "", file = NULL, text = NULL, cmd = NULL, :
  Stopped early on line 8076790. Expected 49 fields but found 1. Consider fill=TRUE and comment.char=. First discarded non-empty line: <<PK>>
3: In unzip(project_path, file_path, exdir = tempdir(), junkpaths = TRUE) :
  zip file is corrupt
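
Since the last warning says "zip file is corrupt", one quick check is whether the archive itself is readable before handing it to loadSQM. The sketch below is a generic check using Python's standard zipfile module, not anything from SQMtools. The helper name `zip_problem` is hypothetical, and a passing check does not guarantee that R's unzip will cope with the same file.

```python
import io
import zipfile


def zip_problem(path_or_file):
    """Return None if the archive opens and every member passes its CRC
    check, otherwise a short description of what went wrong."""
    try:
        with zipfile.ZipFile(path_or_file) as zf:
            bad_member = zf.testzip()  # name of the first bad member, or None
    except zipfile.BadZipFile as exc:
        return f"not a valid zip archive: {exc}"
    if bad_member is not None:
        return f"CRC check failed for member: {bad_member}"
    return None


# Self-contained demo: build a small archive in memory, then truncate it
# to imitate an incomplete transfer.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("example.tsv", "Gen\tLength\n")
good_bytes = buffer.getvalue()

print(zip_problem(io.BytesIO(good_bytes)))                         # intact copy
print(zip_problem(io.BytesIO(good_bytes[: len(good_bytes) // 2])))  # truncated copy
```

On a real project you would pass the path to the exported zip, once on the HPC and once on the local machine, and compare the results.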

jonaskohh commented 8 months ago

Oh update: I've followed https://github.com/jtamames/SqueezeMeta/wiki/Using-SQMtools-(pre-1.6.2)-in-a-Windows-environment and everything is running well. Thank you so much for your help!!