jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

ERROR: STEP14 -> BINNING & STEP 18 checkM_batch.pl #716

Closed sogada closed 11 months ago

sogada commented 11 months ago

Hello, I've tried running the pipeline and I'm getting the following errors as shown in the attached image. I have also attached the syslog file and one .tax file. I ran the test_install.pl script and all checks were successful. I'm running the pipeline on a server and installed it via Conda as instructed here. What could be the issue?

syslog.zip squeezemeta

jtamames commented 11 months ago

Hello! Your dataset is rather small, and as a result the bins are very small. All of them are skipped because of their low length, and the pipeline then stops because it finds no suitable bins. We will remove this check in upcoming versions, because it makes the pipeline stop even though no error was produced. In the meantime, just restart the project via --restart -step 18 to finish the run. Best, J
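For reference, the restart could look roughly like the sketch below. It only prints the command so it can be checked first. The project name `myproject` is a placeholder, and passing the project with `-p` alongside `--restart` is an assumption based on the usual SqueezeMeta invocation, so check `SqueezeMeta.pl -h` on your install.

```shell
# Hypothetical project name; use the one given with -p in the original run.
PROJECT="myproject"

# Print the restart command (step 18 as suggested in this thread).
# Assumption: the project is identified with -p when using --restart.
restart_cmd() {
    echo "SqueezeMeta.pl -p $PROJECT --restart -step 18"
}

restart_cmd
```

Once the printed command looks right, run it from the directory that contains the project folder.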

fpusan commented 11 months ago

What is your OS and version?

The error in step 14 seems similar to the one described in #692. In short, recompiling metabat2 and jgi-summarize-contigs-depth from source may fix it for you. I need to find a more permanent fix, but it is difficult to distribute binaries that work in all distributions and versions.

This error means that metabat2 will fail, but the other binning methods (by default also CONCOCT, but you can also add MaxBin to the mix) will provide binning results so the run does not fail.

However, you then have a second issue: your bins are too short.

Is this happening with the test dataset that we provide when downloading the databases?

Otherwise, as Javier says, it may be due to your data not producing suitable bins.
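To see how short the bins actually are, one option is to sum the sequence lengths in each bin FASTA file. This is a minimal sketch, not part of SqueezeMeta. The `bin_lengths` helper is hypothetical, and where the bin files live depends on your project layout.

```shell
# Print "<file><TAB><total bases>" for each FASTA file given as an argument.
# bin_lengths is a hypothetical helper, not a SqueezeMeta command.
bin_lengths() {
    for f in "$@"; do
        awk -v name="$f" '
            !/^>/ { total += length($0) }   # add up sequence lines, skip headers
            END   { printf "%s\t%d\n", name, total }
        ' "$f"
    done
}
```

Usage, assuming the bins are FASTA files in some directory `path/to/bins`: `bin_lengths path/to/bins/*.fa`. Bins with very small totals are the ones likely to be skipped.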

jtamames commented 11 months ago

Oh, true, I didn't notice metabat2 failing. Sorry about that.

sogada commented 11 months ago

Hello jtamames and fpusan, Thank you for your fast response. It's highly appreciated and commendable. I run Ubuntu 18 LTS. I have to admit, though, I used a small dataset for my test run. Let me run with more samples as suggested above and see what I'll get. I'll notify you once my run is complete. I will also try using the test data to rule out any data-related issues.

sogada commented 11 months ago

Hello jtamames and fpusan, I ran your test data and later my whole dataset. You were both right.

  1. The first issue was my small dataset. After running with my whole dataset, the pipeline ran to completion.
  2. There seems to be an issue with metabat2 failing on my machine (see attached photo). I got the same error with both my data and your test data. I think I can use CONCOCT and MaxBin as alternatives. Does metabat2 offer any significant advantages over the others?

fpusan commented 11 months ago

I think it should be OK to use CONCOCT+MaxBin and combine them with DAS Tool. The issue with metabat2 can be fixed if you recompile from source as described in #692, but you should be fine without it.
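As a rough sketch, choosing CONCOCT and MaxBin for a new run might look like the command printed below. It assumes the `-binners` option takes a comma-separated list with the lowercase names `concoct` and `maxbin`. The project name, samples file and raw reads directory are placeholders, so confirm the options with `SqueezeMeta.pl -h` for your version.

```shell
# Placeholders for illustration only; replace with your own values.
PROJECT="myproject"
SAMPLES="samples.txt"
RAWDIR="raw_reads"

# Print a coassembly command that skips metabat2.
# Assumption: -binners accepts a comma-separated list such as concoct,maxbin.
binning_cmd() {
    echo "SqueezeMeta.pl -m coassembly -p $PROJECT -s $SAMPLES -f $RAWDIR -binners concoct,maxbin"
}

binning_cmd
```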

sogada commented 11 months ago

Thanks a lot for your help.