jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

A binning issue #656

Closed: blaizereal closed this 1 year ago

blaizereal commented 1 year ago

At binning step 14 I ran into the following issues, checked across 4-5 samples (I tried using MetaBAT2 and MaxBin instead of CONCOCT, but it didn't help). Manually creating the missing "concoct_int" subfolder did not move things forward either :(

[30 minutes, 15 seconds]: STEP14 -> BINNING: 14.runbinning.pl
sh: 1: cannot create /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/concoct/concoct_int/clustering_merged.csv: Directory nonexistent
wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/concoct/: Is a directory
wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/metabat2/: Is a directory
[30 minutes, 45 seconds]: STEP15 -> DAS_TOOL MERGING: 15.dastool.pl
Error running command:    LD_LIBRARY_PATH=/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/lib PATH=/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/bin:$PATH /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/bin/DAS_Tool/DAS_Tool -i  -l  -c /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/01.test2.fasta --write_bins 1 --score_threshold 0 --search_engine diamond -t 32 -o /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/DAS/test2 --db_directory /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/DB/db at /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/15.dastool.pl line 94.
mv: cannot stat '/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/DAS/test2_DASTool_bins/*': No such file or directory
wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/bins/: Is a directory
WARNING: File /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/bins/ is empty!. DAStool did not generate results
Skipping BIN TAX ASSIGNMENT: DAS_Tool did not predict bins.
Skipping CHECKM: DAS_Tool did not predict bins.
Skipping BIN TABLE CREATION: (You already know: DAS_Tool did not predict bins.)
[30 minutes, 45 seconds]: STEP19 -> CREATING CONTIG TABLE: 19.getcontigs.pl
  Reading taxa for contigs information...done!
  Reading GC & length... done!
  Reading number of genes... done!
  Reading coverages... done!
  Reading bins... done!
  Creating contig table...done!
============
CONTIG TABLE CREATED: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/19.test2.contigtable
============

Skipping MINPATH: DAS_Tool did not predict bins.

[30 minutes, 45 seconds]: STEP21 -> MAKING FINAL STATISTICS: 21.stats.pl
  Output in /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/21.test2.stats

Deleting temporary files in /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/temp
[30 minutes, 46 seconds]: FINISHED -> Have fun!

WARNINGS:
DAS Tool abnormal termination: file /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/bins/ is empty. There are NO BINS!

Thanks in advance for any comments!

Attachment: syslog.zip

fpusan commented 1 year ago

It seems DAStool had a problem.

Can you run

LD_LIBRARY_PATH=/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/lib PATH=/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/bin:$PATH /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/bin/DAS_Tool/DAS_Tool -i -l -c /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/01.test2.fasta --write_bins 1 --score_threshold 0 --search_engine diamond -t 32 -o /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/DAS/test2 --db_directory /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/DB/db

and see what the error is?

jtamames commented 1 year ago

I think your dataset is not very big, so the binners were not able to produce any bins. Could you paste the stats from the 21.stats file? Can you try again with the Hadza data? Best, J

blaizereal commented 1 year ago

It seems DAStool had a problem.

Can you run

LD_LIBRARY_PATH=/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/lib PATH=/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/bin:$PATH /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/bin/DAS_Tool/DAS_Tool -i -l -c /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/01.test2.fasta --write_bins 1 --score_threshold 0 --search_engine diamond -t 32 -o /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/DAS/test2 --db_directory /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/DB/db

and see what the error is?

Gives this message: scaffolds2bin file not found: -l Aborting.
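That abort is consistent with the logged command: the `-i` (comma-separated contig-to-bin tables) and `-l` (matching labels) arguments are empty, so DAS_Tool reads the literal string `-l` as a table path. A minimal sketch of a guard that would keep those flags from being built empty; `build_dastool_args` and the `.tsv` naming are assumptions for illustration, not SqueezeMeta code:

```shell
#!/bin/sh
# Hypothetical helper: assemble DAS_Tool's -i (contig-to-bin tables)
# and -l (labels) arguments, skipping any table that is missing or
# empty so the flags are never passed blank.
build_dastool_args() {
    tables=""
    labels=""
    for t in "$@"; do
        label=$(basename "$t" .tsv)          # label derived from file name
        if [ -s "$t" ]; then                 # -s: exists and is non-empty
            tables="${tables:+$tables,}$t"
            labels="${labels:+$labels,}$label"
        else
            echo "skipping $t (missing or empty)" >&2
        fi
    done
    echo "-i $tables -l $labels"
}
```

If every binner table is skipped, the printed flags stay empty, which reproduces the failure above: DAS_Tool receives `-i -l -c ...` and aborts with "scaffolds2bin file not found: -l".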

blaizereal commented 1 year ago

I think your dataset is not very big and was not able to produce any bins. Could you paste the stats in the 21.stats file? Can you try again the Hadza data? Best, J

Yeah, it is a smaller dataset, but I can generate visuals and tables with SQMtools (as you can see in the attached images), so there must be something there :) Anyway, I'm going to try a species-rich sample...

[three attached images: SQMtools plots and tables]

blaizereal commented 1 year ago

Other observations:

wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/concoct/: Is a directory
wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/metabat2/: Is a directory
[16 seconds]: STEP14 -> BINNING: 14.runbinning.pl
wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/concoct/: Is a directory
sh: 1: cannot create /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/concoct/concoct_int/clustering_merged.csv: Permission denied
wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/concoct/: Is a directory
wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/metabat2/: Is a directory
wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/metabat2/: Is a directory
wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/bins/: Is a directory
[46 seconds]: STEP15 -> DAS_TOOL MERGING: 15.dastool.pl
rm: remove write-protected regular file '/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/DAS/test2_DASTool.log'? y
rm: cannot remove '/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/bins/*': No such file or directory
Error running command:    LD_LIBRARY_PATH=/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/lib PATH=/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/bin:$PATH /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/bin/DAS_Tool/DAS_Tool -i  -l  -c /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/01.test2.fasta --write_bins 1 --score_threshold 0 --search_engine diamond -t 32 -o /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/DAS/test2 --db_directory /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/DB/db at /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/15.dastool.pl line 94.
mv: cannot stat '/media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/intermediate/binners/DAS/test2_DASTool_bins/*': No such file or directory
wc: /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/bins/: Is a directory
WARNING: File /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/results/bins/ is empty!. DAStool did not generate results
Skipping BIN TAX ASSIGNMENT: DAS_Tool did not predict bins.
Skipping CHECKM: DAS_Tool did not predict bins.
Skipping BIN TABLE CREATION: (You already know: DAS_Tool did not predict bins.)

Thanks: B

fpusan commented 1 year ago

Thanks for the info! I wonder why you don't have writing permissions there. Of course if you don't have writing permissions in the output directory then things are not going to work fine, but if that was the case then the pipeline would have failed before. Can you check the owner and write permissions in /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/ and its subdirectories? What kind of drive is this?

fpusan commented 1 year ago

Did you by any chance run some of the commands as a super-user?

blaizereal commented 1 year ago

Thanks for the info! I wonder why you don't have writing permissions there. Of course if you don't have writing permissions in the output directory then things are not going to work fine, but if that was the case then the pipeline would have failed before. Can you check the owner and write permissions in /media/ngs_lab/6.4Tb-NVRAM/SqueezeMeta-1.6.2/scripts/test2/ and its subdirectories? What kind of drive is this?

Yes, I am a superuser. The problem is that CONCOCT creates this folder with root-user permissions (check the attached image), which is very interesting. Also, it is a local M.2 drive, with no permission issues until now.

[attached image: folder ownership/permissions]
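To see exactly which files in the project tree ended up root-owned, one option is POSIX `find` with `! -user`. A small sketch; `list_foreign_owned` is a made-up name and the directory argument is whatever project path you want to audit:

```shell
#!/bin/sh
# Sketch: list files and directories under a tree that are NOT owned
# by the current user (root-owned CONCOCT leftovers would show up here).
list_foreign_owned() {
    dir=$1
    # -exec ... + batches matches; prints nothing if everything is
    # owned by the current user.
    find "$dir" ! -user "$(id -un)" -exec ls -ld {} +
}
```

Any root-owned leftovers it reports can be handed back to the normal account with something like `sudo chown -R "$USER": /path/to/project` before rerunning the pipeline without sudo.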

fpusan commented 1 year ago

OK, I think that explains it, and potentially also the issues with Step 10 that you reported before (I am nevertheless running the test data in an Ubuntu 22 VM to make sure there's nothing wrong). Can you run SqueezeMeta without super-user privileges? In general I would advise not running anything as a super user unless you really, really need to.

blaizereal commented 1 year ago

OK, I will try running SqueezeMeta as a normal user. Also, changing permissions on the CONCOCT intermediate files to 777 resulted in an empty clustering_merged.csv file, which is why the bins are empty. Thanks!
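The 777 experiment is informative: even with permissions opened up, clustering_merged.csv itself was empty, so downstream steps had nothing to merge. A sketch of the kind of non-empty check the pipeline's own warning implies (the file name comes from the log above; `require_nonempty` is a hypothetical helper):

```shell
#!/bin/sh
# Sketch: verify a binner output file exists and has content before
# letting downstream steps (e.g. DAS_Tool merging) run on it.
require_nonempty() {
    f=$1
    if [ -s "$f" ]; then
        # tr strips the padding some wc implementations add
        echo "ok: $f ($(wc -l < "$f" | tr -d ' ') lines)"
        return 0
    else
        echo "error: $f is missing or empty" >&2
        return 1
    fi
}
```

Run against intermediate/binners/concoct/concoct_int/clustering_merged.csv, this would have flagged the problem at step 14 rather than at the DAS_Tool merge.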

fpusan commented 1 year ago

Just confirmed that the pipeline (including binning) works OK in a clean Ubuntu 22.04.2 VM, so the issue probably had to do with running the pipeline as a super user.

blaizereal commented 1 year ago

Thanks for the information; I will be sure to test the pipeline further with your suggested user settings.