jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

Issue with step 6 #733

Closed DrJoshVandenbrink closed 8 months ago

DrJoshVandenbrink commented 9 months ago

Hello,

Thanks for making this wonderful tool!

I am having issues with step 6 (LCA). I am running coassembly mode on my workstation (32 GB RAM). However, once it reaches step 6 (LCA), the terminal closes and the analysis stops. I've run the same data in sequential mode previously, but am looking to reduce the number of unmapped reads by using coassembly and a smaller contig size (150 bp). I've attached my syslog below. Any help would be greatly appreciated!

syslog.txt

jtamames commented 9 months ago

Hello,

Thanks for using SqueezeMeta! From your log, I see that step 6 is opening 13 threads when just 12 are expected. I guess your problem has something to do with that. I will look into it and get back to you as soon as possible.

Best, J

DrJoshVandenbrink commented 9 months ago

Thanks for your help!

jtamames commented 9 months ago

Hello,

Try changing line 123 in the script 06.lca.pl, located in the scripts directory of your SqueezeMeta installation. Change:

my $splitlines=int($wc/$numthreads);

to

my $splitlines=ceil($wc/$numthreads);

And restart step 06.

Best, J
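To illustrate why the rounding matters here, the following is a minimal sketch in shell arithmetic, not the actual 06.lca.pl logic. It assumes the input is cut into pieces of $splitlines lines each and that one thread is opened per piece. The total of 1000 lines is a made-up figure, and floor_div, ceil_div, and num_chunks are hypothetical helper names.

```shell
# Illustration only: how rounding the chunk size changes the number of chunks.
# Assumption: one thread is started per chunk of $splitlines lines.

floor_div() { echo $(( $1 / $2 )); }              # like Perl's int($a/$b) for positive values
ceil_div()  { echo $(( ($1 + $2 - 1) / $2 )); }   # like POSIX ceil($a/$b) for positive values
num_chunks() { ceil_div "$1" "$2"; }              # chunks needed to hold all lines

total=1000     # hypothetical number of lines in the input
threads=12     # number of threads requested

floor_size=$(floor_div "$total" "$threads")
ceil_size=$(ceil_div "$total" "$threads")

echo "floor: chunk size $floor_size -> $(num_chunks "$total" "$floor_size") chunks"
echo "ceil:  chunk size $ceil_size -> $(num_chunks "$total" "$ceil_size") chunks"
# prints:
# floor: chunk size 83 -> 13 chunks
# ceil:  chunk size 84 -> 12 chunks
```

With the rounded-down chunk size, the leftover lines spill into a 13th chunk, which would match the extra thread seen in the log under the assumption above.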

DrJoshVandenbrink commented 9 months ago

Hello,

I changed 06.lca.pl to what you suggested, and I added "use POSIX;" to the packages list to get the ceil() function to work. However, the terminal is still force-closing, and I find it not running when I return to my workstation. Is it a memory issue? I've attached the updated syslog.

Thanks again for all your help in this matter!

Cheers, Josh

syslog.txt

jtamames commented 9 months ago

Hmm... we had an issue with memory and threads not so long ago. Let's try reducing the number of threads. Edit SqueezeMeta_conf.pl in the project directory and change $numthreads = 12; to $numthreads = 6;

Then restart step 06 again.
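The edit above can also be scripted. This is only a sketch: it assumes the line in SqueezeMeta_conf.pl has exactly the form `$numthreads = 12;` quoted above, and set_numthreads is a hypothetical helper, not part of SqueezeMeta.

```shell
# Hypothetical helper: rewrite the $numthreads setting in a config file.
# Assumes the setting sits at the start of a line as "$numthreads = <number>;".
set_numthreads() {
    conf=$1
    threads=$2
    # Write to a temporary file first so the original is only replaced on success.
    sed 's/^[$]numthreads[[:space:]]*=[[:space:]]*[0-9][0-9]*;/$numthreads = '"$threads"';/' \
        "$conf" > "$conf.tmp" &&
        mv "$conf.tmp" "$conf"
}

# Example call (the path is a placeholder):
# set_numthreads /path/to/project/SqueezeMeta_conf.pl 6
```

Keeping a backup copy of the config file before running something like this is a sensible precaution.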

DrJoshVandenbrink commented 9 months ago

I changed the number of threads to 6, and it still force closes the terminal without completing. Any other ideas?

jtamames commented 9 months ago

Could you send your diamond file to me? I can try to reproduce the error. You can upload it to WeTransfer or similar.

Best, J

DrJoshVandenbrink commented 9 months ago

Sure!

I'm new to wetransfer, but I've created an account. Should I upload it to my account, or to yours?

jtamames commented 9 months ago

Yours, and send me a link to download. Thanks!

DrJoshVandenbrink commented 9 months ago

I was unsure which diamond file to include, as I have nr.diamond, kegg.diamond and eggnog.diamond, so I included all three. This is the link for the files:

https://we.tl/t-peHCr4sUak

DrJoshVandenbrink commented 9 months ago

Hello,

I was wondering if you got the files that I posted on wetransfer.

jtamames commented 9 months ago

Hello,

Yes, I got them. I ran step 06 using your diamond file, and it worked fine with no issues, producing annotations for 2.3 M ORFs. I used the current version, 1.6.3. Yours is probably older, but I don't think that is the cause of your problems. I would say it is related to RAM usage on your system. We will dig into this. In the meantime, would you like me to share the resulting files with you, so that you can go on with the analysis?

Best, J

DrJoshVandenbrink commented 9 months ago

That would be great!

jtamames commented 9 months ago

Here you go: https://we.tl/t-S8CYD9OIQq

Just copy these into the results directory and restart at step 07. I generated them with the new version 1.6.3, which includes a new database with the new phylum names following the ICNP guidelines (https://ncbiinsights.ncbi.nlm.nih.gov/2022/11/14/prokaryotic-phylum-name-changes/). That could cause some trouble if you are using an old database (I'm not sure of this, but I guess so). It would be great if you could update your SqueezeMeta database, to stay up to date and avoid unexpected behavior.

Best, J

DrJoshVandenbrink commented 8 months ago

Hi Javier,

The analysis ran all the way through. However, now when trying to load the data in R, I receive this error:

Your project was created with a version of SqueezeMeta prior to 1.5. Running utils/versionchange.pl might fix this.

I am assuming this has to do with your previous comment about the database being outdated. I ran Co-assembly mode on SqueezeMeta version 1.6.2, and have SQMtools 1.6.3, so I am not sure where it is implying that I ran a version of SqueezeMeta prior to 1.5. Any thoughts?

Appreciate your help! I feel like I am so close!

Cheers, Josh

fpusan commented 8 months ago

Here SQMtools is checking whether a file named 20.*.contigtable is present in your project/results directory. If so, it complains about the version being older than 1.5, since now the file is supposed to be named 19.*.contigtable instead. What's the output of ls /path/to/project/results?
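The check described above can be approximated from the shell. This is a rough sketch of the same idea, not SQMtools' actual code; contigtable_layout is a hypothetical name and the labels it prints are made up for illustration.

```shell
# Hypothetical helper: report which contig table naming a results directory uses.
contigtable_layout() {
    dir=$1
    # A 20.*.contigtable file is what triggers the "prior to 1.5" message.
    for f in "$dir"/20.*.contigtable; do
        if [ -e "$f" ]; then echo "pre-1.5"; return 0; fi
    done
    # A 19.*.contigtable file is the name expected now.
    for f in "$dir"/19.*.contigtable; do
        if [ -e "$f" ]; then echo "current"; return 0; fi
    done
    echo "missing"
}

# Example call (the path is a placeholder):
# contigtable_layout /path/to/project/results
```

Because the 20.* name is tested first, a directory containing both files would still be reported as "pre-1.5", in line with the check as described.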

DrJoshVandenbrink commented 8 months ago

(base) josh@josh-B560-DS3H-AC-Y1:~/mambaforge/envs/SqueezeMeta/SqueezeMeta/scripts/MetaCo2/results$ ls
01.MetaCo2.fasta                       10.MetaCo2.mappingstat
02.MetaCo2.16S.txt                     11.MetaCo2.mcount
02.MetaCo2.rnas                        12.MetaCo2.cog.funcover
02.MetaCo2.trnas                       12.MetaCo2.kegg.funcover
02.MetaCo2.trnas.fasta                 13.MetaCo2.orftable
03.MetaCo2.faa                         18.MetaCo2.bintable
03.MetaCo2.fna                         20.MetaCo2.contigtable
03.MetaCo2.gff                         20.MetaCo2.kegg.pathways
06.MetaCo2.fun3.tax.noidfilter.wranks  20.MetaCo2.metacyc.pathways
06.MetaCo2.fun3.tax.wranks             21.MetaCo2.stats
07.MetaCo2.fun3.cog                    bins
07.MetaCo2.fun3.kegg                   tables
07.MetaCo2.fun3.pfam

fpusan commented 8 months ago

It would seem that the steps after 7 did not run. Is this the case?

DrJoshVandenbrink commented 8 months ago

syslog_MetaCo2.txt

It appears that it did not. And I don't see that step 8 ran either. I used the diamond files you provided and --restart step6 as my command to continue my analysis.

DrJoshVandenbrink commented 8 months ago

Sorry, I restarted step 7

DrJoshVandenbrink commented 8 months ago

Okay so I just re-ran the contig table script, and I am loading the data!

However, should I be concerned that step 7/8 did not appear to finish normally?

fpusan commented 8 months ago

Step 8 won't run unless you added the -D flag, so that's fine. But you seem to be missing all the other steps too.

DrJoshVandenbrink commented 8 months ago

That's odd, because I did generate results. Here is the top of my contig table: head_contig_table.txt

fpusan commented 8 months ago

OK, and judging by the syslog, the pipeline seems to have run. I don't understand why the files are not there. For example, what's the output of ls /home/josh/mambaforge/envs/SqueezeMeta/SqueezeMeta/scripts/MetaCo2/results/13*?

DrJoshVandenbrink commented 8 months ago

The results of that command are:

/home/josh/mambaforge/envs/SqueezeMeta/SqueezeMeta/scripts/MetaCo2/results/13.MetaCo2.orftable

fpusan commented 8 months ago

So the file is actually there; maybe something went wrong with the ls command you ran before. Can it be loaded in SQMtools now?

DrJoshVandenbrink commented 8 months ago

Yes, it all seems to be working fine now! Thanks for all your help, I really appreciate it!

fpusan commented 8 months ago

Glad to hear! Closing the issue.