jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0
346 stars 81 forks source link

repeat lca assigment with custom diamond file #797

Closed bresyd closed 4 months ago

bresyd commented 4 months ago

Hi,

I used sqm_annot.pl to annotate one of my genomes. I was particularly interested in the taxonomic annotation, to learn about the taxonomic origin of some particular genes in this genomes (some interesting genes were inherited horizontally). But there is one close relative of my genome already present in nr which is skewing my results. Because for most proteins, the best match to the close relative is considerably higher than all other matches, most of my proteins get the lca annotation matching the taxonomy of the close relative. What I would like to get though, is the the lca annotation not considering the matches to the close relative. What I have done so far is removing all the lines in the 04.xxx.nr.diamond file that pertain to matches to the close relative. My question now is how can I run the lca.pl script using this custom diamond file as input? I tried replacing the original 04.xxx.nr.diamond file with the custom made one and rerunning sqm_annot.pl file (after previously removing all files from the results directory), but sqm_annot.pl can not be run again using an existing project name. Do you have any suggestions for a workaround that could work for me?

Thanks a lot for your help

jtamames commented 4 months ago

Hello What you suggest is the easiest way to solve your issue. For skipping the project name check, edit SqueezeMeta.pl and comment lines 243-244:

if (($mode!~/sequential$/i) && (-d $projectdir) && (!$restart)) { print RED; print "Project name $projectdir already exists. Please remove it or change the project name\n"; print RESET; die; } 
elsif(!$restart && $mode ne "sequential") { system("mkdir $projectdir"); }

Also, comment lines 136-138 in sqm_annot.pl, for not redoing the Diamond:

$command="perl $scriptdir/04.rundiamond.pl $project $notax $blastmode";
    my $ecode = system $command;
    if($ecode!=0) { catch_error(); }

Run sqm_annot.pl, and that should do it. Don´t forget to undo all these changes when you finish. Best, J

jtamames commented 4 months ago

Indeed, it would be enough if, in addition to commenting 136-138 in sqm_annot.pl, you also comment lines 112-113:

$command="perl $scriptdir/04.rundiamond.pl $project $notax $blastmode";
    my $ecode = system $command;
    if($ecode!=0) { catch_error(); }

and 123-124:

    my $ecode = system $command;
        if($ecode!=0) { catch_error(); }

In this way, you do not need to change SqueezeMeta.pl. Best, J

bresyd commented 4 months ago

Hi,

with the changes to the sqm_annot.pl script this worked perfectly. Thanks again for the as always great support.

cheers

jtamames commented 4 months ago

Great! Closing issue