jtamames / SqueezeMeta

A complete pipeline for metagenomic analysis
GNU General Public License v3.0

External database - CAZy Annotations missing. #819

Open da-shanmugapriya opened 7 months ago

da-shanmugapriya commented 7 months ago

Hi, I have successfully added an external database (CAZy) to the SqueezeMeta pipeline (using SqueezeMeta v1.6.3). However, the annotations are missing from the files 12.{sample}.CAZyDB.funcover and 13.{sample}.orftable.

Can you help me resolve this? I have attached the files for your perusal.

The zipped folder contains mydb.list, the annotation file, the syslog, and the results: CAZyDB_issue.zip

da-shanmugapriya commented 7 months ago

Hi team, I followed the SqueezeMeta manual and tried the other methods mentioned in the Issues section, but I was unable to resolve the problem. Any help from your side would be greatly appreciated. Thanks

fpusan commented 7 months ago

Hi, thanks for the detailed report. It seems that the pipeline is assigning CAZy IDs to your ORFs, but the names of the CAZy functions are not appearing in table 13. From what I can tell, your CAZyDB.Annotation.tsv is formatted correctly and, as expected, there is some overlap between the IDs in the CAZyDB.Annotation.tsv file and the IDs in table 13. But table 13 is indeed missing the function names. We'll look into this.

jtamames commented 7 months ago

Hello! I have located a small mistake in script 12 that prevents the extdb annotations from being written. But everything seems OK in script 13, and it is working for me with some other data. May I ask which SqueezeMeta version you are using? Best, J

jtamames commented 7 months ago

Btw, replace your script 12 with this one, and run it on your existing project. Please check whether the 12*CAZy file is fixed that way.

Best, J

12.funcover.pl.gz

da-shanmugapriya commented 7 months ago

Hi J,

I'm running my projects with SqueezeMeta 1.6.3 (September 2023). After replacing the old script with 12.funcover.pl and trying it on two projects (one existing and one new), I still faced the same issue and received the same outputs. I have attached the results of the run for your review.

CAZy_issue_reply.zip

Thanks

jtamames commented 7 months ago

Hello, I cannot reproduce the problem. I tried running one test result with your CAZy annotation file, and it works in both 12 and 13. See for instance the output in 12:

Screenshot from 2024-04-04 11-11-27

So I am a bit lost here... Sorry to ask, but can you check that the path for the annotation file, /home/ubuntu/Downloads/db/Squeezemeta_cazy/CAZyDB.Annotation.tsv, is correct and that the file is there? Then, could you uncomment line 78 of script 12: print "*$dbname*$t[0]*path*$funs{$dbname}{$t[0]}{path}*$funs{$dbname}{$t[0]}{fun}*\n"; and then run script 12 again and tell me the result? You should see a list of CAZys and annotations. Best, J
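The path check requested above can also be run outside the pipeline. The following is a minimal Python sketch, not part of SqueezeMeta. It assumes the external database list file is tab-delimited with three fields per line (database name, path to the database, path to the annotation file); the function name `check_extdb_list` is hypothetical.

```python
import os


def check_extdb_list(list_path):
    """Return (line_number, value, problem) tuples for suspicious lines.

    Assumption: each non-empty line has three tab-separated fields:
    database name, database path, annotation file path.
    """
    problems = []
    # newline="" keeps any carriage returns so they can be detected.
    with open(list_path, newline="") as handle:
        for lineno, raw in enumerate(handle, start=1):
            line = raw.rstrip("\n")
            if not line.strip():
                continue  # skip blank lines
            fields = line.split("\t")
            if len(fields) < 3:
                problems.append((lineno, line, "expected 3 tab-separated fields"))
                continue
            annot = fields[2]
            if annot != annot.strip():
                # Trailing spaces or a Windows-style \r make the path unusable.
                problems.append(
                    (lineno, repr(annot), "stray whitespace or carriage return around the path")
                )
            elif not os.path.isfile(annot):
                problems.append((lineno, annot, "annotation file not found"))
    return problems
```

Calling `check_extdb_list("mydb.list")` returns an empty list when every annotation path is clean and points to an existing file; otherwise each tuple names the line to look at.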

da-shanmugapriya commented 7 months ago

Hi J,

I have verified that the path to the annotation file is correct and the file is present. I successfully got the annotations by executing the code with a hard-coded path value of $listf = "/home/ubuntu/Downloads/db/Squeezemeta_CAZy/CAZyDB.Annotation.tsv". However, I am uncertain why the annotation file path is not picked up from my_db.list.tsv.

Thanks
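When a hard-coded path works but the value read from the list file does not, comparing the two strings directly can narrow things down. Note that the two paths quoted in this thread differ in the case of one directory name (Squeezemeta_cazy vs. Squeezemeta_CAZy); whether that is the actual cause is not confirmed here. The sketch below is a hypothetical Python helper (`describe_path_mismatch` is not a SqueezeMeta function) that classifies the difference between two path strings.

```python
def describe_path_mismatch(from_list, known_good):
    """Classify how a path read from the list file differs from a working one."""
    if from_list == known_good:
        return "identical"
    if from_list.strip() == known_good:
        # Invisible characters such as trailing spaces, tabs, or \r.
        return "differs only by surrounding whitespace: " + repr(from_list)
    if from_list.lower() == known_good.lower():
        # Matters on case-sensitive file systems such as those typical on Linux.
        return "differs only by letter case"
    return "different paths"


# The two paths as they appear in this thread.
print(
    describe_path_mismatch(
        "/home/ubuntu/Downloads/db/Squeezemeta_cazy/CAZyDB.Annotation.tsv",
        "/home/ubuntu/Downloads/db/Squeezemeta_CAZy/CAZyDB.Annotation.tsv",
    )
)
```

Printing the value with `repr()` is a deliberate choice: it makes trailing whitespace and carriage returns visible, which a plain print would hide.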