tseemann / prokka

:zap: :aquarius: Rapid prokaryotic genome annotation
831 stars 226 forks source link

Replace/Drop `tbl2asn` #444

Open apeltzer opened 4 years ago

apeltzer commented 4 years ago

Hi Torsten!

after seeing the usual 6-months "automated" failure of tbl2asn, which is probably something you can't change unfortunately due to upstream decisions by NCBI, I'm opening this issue to discuss whether it would be worth dropping that tool out of prokka entirely.

Your tool does an amazingly good job on annotation but more and more users might want to bundle and use your method e.g. for pipelines but have issues when the tbl2asn method runs out after latest 6 months. Having a reproducible pipeline that does certain things again and again when I go back to it in 2, 3, 4 years shouldn't be counterfeited by a single small tool with such a small footprint IMHO.

Opinions on this might vary I agree, but maybe it's worth the discussion seeing that you also have multiple users trying to use older prokka versions from time to time that don't work anymore due to tbl2asn failure.

Happy to help with at least some of this - as it keeps being annoying for everyone.

tseemann commented 4 years ago

You aren't the firtst to suggest this, and I agree it needs to be made at least optional.

For those using a packaging system, conda update prokka and brew upgrade prokka solve the problem, but it is not elegant and fails inside Docker containers and accredited pipelines which are fixed.

I'm just needing to find the time to see what not having GBK output does to other parts of the pipeline. Probably not much. I was hoping to save this until Prokka 2.0 but i might just need to "bite the bullet".

apeltzer commented 4 years ago

Thanks for the reply!

I know that it works fine using conda or brew, but whenever someone packages prokka with tbl2asn inside a container as you mentioned, this breaks after a while.

There were discussions coming up to replace it here, for example, https://github.com/nf-core/bacass/issues/12 , so I thought maybe I should first at least write an issue on Github ;-)

tseemann commented 4 years ago

It's clear why the author(s) of tbl2asn don't reveal themselves, because if they did, they would be confronted with an angry mob of bioinformaticians! :-)

EricDeveaud commented 4 years ago

maybe this can help.

you can disable the date checking of tbl2asn (also same procedure may apply to sequin)

c6builder:ncbitools/ncbitools-20170106 > cat disable_peremption_date.patch 
--- tbl2asn.c.ori       2019-11-28 16:43:52.679495000 +0100
+++ tbl2asn.c   2019-11-28 16:44:14.523273000 +0100
@@ -8623,10 +8623,10 @@
     return 1;
   }

-  if (MoreThanYearOld ()) {
-    too_old = TRUE;
-    Message (MSG_POST, "This copy of tbl2asn is more than a year old.  Please download the current version.");
-  }
+//  if (MoreThanYearOld ()) {
+//    too_old = TRUE;
+//    Message (MSG_POST, "This copy of tbl2asn is more than a year old.  Please download the current version.");
+//  }

   /* process command line arguments */

@@ -9324,9 +9324,9 @@
     return 1;
   }

-  if (too_old) {
-    return 1;
-  }
+//  if (too_old) {
+//    return 1;
+//  }

   return 0;
 }
tseemann commented 4 years ago

Wow - nice find!

Unfortunately I don't think the brew or conda versions of tbl2asn compile from source, so this patch can not me made until someone alters those recipes to build from source with that patch.

boegel commented 4 years ago

This is so annoying... 🤦‍♂

Are the sources for tbl2asn actually publicly somewhere? I can't see them anywhere on https://www.ncbi.nlm.nih.gov/genbank/tbl2asn2/?

I don't mind compiling this from source (& patching it to avoid the error in N months)...

apeltzer commented 4 years ago

I also looked for them but couldn't find the sources anywhere unfortunately. Same here, but gave up after not finding the sources :-(

boegel commented 4 years ago

FWIW: there's apparently a new "release" (25.8) yesterday.

I hate how they do in-place updates without properly versioning the tarball names too btw, grr...

boegel commented 4 years ago

I also looked for them but couldn't find the sources anywhere unfortunately. Same here, but gave up after not finding the sources :-(

OK, but apparently @EricDeveaud knows where to find the sources?

EricDeveaud commented 4 years ago

got tbl2asn compiled from ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/old/20170106/ncbi.tar.gz maybee there is a more recent version. I did not checked if the C++ version of ncbitools embeds tbl2asn

Best regards

Eric

EricDeveaud commented 4 years ago

just checked, seems new C++ code contains tbl2asn also.ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools++/CURRENT/ncbi_cxx--22_0_0.tar.gz it has a src/app/table2asn/table2asn.cpp file

But I did not manage to compile this version, and did not check if it also suffers from the "peremption date"

Eric

cjfields commented 4 years ago

I suspect that development is moving to table2asn_GFF (which is an odd name, it takes GFF and converts to ASN1):

https://www.ncbi.nlm.nih.gov/genbank/genomes_gff/

This does work better than tbl2asn, but it's also not perfect (and I think you'd just be trading one pain for another).

standage commented 4 years ago

Commenting to follow this thread. This just made Prokka unusable on my systems.

peterjc commented 4 years ago

Apparently conda are shipping a wrapped version of tbl2asn which temporarily resets the date to trick the tool:

galaxyproject/tools-iuc#1707 (comment) https://github.com/audy/tbl2asn-forever

EricDeveaud commented 4 years ago

sad they introduce a dependency (libefaketime) instead fo just patchin tbl2asn code but easy solution to allow anyone with non patched tbl2asn to run

Eric

peterjc commented 4 years ago

See https://github.com/bioconda/bioconda-recipes/pull/20117 changing the prokka bioconda package to use tbl2asn-forever as a workaround (done).

zdk123 commented 4 years ago

FYI, I've reported a bug in tbl2asn-forever on MacOS (wrong exit code) but it hasn't been addressed. https://github.com/audy/tbl2asn-forever/issues/2 This caused some issues for me in running prokka specifically.

rpetit3 commented 4 years ago

Hi everyone!

I've updated tbl2asn-forever to 25.7.2f to work properly on Mac. I've also updated the Bioconda (https://github.com/bioconda/bioconda-recipes/pull/24110) recipe with a better test and libfaketime builds.

Please let me know if there are any issues. Would have gotten to it sooner, but only saw this yesterday.

zdk123 commented 4 years ago

No worries, this wasn't a show stopping bug. Thanks for taking the time to fix the issue!