Open bgruening opened 9 years ago
+1
A problem we've had with tbl2asn for the Prokka wrapper is that the binary is versioned internally, but the file on the FTP is just overwritten without changing its name. Moreover, GenBank often raises the minimum version required, so it needs to be updated frequently.
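Since the file name never changes, one way to notice that the binary was replaced is to compare checksums between downloads. This is only a minimal sketch under that assumption; the helper names `sha256_of` and `has_changed` and the digest file are hypothetical, not part of any existing wrapper.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large binaries are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path, digest_file):
    """Compare the current digest with the recorded one, then record it.

    Returns True when no digest was recorded yet or when it differs.
    """
    current = sha256_of(path)
    record = Path(digest_file)
    previous = record.read_text().strip() if record.exists() else None
    record.write_text(current + "\n")
    return previous != current
```

Calling `has_changed` after each download of the binary would report whether the content differs from the previous download, regardless of the unchanged file name.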
@nsoranzo yes this is always a problem, but for our pipeline we need this. Btw, some time ago I worked on your missing Perl libraries for Prokka. Are you interested in using them (if they are working)?
@bgruening I'm not working on Prokka at the moment, it's in the hands of CRS4 people. You may try to do a pull request, but they are not using tool_dependencies.xml to install tools (and I'm not either).
I have an idea for storing versioned copies of tbl2asn, I'll make sure we have copies available. @bgruening do you have wrappers written for it already?
Not yet, but storing versioned copies is not the problem, I guess. The NCBI only accepts sequin (or similar) files that are processed with a recent version. For example, they ship some kind of word-blastlist in this binary to check for spelling mistakes or wrongly annotated genomes/proteins. @erasche haven't started yet, sorry.
yep, that's a problem for people updating their galaxy :) (and us having automated package updates when new versions of tbl2asn come out)
@bgruening https://github.com/galaxyproject/docker-build/blob/v2/tbl2asn/default/build.yml versioned tbl2asn packages here. Once @natefoo gets docker-build running jobs on cron we'll just run this weekly/monthly, and I'll add a job to generate automated PRs against the tbl2asn package in Galaxy, and then that coupled with the automated TTS pushes mean...no stress for us! :)
(Hey, @natefoo, do you just want me to add jenkins jobs for building these packages? I'm happy to, if you can send me an SSH pubkey, I'll ensure built packages go in a single directory, and you can regularly pull from the IUC's build server into depot.)
@erasche, @nsoranzo the question is, do we need this? This tool is only useful in its most recent version; it is a dead-end tool, isn't it? We always need the latest version (?) Does it make sense to enable reproducibility for this tool by versioning binaries?
If we need this, why not copy tbl2asn builds from NCBI to depot every month/release?
You don't always need to have the latest version, but they get deprecated very fast. And when they are, you really have to update.
@bgruening no, we probably don't need versioned copies since old ones are useless. However, I feel like a (completely, 100% automated) updating of the version in the TS is preferable to the tool, on every run, checking if the binary is older than N days and if so fetching the latest.
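For reference, the per-run check being argued against would look roughly like this. It is a sketch only: `is_stale` is a hypothetical helper, and it uses the file's modification time as a stand-in for the binary's real age, which is an assumption.

```python
import os
import time


def is_stale(path, max_age_days, now=None):
    """Return True if `path` is missing or was modified more than `max_age_days` ago."""
    if not os.path.exists(path):
        return True
    # Allow the caller to pass a fixed timestamp, mainly for testing.
    now = time.time() if now is None else now
    age_seconds = now - os.path.getmtime(path)
    return age_seconds > max_age_days * 86400
```

A wrapper would call `is_stale(binary_path, N)` before each run and fetch a fresh copy when it returns True, which is exactly the network dependency at run time that an automated TS update avoids.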
@erasche agreed, but are you talking about a new package with every new binary? I would go with package_tbl2asn_latest or something like this. This has the advantage of not updating the tool version with every release.
Yeah, I was talking about a _latest that would get updated, not registering a new package_tbl2asn_$date, as that'd just be clutter. Would that work for you?
:+1: for package_tbl2asn_latest
Great, I'll make sure the IUC's jenkins bot can open PRs and set it up to trigger a "tbl2asn definition" update job whenever docker-build creates a new tbl2asn version.
@erasche Works for me. Public key is at https://github.com/natefoo.keys
@natefoo mind testing that you can log in and pull data? You should be able to ssh in as natefoo@gx.hx42.org, and you'll find jenkins will publish all produced files to the data/ directory in your home folder (/opt/depot/data/).
I'll set up jenkins/docker to build + place more binaries in there and ensure that they're versioned.
Two years later, but since this issue is still 'open'...
I would like to have tbl2asn as a standalone tool for use within various annotation and submission workflows. In checking to see if it exists I ran across this thread. However, it doesn't appear that tbl2asn itself was ever wrapped but rather included in other pipeline-based tools. Is this right? Am I duplicating anything existing if I wrap tbl2asn itself?
I see that tbl2asn is already in Bioconda. As for the versioning issues above (which still apply), in my opinion it's not too much to expect system admins to update their versions periodically. This could easily be done with a monthly cron job.
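A rough sketch of what such a cron-driven update could call, assuming tbl2asn was installed from the bioconda channel with conda. The function names and defaults here are hypothetical.

```python
import subprocess


def build_update_command(package="tbl2asn", channel="bioconda", env_name=None):
    """Build the `conda update` argument list for the given package."""
    cmd = ["conda", "update", "--yes", "--channel", channel]
    if env_name is not None:
        # Target a named conda environment instead of the active one.
        cmd += ["--name", env_name]
    cmd.append(package)
    return cmd


def update_package(**kwargs):
    """Run the update and return conda's exit code."""
    return subprocess.run(build_update_command(**kwargs), check=False).returncode
```

A small script that calls `update_package()` could then be scheduled with a crontab entry such as `0 3 1 * * python3 /path/to/update_tbl2asn.py` (the path is a placeholder) to run on the first day of each month.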
@jvolkening happy to accept a PR with this tool. BioConda will make it easier for us to maintain it. Let's close this once and for all.
Tbl2asn is a command-line program that automates the creation of sequence records for submission to GenBank. It uses many of the same functions as Sequin but is driven generally by data files. Tbl2asn generates .sqn files for submission to GenBank. Additional manual editing is not required before submission.
http://www.ncbi.nlm.nih.gov/genbank/tbl2asn2/