nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Is there a way to run funannotate update on existing output? #771

Open ceanothus opened 2 years ago

ceanothus commented 2 years ago

I am using funannotate v1.8.11. I left the update command running for 4 days and it did not finish before the allotted time ran out, taking much longer than expected (I have roughly 24k gene predictions). It did, however, finish the first PASA comparison and almost finished the second.

Is there any way to have funannotate update pick up where it left off? Are there line-by-line commands I can enter/insert into a script to do that? Or do I have to start all over again with more allotted time?

It's very frustrating because this is the second time it's happened, and the rate at which PASA seems to be working is changing depending on how trafficked our server is.

hyphaltip commented 2 years ago

Two things. Running PASA with a MySQL server will vastly speed it up,

and update should restart at a checkpoint if you just resubmit the job.

I assume you cannot set a longer runtime for your jobs to account for the time this step takes?

ceanothus commented 2 years ago

Thanks for getting back to me. I can set a longer runtime, but I can't extend the runtime when I see the job is about to run over; next time I'll be a little more generous when writing the script header.

Unfortunately it will be a whole process to get MySQL onto our server, so that isn't an option right now.

Will funannotate update only restart from checkpoints with MySQL? Or does that work with both MySQL and SQLite?

nextgenusfs commented 2 years ago

It will always reuse data if it exists, but each subroutine must finish for its output to be reused. It's not doing anything complex: it just skips a step if that step's output file exists. So you should be able to just resubmit the same job again.
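
For readers unfamiliar with this kind of restart behavior, here is a minimal Python sketch of the general "skip the step if its output file exists" pattern described above. It is not funannotate's actual code: `run_step` and its arguments are hypothetical names chosen for illustration. Writing to a temporary file and renaming it at the end is an added assumption, one common way to make sure that only a step that finished leaves behind a file that will be reused.

```python
import os


def run_step(output_path, compute):
    """Run compute() only if output_path is missing; return True if it ran.

    Hypothetical helper for illustration, not part of funannotate.
    """
    if os.path.exists(output_path):
        # Output from an earlier, completed run: reuse it and skip the work.
        return False
    tmp_path = output_path + ".tmp"
    with open(tmp_path, "w") as handle:
        handle.write(compute())
    # Rename only after the write succeeded, so an interrupted job
    # never leaves a partial file under the final name.
    os.replace(tmp_path, output_path)
    return True
```

With a helper like this, resubmitting the same job simply calls every step again; steps whose output already exists return immediately, and work resumes at the first step that never completed.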