nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Can we set up a point about pasa on update stage? #984

Closed sunnycqcn closed 4 months ago

sunnycqcn commented 7 months ago

Hello developer, PASA usually takes a long time to run, and sometimes it is interrupted by server maintenance. Could you add a checkpoint to this step, so that when we re-run the pipeline we do not need to rerun the whole thing? Thanks, Fuyou

hyphaltip commented 7 months ago

It should restart even if it failed, but running the mysql/mariadb version would be faster than the sqlite version.

sunnycqcn commented 7 months ago

Hello Jason, Thank you very much for replying. I know it is much faster with MySQL. However, I do not have permission to install MySQL on our server; I run funannotate under conda. Thanks, Fuyou

hyphaltip commented 4 months ago

The sqlite version should restart unless the database was corrupted when the job was interrupted. Let us know if you need more input on this. We run mysql in a singularity container on our cluster, which gives us the permissions needed to run it.
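For anyone who wants to rule out corruption before restarting, here is a minimal sketch that uses SQLite's built-in `PRAGMA integrity_check` through Python's standard `sqlite3` module. The function name is made up for illustration, and the location of the PASA SQLite file is not given in this thread, so the path is whatever your own run produced.

```python
import sqlite3


def sqlite_db_is_intact(db_path):
    """Return True if SQLite's own integrity check reports no problems."""
    try:
        # Open read-only via a URI so the check cannot create or modify the file.
        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    except sqlite3.Error:
        # For example, the file does not exist.
        return False
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError:
        # Raised when the file is not a readable SQLite database.
        return False
    finally:
        conn.close()
    # A healthy database yields exactly one row containing the string "ok".
    return rows == [("ok",)]
```

If this returns False, restarting on top of the same database file is unlikely to help, and starting that step from a clean database is probably the safer route.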

jasongallant commented 4 months ago

Hi Jason

Curious if you could provide more details about how you configure MySQL with the singularity container on your cluster? I'm currently running the pipeline on a fresh EC2 install with 32 processors, and have been running into tons of issues trying to make MySQL work properly. I realize the funannotate2 docker instance doesn't support MySQL out of the box...

Essentially, I had to go through each of the PASA scripts and manually add a "socket" parameter to get it to connect to MySQL, and change the default character set in MySQL to get it to work at all. Things hummed along pretty nicely for a while, but I'm currently stuck at the assemble_clusters.dbi portion of the PASA pipeline.

Ultimately, I think this is due to underlying inefficiencies in how threads are handled in this code. There seem to be some problems with how threading works that cause random segfaults. I just keep restarting and then it proceeds, but progress slows asymptotically as the chromosomes get larger and have more genes on them. I think that's a known issue to the PASA developer. At any rate, any additional guidance you could provide would be great!
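The "keep restarting" workaround can be automated with a small wrapper. This is only a sketch: `run_with_retries` is a made-up helper, the command you pass is whatever you already run by hand, and it only helps if the rerun really does pick up where the previous attempt stopped.

```python
import subprocess
import time


def run_with_retries(cmd, max_attempts=5, wait_seconds=30):
    """Run cmd until it exits with status 0 or attempts are exhausted.

    Returns the number of attempts used; raises RuntimeError on failure.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        # On POSIX a negative return code means the process was killed
        # by a signal (for example -11 for SIGSEGV on Linux).
        print(f"attempt {attempt} failed with return code {result.returncode}")
        if attempt < max_attempts:
            time.sleep(wait_seconds)
    raise RuntimeError(f"command failed after {max_attempts} attempts: {cmd}")
```

Capping the number of attempts matters here, because a job that never gets further would otherwise loop forever and burn cluster time.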

hyphaltip commented 4 months ago

You can see our template for starting a job at the link below. We save the host / port to a conf file that PASA can read.

https://github.com/ucr-hpcc/hpcc_slurm_examples/tree/master/singularity/mariadb
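As a rough illustration of saving the host / port to a conf file, here is a minimal Python sketch. It is not taken from the linked template. The helper name is made up, the key names are assumptions modelled on the KEY=VALUE style of PASA's configuration, and whether your PASA version accepts `host:port` in a single field is something to check against the template itself.

```python
from pathlib import Path


def write_db_conf(conf_path, host, port, user, password):
    """Write database connection settings as KEY=VALUE lines."""
    settings = {
        # Key names below are assumptions, not confirmed in this thread.
        "MYSQLSERVER": f"{host}:{port}",
        "MYSQL_RW_USER": user,
        "MYSQL_RW_PASSWORD": password,
    }
    text = "".join(f"{key}={value}\n" for key, value in settings.items())
    path = Path(conf_path)
    path.write_text(text)
    # Restrict permissions because the file holds a password.
    path.chmod(0o600)
    return path
```

The idea is that the job which starts the database container writes this file once it knows which node and port it landed on, and the annotation job reads it later.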

hyphaltip commented 4 months ago

I also don't change any scripts; I simply set up pasa.config.txt, and I think we just tweak the copy that is located in the user's home directory when running PASA.

jasongallant commented 3 months ago

Hi Jason-

Oh my gosh, this works so much better! Super helpful. Thanks for the tip, it is really appreciated!

Humming along now on our HPC. Aside from my “hacks” for MySQL, I think another of the issues may have been the relatively slow performance of the EFS that I was using.

Cheers, Jason
