Closed by skyungyong 2 years ago
Dear Skyungyong, first of all thank you very much for your interest in using our software.
Assumptions: I assume your genome is around 1 Gbp (1,000,000,000 base pairs) and that you have access to a large computation cluster.
Suggestions: Depending on your hardware setup, you may simply need to be a little more patient. Alternatively, split the genome into several parts and treat each part as a separate project, which can also make things faster.
Background from my side: When developing the tool, I had access to a large computation cluster and annotated the genomes in parallel. In total it took around three weeks (~504 h), and towards the end I was only waiting for the longer genomes. The following table lists genomes that I have run reasonaTE on (not all are reported in the paper):
Request for updates: Please let me know once the job has completed and how long it took. I would be happy to hear about your experience so that I can take it into account in the next update of the software.
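The splitting suggestion could be sketched roughly as follows. This is only a minimal illustration, not part of reasonaTE: the function name `split_fasta` and the output layout (one `.fasta` file per sequence, named after the first word of the header) are assumptions made for this example.

```python
from pathlib import Path


def split_fasta(fasta_path, out_dir):
    """Write each record of a multi-FASTA file to its own file.

    Returns the list of files written, in input order.
    Hypothetical helper for illustration; records that share the same
    header name would overwrite each other.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    handle = None
    try:
        with open(fasta_path) as src:
            for line in src:
                if line.startswith(">"):
                    # A new record starts: close the previous output file.
                    if handle is not None:
                        handle.close()
                    words = line[1:].split()
                    name = words[0] if words else f"record{len(written) + 1}"
                    # Keep only characters that are safe in file names.
                    safe = "".join(
                        c if c.isalnum() or c in "._-" else "_" for c in name
                    )
                    path = out_dir / f"{safe}.fasta"
                    handle = open(path, "w")
                    written.append(path)
                # Lines before the first header are skipped.
                if handle is not None:
                    handle.write(line)
    finally:
        if handle is not None:
            handle.close()
    return written
```

Each resulting file could then be set up as its own project, so that a failure or a long-running sequence only affects that one part.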
Best regards, Kevin Riehl
Hi @DerKevinRiehl,
Thank you for the information! I guess I will have to let this run for quite some time then. At least now I know how long this takes, so I will stay patient! I will report once the jobs are done.
Thank you!
Hi @DerKevinRiehl,
The process ran for about two weeks, and then an issue with the computer terminated the jobs :(. The software won't resume the work from where it stopped, right?
Dear Skyungyong, sorry to hear that. Unfortunately, there is currently no such option. I hope you can still restart the run? Are you using your own computer or a cluster? If it is a cluster, it may be worth talking to the administrator to make sure it runs more reliably.
Best regards, Kevin
Hi @DerKevinRiehl,
There was an unexpected outage due to the weather :(. I eventually had to split the fasta file and process each sequence separately in parallel. I think it took about a day!
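For readers who want to try the same approach, running the final step for several per-sequence projects side by side might look roughly like the sketch below. It uses only the `reasonaTE -mode pipeline` command quoted in this thread and assumes each project has already been created and annotated. The helper names, the project names, and the `runner` parameter (there so the logic can be tried without reasonaTE installed) are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def pipeline_command(project_name, project_folder="workspace"):
    """Build the final-step command quoted in this thread for one project."""
    return [
        "reasonaTE",
        "-mode", "pipeline",
        "-projectFolder", project_folder,
        "-projectName", project_name,
    ]


def run_all(project_names, max_workers=4, runner=subprocess.run):
    """Run the pipeline step for every project, a few at a time.

    Threads are sufficient here because each worker only waits for an
    external process. Results are returned in the order of project_names.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(
            pool.map(lambda name: runner(pipeline_command(name)), project_names)
        )
```

With the default `runner`, each result is a `subprocess.CompletedProcess` whose `returncode` shows whether that project finished cleanly. On a cluster, submitting one job per project through the scheduler would be the more usual route.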
Hi,
I have generated all the outputs from the pipelines and am trying to generate the final output with
reasonaTE -mode pipeline -projectFolder workspace -projectName testProject
My genome is about 1G, and this step is not finishing. I let the software run for ~190 hours, but it didn't produce the final outputs. I am rerunning the same job, but it seems to be going very slowly. It's been about 90 hours, and these are the latest lines that the software printed:
seq1 cluster13861 Iteration 1 6 0 / 6 ... seq1 cluster46826 Iteration 1 1 0 / 1 ...
Is there a way to speed up this process? How long do you expect this process to run?
Thank you!