ShunOuchi / GreenHill

De novo chromosome-level scaffolding and phasing tool using Hi-C
GNU General Public License v3.0

Add resume function? #22

Closed: annabel-NZ closed this issue 8 months ago

annabel-NZ commented 8 months ago

Hi @ShunOuchi. I'm excited to test this software for our highly heterozygous plant genome assembly work. Thank you. As several other users have found, the runs are taking quite a lot of time. Estimating run time is tricky when testing a new tool, and our SLURM job scheduler requires us to submit resource limits. My first attempt looked to be running nicely, but after it failed to complete in 3 days (500 Mb genome) I had to start again from scratch, even though some of the early steps had completed fine. It would be wonderful if you could code in a resume function. Some additional guidance on resource requirements in the docs would be helpful too. Thank you.

ShunOuchi commented 8 months ago

Hi @annabel-NZ,

> It would be wonderful if you could code in a resume function.

Thank you for the great suggestion. I will consider adding a resume function in a future update.

> Some additional guidance on resource requirements in the docs would be helpful too.

I have added the required resource information for several benchmarks to the README.

Thank you, Shun

annabel-NZ commented 8 months ago

@ShunOuchi Thank you. I also found the supplementary table for resource use after I had posted the issue.

I have a run in progress with ONT long-read plus Hi-C data. The log reports that the makeHiCLink step has completed, and the job is not reporting errors, but nothing more has been written to the outputs or to the log for 36 hours. Is this next step expected to be slow, or is something likely not right? The last lines of the log are:

makeHiCLink
finish makeHiCLink
numHiCNode:60

ShunOuchi commented 8 months ago

The next step is Hi-C scaffolding, which takes a long time because it is iterated while gradually increasing the L threshold value used when creating the links.
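
To give a rough feel for why repeating a linking pass over a rising threshold adds up, here is a minimal sketch in Python. It is not GreenHill's code: the function name, the `(node_a, node_b, value)` link tuples, and the idea of accepting a link when its value is at or below the cutoff are all assumptions made for illustration, and the cutoff only stands in for the L threshold, whose exact meaning is not described in this thread.

```python
def iterate_thresholds(candidate_links, thresholds):
    """Re-run a linking pass once per threshold value, smallest first.

    candidate_links: list of (node_a, node_b, value) tuples (hypothetical format).
    thresholds: increasing cutoffs standing in for the L threshold.
    Returns the final groups of joined nodes and the number of link evaluations.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path compression.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Register every node so that unlinked nodes still appear in the result.
    for a, b, _ in candidate_links:
        find(a)
        find(b)

    evaluations = 0
    for cutoff in thresholds:
        # Every round re-examines all candidate links under the new cutoff.
        for a, b, value in candidate_links:
            evaluations += 1
            if value <= cutoff:
                parent[find(a)] = find(b)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values()), evaluations


links = [("c1", "c2", 10), ("c2", "c3", 50), ("c4", "c5", 500)]
groups, evaluations = iterate_thresholds(links, [20, 100])
print(groups, evaluations)
```

In this toy version the work grows as the number of rounds times the number of candidate links, so a long quiet period in the log during an iterated step is plausible on a large input.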

annabel-NZ commented 8 months ago

@ShunOuchi Thank you so much. It is really reassuring to know this is expected.

girmab commented 4 months ago

Hi @ShunOuchi,

Thanks for this interesting and promising tool, particularly for scaffolding highly heterozygous and polyploid plant genomes. The biggest challenge is the running time: one of my runs took 2 weeks to complete, and another run failed after 20 days at the Hi-C scaffolding stage. I could have set the time limit to 30 days, but I set it to 20 days based on the first run. A resume option is therefore a MUST, and it would be great if this were on your top-priority list for the next update. Thanks again!
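
For illustration, the kind of resume behaviour requested here is often built by recording each completed stage in a small state file and skipping recorded stages on the next run. The sketch below is in Python and is entirely hypothetical: `run_with_resume`, the state file, and the stage names are assumptions, not GreenHill options or internals.

```python
import json
from pathlib import Path


def run_with_resume(stages, state_path):
    """Run (name, function) stages in order, skipping stages already recorded as done.

    stages: list of (stage_name, callable) pairs (hypothetical pipeline stages).
    state_path: JSON file listing the names of completed stages.
    Returns the names of the stages executed in this call.
    """
    state_file = Path(state_path)
    done = json.loads(state_file.read_text()) if state_file.exists() else []
    executed = []
    for name, func in stages:
        if name in done:
            continue  # completed in an earlier run, so skip it
        func()  # if this raises, earlier stages remain recorded
        done.append(name)
        state_file.write_text(json.dumps(done))  # record progress after each stage
        executed.append(name)
    return executed
```

If a job is killed at its time limit during a late stage, calling the function again with the same state file would rerun only the stages that had not been recorded. A real implementation would also need each stage to write its intermediate results to disk so that later stages can reload them.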