lemontealala / RegScaf

RegScaf: An accurate and robust regression method of scaffolding based on clustered links

Running time doubt #1

Open V-JJ opened 2 years ago

V-JJ commented 2 years ago

Hello!

I would like to know if there is any estimate of how long a RegScaf job takes to finish. Our job has been running for a week now with these parameters:

thread, 96
short_seme_penal, 2
long_seme_penal, 2
BlasrminMatch, 10
reg_m, 12
ScafIter, 3

Our input data include Illumina reads, 92 Gb of PacBio reads, and a 1.7 Gb wtdbg2 assembly.

Any comment or recommendation will be appreciated.

Regards,

lemontealala commented 2 years ago

Dear Vadim,

Thanks for trying our method! Although RegScaf has been tested on a pika genome of about 2.5 Gb, it has not been tested on a large genome in hybrid scaffolding mode (using both Illumina reads and TGS long reads). I am happy to help you solve your problem, but I need to know more about your job:

1. What are the sizes of your Illumina data sets and TGS long reads, respectively? SEME mapping for large Illumina data sets, as well as Blasr mapping for high-depth long reads, takes a long time (over a week is normal). If the mapping step has not finished, you could stop the current job and restart with fewer libraries (mapping steps for libraries that have already finished can be commented out and skipped in the new run). For scaffolding, jumping mate-pair (MP) Illumina libraries are more helpful.
2. How many contigs are in your wtdbg2 assembly? If there are too many contigs (over 50000, for example), RegScaf will take a long time to handle the graph and the regression procedure. In that case, a larger reg_m may help; maybe try reg_m=40.
3. Do you know which process is currently running? For now, ScaffoldGraph-LTS-multimean.py is not multithreaded, so it will take a long time if there are too many super-contigs from the previous iteration. You can try fewer iterations by setting ScafIter=2.

If you have more questions, do not hesitate to ask. The details in our published article https://doi.org/10.1093/bioinformatics/btac174 may be helpful.

Regards,
Mengtian Li
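For the contig-count question, a minimal Python sketch for checking an assembly against the "over 50000" example figure might look like the following. It is not part of RegScaf; the function names and the file name `assembly.fa` are hypothetical, and it simply assumes the assembly is a plain or gzip-compressed FASTA file in which every contig has one header line starting with `>`.

```python
import gzip

# The "over 50000" example figure mentioned in the reply; not a hard limit.
CONTIG_EXAMPLE_THRESHOLD = 50000


def count_contigs(fasta_path):
    """Count FASTA records by counting header lines that start with '>'."""
    # Assumption: files ending in .gz are gzip-compressed, anything else is plain text.
    opener = gzip.open if fasta_path.endswith(".gz") else open
    with opener(fasta_path, "rt") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def exceeds_example_threshold(n_contigs, threshold=CONTIG_EXAMPLE_THRESHOLD):
    """Return True when the contig count is above the example threshold."""
    return n_contigs > threshold
```

It could be used as `n = count_contigs("assembly.fa")` followed by `exceeds_example_threshold(n)`; if that returns `True`, trying a larger reg_m as suggested above may be worthwhile.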


V-JJ commented 2 years ago

Good morning!

Thanks for the fast reply.

We have:

We recently restarted our job with fewer threads because of its time consumption, and we will wait for it to complete. Afterwards, I will let you know the result.

Regards,

Vadim