Run takes a very long time

sagnikbanerjee15 commented 6 years ago

Hello,

Could you please tell me how long it takes to complete a run? I am running ColorMap with 10M PacBio reads and 2512M Illumina reads.

Thanks.

haghshenas commented 6 years ago

Hi Sagnik,

In general, mapping Illumina reads onto long reads takes most of the time. Can you tell me how much coverage of Illumina reads you have?

sagnikbanerjee15 commented 6 years ago

Hello,

I have 26M reads in each of the samples. I used an online calculator and it gives me 735294118X as the coverage.

Thank you.

haghshenas commented 6 years ago

What do you mean by "each of the samples"? Do you have multiple samples? Correcting long reads using short reads makes sense only when they are from the same sample. Can you clarify a bit more about your data? How many files do you have, and are they from the same sample or not?

sagnikbanerjee15 commented 6 years ago

Hello Ehsan,

RNA-Seq sequencing was done on 90 samples (5 genotypes, 6 timepoints and 3 replicates). The RNA from all 90 samples was pooled together and PacBio sequencing was done. So I am combining all the short reads from the 90 samples and presenting that large file to ColorMap. So essentially both the short reads and the PacBio reads come from the same tissue.

Thank you.

Sagnik Banerjee Graduate Research Assistant Bioinformatics and Computational Biology Department of Plant Pathology and Microbiology Dr. Roger Wise's Lab Iowa State University

haghshenas commented 6 years ago

If you are sure that all short reads are coming from the same sample/individual then use only about 50x coverage of short reads to correct long reads using CoLoRMap. Using more short reads does not increase the accuracy significantly but makes the mapping step very slow.
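The advice above amounts to estimating the current short-read depth and then randomly keeping only enough reads to reach roughly 50x. Below is a minimal Python sketch of that idea. It is not part of CoLoRMap, and the function names, the read length, and the target size in the usage note are hypothetical values chosen for illustration. Depth is approximated as (number of reads × read length) / target size, which is only a rough guide for RNA-Seq data where expression varies between transcripts.

```python
import random


def estimate_coverage(n_reads: int, read_len: int, target_size: int) -> float:
    """Approximate depth: total sequenced bases divided by target size in bases."""
    return n_reads * read_len / target_size


def keep_fraction(current_cov: float, target_cov: float = 50.0) -> float:
    """Fraction of reads to keep so that depth drops to about target_cov."""
    return min(1.0, target_cov / current_cov)


def subsample_fastq(in_path: str, out_path: str, fraction: float, seed: int = 0) -> int:
    """Copy each 4-line FASTQ record to out_path with probability `fraction`.

    Returns the number of records kept. Assumes an uncompressed FASTQ file
    with exactly four lines per record.
    """
    rng = random.Random(seed)
    kept = 0
    with open(in_path) as fin, open(out_path, "w") as fout:
        while True:
            record = [fin.readline() for _ in range(4)]
            if not record[0]:  # end of file
                break
            if rng.random() < fraction:
                fout.writelines(record)
                kept += 1
    return kept
```

As a usage example under assumed numbers: with 2512M reads of 100 bp and a 100 Mb target, `estimate_coverage(2_512_000_000, 100, 100_000_000)` gives 2512x, so `keep_fraction(2512.0)` suggests keeping about 2% of the reads. For paired-end data, running `subsample_fastq` on both mate files with the same `seed` and `fraction` keeps the same record positions in each file, provided the two files list the pairs in the same order.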

sagnikbanerjee15 commented 6 years ago

All right, thank you.

alphahmed18 commented 6 years ago

I'm facing the same problem of long running time. My PacBio coverage is only 6x and my Illumina coverage is 30x, but it has been running for three weeks now and just keeps going back and forth between "$/colormap/bin/bwa-proovread mem" and "$/colormap/bin/bwa-proovread index"!

sfu-compbio / colormap, issue #7