SolayMane commented 6 years ago

How do I completely uninstall MaSuRCA? Thank you.

alekseyzimin commented 6 years ago

Just delete the entire MaSuRCA folder

SolayMane commented 6 years ago

Dear Zimin,

I ran into this problem during the assembly:

[jeudi 12 avril 2018, 13:00:47 (UTC+0200)] Processing pe library reads
[jeudi 12 avril 2018, 13:00:47 (UTC+0200)] Average PE read length 151
MIN_Q_CHAR: 33
Estimated genome size: 1157137089
[jeudi 12 avril 2018, 13:00:47 (UTC+0200)] Computing super reads from PE
Using linking mates
Running mega-reads correction/assembly
Using mer size 15 for mapping, B=17, d=0.029
Estimated Genome Size 1157137089
Estimated Ploidy 1
Using 50 threads
Output prefix mr.41.15.17.0.029
Pacbio coverage <30x, using the longest subreads
Coverage of the mega-reads less than 5 -- using the super reads as well
Coverage threshold for splitting unitigs is 45 minimum ovl 115
Running assembly
/home1/software/masurca/MaSuRCA-3.2.4/bin/mega_reads_assemble_cluster.sh: line 586: CA.mr.41.15.17.0.029.log: Read-only file system
Assembly stopped or failed, see CA.mr.41.15.17.0.029.log
[jeudi 12 avril 2018, 13:15:45 (UTC+0200)] Assembly stopped or failed, see CA.mr.41.15.17.0.029.log

The log file is attached. Thank you in advance for your help.

alekseyzimin commented 6 years ago

Hi, please check your disk mount -- the assembler could not write log file to disk. Then re-generate assemble.sh and re-run.
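The check suggested above can be sketched in a few lines of Python. This is only an illustration, not part of MaSuRCA: the function name `check_work_dir` is hypothetical, and it assumes the assembly writes its logs into the directory it is run from.

```python
import shutil
import tempfile


def check_work_dir(path="."):
    """Return (writable, free_bytes) for the directory the assembly runs in.

    Hypothetical helper for illustration; MaSuRCA does not ship it.
    """
    try:
        # Creating a real temporary file is the most direct test: it raises
        # OSError on a read-only or completely full file system.
        with tempfile.NamedTemporaryFile(dir=path):
            writable = True
    except OSError:
        writable = False
    # Free space on the file system that holds `path`, in bytes.
    free_bytes = shutil.disk_usage(path).free
    return writable, free_bytes


if __name__ == "__main__":
    writable, free_bytes = check_work_dir(".")
    print(f"writable: {writable}, free space: {free_bytes / 1e9:.1f} GB")
```

If `writable` comes back `False`, or the free space is close to zero, the "Read-only file system" message in the log above is the expected symptom.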

SolayMane commented 6 years ago

Thank you, Aleksey Zimin. It is a disk space problem; I will fix that.

have a nice day.

SolayMane commented 6 years ago

Dear Zimin,

I have launched MaSuRCA with the config file below:

PARAMETERS
# set this to 1 if your Illumina jumping library reads are shorter than 100bp
EXTEND_JUMP_READS=0
# this is k-mer size for deBruijn graph values between 25 and 127 are supported, auto will compute the optimal size based on the read data and GC content
GRAPH_KMER_SIZE = auto
# set this to 1 for all Illumina-only assemblies
# set this to 1 if you have less than 20x long reads (454, Sanger, Pacbio) and less than 50x CLONE coverage by Illumina, Sanger or 454 mate pairs
# otherwise keep at 0
USE_LINKING_MATES = 1
# specifies whether to run mega-reads correction on the grid
USE_GRID=0
# specifies queue to use when running on the grid MANDATORY
GRID_QUEUE=all.q
# batch size in the amount of long read sequence for each batch on the grid
GRID_BATCH_SIZE=300000000
# coverage by the longest Long reads to use
LHE_COVERAGE=30
# this parameter is useful if you have too many Illumina jumping library mates. Typically set it to 60 for bacteria and 300 for the other organisms
LIMIT_JUMP_COVERAGE = 300
# these are the additional parameters to Celera Assembler. do not worry about performance, number or processors or batch sizes -- these are computed automatically.
# set cgwErrorRate=0.25 for bacteria and 0.1<=cgwErrorRate<=0.15 for other organisms.
CA_PARAMETERS = cgwErrorRate=0.15
# minimum count k-mers used in error correction 1 means all k-mers are used. one can increase to 2 if Illumina coverage >100
KMER_COUNT_THRESHOLD = 1
# whether to attempt to close gaps in scaffolds with Illumina data
CLOSE_GAPS=1
# auto-detected number of cpus to use
NUM_THREADS = 50
# this is mandatory jellyfish hash size -- a safe value is estimated_genome_size*estimated_coverage
JF_SIZE = 8000000000
# set this to 1 to use SOAPdenovo contigging/scaffolding module. Assembly will be worse but will run faster. Useful for very large (>5Gbp) genomes from Illumina-only data
SOAP_ASSEMBLY=0
END

It has been running for 5 days so far. Is it normal for a run to take this long? I have Illumina data at about 180x coverage (2x150 bp) and PacBio reads at 5x coverage. The estimated genome size is 850 Mb.

thank you,
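For reference, the rule of thumb in the config comment (`estimated_genome_size*estimated_coverage`) can be worked out for the numbers quoted above. This is only the arithmetic of that comment; the helper name `jf_size_rule_of_thumb` is hypothetical, and the thread does not say whether the full value is actually required for this data set.

```python
def jf_size_rule_of_thumb(genome_size_bp, coverage):
    """Apply the config comment's rule: estimated_genome_size * estimated_coverage."""
    return int(genome_size_bp * coverage)


# Figures quoted in this comment: 850 Mb genome, about 180x Illumina coverage.
genome_size_bp = 850_000_000
illumina_coverage = 180

suggested = jf_size_rule_of_thumb(genome_size_bp, illumina_coverage)
configured = 8_000_000_000  # JF_SIZE from the config file above

print(f"rule of thumb: {suggested:,}")   # 153,000,000,000
print(f"configured:    {configured:,}")  # 8,000,000,000
```

By that rule the value comes out at 1.53e11, roughly 19 times the configured `JF_SIZE` of 8e9.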

alekseyzimin commented 6 years ago

180x Illumina may be too much, I would use 120x. Depending on your system it could take a week or so.
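To put numbers on the 180x to 120x suggestion, here is a small sketch of the arithmetic, assuming the 850 Mb genome size and 2x150 bp reads mentioned earlier in the thread. The function name `subsample_plan` is hypothetical, and how the reads are actually subsampled (which tool, which random seed) is left open.

```python
def subsample_plan(genome_size_bp, read_len, current_cov, target_cov):
    """Return (fraction of read pairs to keep, read pairs needed for target_cov)."""
    keep_fraction = target_cov / current_cov
    # Each read pair contributes 2 * read_len bases of coverage.
    pairs_needed = round(genome_size_bp * target_cov / (2 * read_len))
    return keep_fraction, pairs_needed


keep_fraction, pairs_needed = subsample_plan(
    genome_size_bp=850_000_000,  # estimated genome size from the thread
    read_len=150,                # 2x150 bp paired-end reads
    current_cov=180,
    target_cov=120,
)
print(f"keep {keep_fraction:.1%} of the pairs, about {pairs_needed:,} read pairs")
```

In other words, keeping about two thirds of the read pairs (roughly 340 million pairs under these assumptions) would bring the Illumina data down to about 120x.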

alekseyzimin / masurca

how to completely uninstall masurca #2
