broadinstitute / pilon

Pilon is an automated genome assembly improvement and variant detection tool
GNU General Public License v2.0

Two questions #14

Closed: mjfi2sb3 closed this issue 8 years ago

mjfi2sb3 commented 8 years ago

Hi,

1) My assembly has 761419 scaffolds. I would like to run Pilon, but it is taking too long. Would it be reasonable to use a much smaller dataset containing only scaffolds >= 5 kbp?

2) Would it work if I split my genome file into smaller FASTA files and ran Pilon on them in parallel?

Regards, /SB

w1bw commented 8 years ago

Hi Salim,

The answer to both questions is yes. You can either break up the input file or use the --targets option to tell Pilon to only process a subset of the input (typically by pointing it to a file listing the scaffolds to process). How large is the genome?
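As a rough illustration of the --targets approach, here is a minimal Python sketch that collects the names of scaffolds at or above a length cutoff and splits them into batches, one target list per parallel Pilon run. The function names, the 5000 bp default, and the output file naming are all hypothetical, and the sketch assumes a scaffold's name is the FASTA header text up to the first whitespace.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta_lengths(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (name, length) for each FASTA record in an iterable of lines.

    Assumption: the record name is the header text up to the first whitespace.
    """
    name = None
    length = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name = parts[0] if parts else ""
            length = 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def select_targets(lines: Iterable[str], min_len: int = 5000) -> List[str]:
    """Return the names of scaffolds whose length is at least min_len."""
    return [name for name, length in read_fasta_lengths(lines) if length >= min_len]


def split_batches(names: List[str], n_batches: int) -> List[List[str]]:
    """Distribute names round-robin into at most n_batches non-empty batches."""
    batches = [names[i::n_batches] for i in range(n_batches)]
    return [batch for batch in batches if batch]


def write_target_files(batches: List[List[str]], prefix: str = "targets") -> List[str]:
    """Write one scaffold name per line to prefix.N.txt; return the file paths."""
    paths = []
    for i, batch in enumerate(batches):
        path = f"{prefix}.{i}.txt"
        with open(path, "w") as handle:
            handle.write("\n".join(batch) + "\n")
        paths.append(path)
    return paths
```

A possible use would be to open the assembly FASTA, pass the file handle to select_targets, split the result with split_batches, and write the lists with write_target_files. Each list could then go to a separate Pilon run via --targets (for example, --targets targets.0.txt together with the usual --genome and --frags arguments and a distinct --output prefix per run). Round-robin batching keeps the batches similar in scaffold count, though not necessarily in total bases.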

mjfi2sb3 commented 8 years ago

Hi,

Thank you for the reply. The genome size is somewhere around 700 Mbp.