uec / Issue.Tracker

Automatically exported from code.google.com/p/usc-epigenome-center

parallel nextseq fastq merging tool #814

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
The NextSeq writes four separate lane files for each sample (unnecessarily so). I 
have a tool that uses pigz to merge them all in one shot:
/auto/uec-00/ramjan/devel/mergeNextSeqLanes/mergeNextseqLanes.pl

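#!/usr/bin/perl
use strict;
use warnings;

# Path to pigz (parallel gzip); site-specific.
my $pigz = "/home/uec-00/shared/production/software/pigz/pigz";

# Each lane-1 file stands in for lanes 1-4 of one sample and read.
my @files = glob("*_L001_R*.fastq.gz");

for my $f (@files)
{
    # The merged output is written under the pseudo-lane L000.
    (my $output = $f) =~ s/_L001_R/_L000_R/;
    if (-e $output)
    {
        warn "$output already exists. SKIPPING. You can delete $output by hand and try again\n";
        next;
    }

    # Collect the four lane files, checking that each one exists.
    my @lanes;
    for my $i (1..4)
    {
        (my $lane = $f) =~ s/_L001_R/_L00${i}_R/;
        die "$lane expected to exist but not found\n" unless -e $lane;
        push @lanes, $lane;
    }

    # Decompress all lanes in order and recompress into a single file.
    runcmd("$pigz -d -c @lanes | $pigz > $output");
}

# Log a shell command to STDERR, run it, and die if it fails.
# A partial output file may remain after a failure.
sub runcmd
{
    my $cmd = shift @_;
    print STDERR "$cmd\n";
    system($cmd) == 0 or die "command failed (exit status $?): $cmd\n";
}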

Original issue reported on code.google.com by zack...@gmail.com on 23 Sep 2014 at 10:41
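
The file-naming convention the script relies on can be checked without touching any data. The sketch below is only an illustration: `lane_names` is a hypothetical helper that is not part of mergeNextseqLanes.pl, and the sample file name is made up. It derives the merged output name (pseudo-lane L000) and the four lane inputs from a lane-1 file name, using the same substitutions as the script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of mergeNextseqLanes.pl): given a lane-1
# file name, return the merged output name followed by the four lane
# input names, in lane order.
sub lane_names
{
    my ($lane1) = @_;
    (my $output = $lane1) =~ s/_L001_R/_L000_R/;
    my @inputs;
    for my $i (1..4)
    {
        (my $lane = $lane1) =~ s/_L001_R/_L00${i}_R/;
        push @inputs, $lane;
    }
    return ($output, @inputs);
}

# The sample name below is made up for illustration.
my ($output, @inputs) = lane_names("sampleA_S1_L001_R1_001.fastq.gz");
print "output: $output\n";
print "input:  $_\n" for @inputs;
```

Running it prints the L000 output name followed by the L001 to L004 input names, which is a quick way to confirm that a directory's file names match the pattern before starting a long merge.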

GoogleCodeExporter commented 8 years ago
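Note: if pigz is unavailable, replace the $pigz assignment near the top of the script with plain gzip. The merge still works, only more slowly, since gzip is single-threaded.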

old:
$pigz = "/home/uec-00/shared/production/software/pigz/pigz";

new:
$pigz = "gzip";

Original comment by zack...@gmail.com on 26 Sep 2014 at 10:27
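
The manual swap described in the note could also be made automatic. The following is a minimal sketch, assuming it is acceptable to fall back silently to whatever `gzip` is on the PATH. `pick_compressor` is a hypothetical name and not part of the original script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the preferred compressor if it is an
# executable regular file, otherwise fall back to "gzip" from the PATH.
sub pick_compressor
{
    my ($preferred) = @_;
    return (-f $preferred && -x _) ? $preferred : "gzip";
}

my $pigz = pick_compressor("/home/uec-00/shared/production/software/pigz/pigz");
print STDERR "using compressor: $pigz\n";
```

Both tools accept `-d -c` for decompressing to standard output and compress standard input when given no file arguments, so the rest of the pipeline would not need to change under this assumption.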