I’ve created an initial pipeline that follows Anshul Kundaje’s method. The
goal was to automate as much as possible and make it easy to use.
It is currently available for testing on HPCC at:
/home/uec-00/ramjan/devel/encodePipeline/wrap_encode_pipeline.pl
All you supply are the initial BAM files, for example:
wrap_encode_pipeline.pl -control control.bam -treated sample1.bam -treated sample2.bam
You can supply as many “-control” or “-treated” arguments as needed (i.e. 1 or more).
My tool will handle:
tagAlign conversion
pooling of replicates
creation of pseudoreplicates both with and without pooling (through random splitting)
SPP peak calling on all the above combinations (i.e. pooled, replicates, pseudoreps, pooled pseudoreps)
IDR analysis on all relevant combinations (i.e. pooled, replicates, pseudoreps, pooled pseudoreps)
The most compute-intensive step is SPP peak calling, so it has been multithreaded to run
as much in parallel as possible.
The resulting data and reports are clearly organized. You can see the order of
operations, as well as the dependencies and outputs for each step.
Let me know of any issues. If all goes well, we will make it available to all
users.
-zack
Original comment by zack...@gmail.com
on 12 May 2014 at 10:29
To save time, I try to run many things in parallel. It looks like this can wreak
havoc on memory, so I will need to move to a worker pool model.
Original comment by zack...@gmail.com
on 13 May 2014 at 10:16
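For readers unfamiliar with the worker pool model mentioned above: instead of starting every job at once, a fixed number of workers pull jobs from a queue, which caps how many run at the same time and so bounds peak memory. The sketch below is a generic Python illustration, not the pipeline's own implementation; run_in_pool, make_job and ConcurrencyMeter are hypothetical names, and the job body is a short sleep standing in for a peak-calling run.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConcurrencyMeter:
    """Context manager that records how many jobs ran at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self.running = 0
        self.peak = 0

    def __enter__(self):
        with self._lock:
            self.running += 1
            self.peak = max(self.peak, self.running)
        return self

    def __exit__(self, *exc):
        with self._lock:
            self.running -= 1
        return False


def make_job(name, meter, seconds=0.01):
    """Build a placeholder job; a real job would launch one peak-calling run."""
    def job():
        with meter:
            time.sleep(seconds)
        return name
    return job


def run_in_pool(jobs, max_workers=2):
    """Run callables with at most max_workers active; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(job) for job in jobs]
        return [future.result() for future in futures]


if __name__ == "__main__":
    meter = ConcurrencyMeter()
    jobs = [make_job(f"sample{i}", meter) for i in range(8)]
    results = run_in_pool(jobs, max_workers=3)
    print(results)
    print("peak concurrent jobs:", meter.peak)
```

The pool size is the tuning knob: a smaller max_workers lowers peak memory at the cost of a longer wall-clock time.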
Numerous fixes:
added a worker pool
changed labelling to limit filename length
Original comment by zack...@gmail.com
on 20 May 2014 at 9:21
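One common way to keep output labels within a filename length limit is to truncate the combined label and append a short hash so that different inputs still get different names. The comment above does not say how the pipeline shortens its labels, so the Python sketch below is only an assumption for illustration; short_label and the 40-character default are hypothetical.

```python
import hashlib


def short_label(parts, max_len=40):
    """Join label parts with '_' and shorten the result if it exceeds max_len.

    Long labels are truncated and suffixed with an 8-character SHA-1 digest of
    the full label, so two long labels sharing a prefix remain distinguishable.
    """
    if max_len < 10:
        raise ValueError("max_len must leave room for the hash suffix")
    full = "_".join(parts)
    if len(full) <= max_len:
        return full
    digest = hashlib.sha1(full.encode("utf-8")).hexdigest()[:8]
    keep = max_len - len(digest) - 1   # one character is used by the separator
    return f"{full[:keep]}_{digest}"


if __name__ == "__main__":
    # Hypothetical label built from several input names.
    parts = ["sample1.bam", "sample2.bam", "pooled", "pseudorep1", "vs", "control.bam"]
    label = short_label(parts)
    print(label, len(label))
```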
It seems pretty stable now.
Original comment by zack...@gmail.com
on 31 May 2014 at 6:19
Lijing is testing.
Original comment by zack...@gmail.com
on 6 Aug 2014 at 10:56
Original comment by zack...@gmail.com
on 13 Aug 2014 at 9:14
Original issue reported on code.google.com by
zack...@gmail.com
on 13 Mar 2014 at 12:27