[Closed] edawson closed this issue 6 years ago
@edawson The script just lets you run a single command and vary FILE_INDEX so that you can use a job array. It actually only processes a single file at a time, so you can happily schedule this as 300 separate jobs with a single input BAM each; just ensure that FILE_INDEX is set to 1 for all of them.
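A dry-run sketch of that pattern, assuming each cluster job receives exactly one BAM (the BAM names and the elided brassI_np_in.pl arguments are placeholders, not confirmed flags):

```shell
#!/bin/sh
# Hypothetical: one job per input BAM. Because each job sees a single
# file, FILE_INDEX=1 selects it. Printed as a dry run; in practice each
# loop iteration would be one array task on the scheduler.

for bam in normal_001.bam normal_002.bam normal_003.bam; do
  echo "FILE_INDEX=1 brassI_np_in.pl ... $bam"
done
```

On a real cluster the loop would be replaced by the scheduler's array mechanism, with the task index used only to pick which BAM the job receives.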
Due to some wacky limitations in the way the cloud analysis system I'm using works, it's impossible for me to run the first step in PON filter generation (brassI_np_in.pl) jointly across all samples at once. At best I could run subsets of maybe 5-10 out of a total of 393 samples.

Is there a penalty to doing this step separately for each normal BAM? My intuition is that this stage just collects aberrant read pairs, which should be independent across samples. My reading of the wiki page suggests this is the case, since the next stage is to merge the brm.bam files.