cancerit / BRASS

Breakpoints via assembly - Identifies breaks and attempts to assemble rearrangements in whole genome sequencing data.
GNU Affero General Public License v3.0

Clarification: Penalty for finding aberrant pairs for PON separately vs jointly #72

Closed: edawson closed this issue 6 years ago

edawson commented 6 years ago

Due to wacky limitations in how the cloud analysis system I'm using works, it's impossible for me to run the first step of PON filter generation (brassI_np_in.pl) jointly across all samples at once. At best I could run subsets of maybe 5-10 out of a total of 393 samples.

Is there a penalty to doing this step separately for each normal BAM? My intuition is that this stage is just collecting aberrant read pairs, which should be independent across samples. My understanding of the wiki page is that this is the case since the next stage is to actually merge the brm.bam files.

keiranmraine commented 6 years ago

@edawson the script just allows you to use a single command and vary the FILE_INDEX, so that you can use a job array. It actually only processes a single file at a time so you can happily schedule this as 300 separate jobs with only a single input BAM each, just ensure that FILE_INDEX is set to 1 for them all.
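
As a rough illustration of the "one job per BAM, FILE_INDEX always 1" pattern described above, here is a minimal Python sketch. It only builds the per-job command lines and does not run BRASS. The function name `build_single_bam_jobs` and the `{file_index}` / `{bam}` placeholders are hypothetical. The `echo` template is a stand-in, not the real `brassI_np_in.pl` invocation, so take the actual argument order from the wiki page.

```python
import shlex


def build_single_bam_jobs(bam_paths, command_template):
    """Return one argv list per BAM, each with the file index fixed at 1.

    command_template is a hypothetical, user-supplied string containing
    the placeholders {file_index} and {bam}; it is NOT the real
    brassI_np_in.pl usage string.
    """
    jobs = []
    for bam in bam_paths:
        # Each job is given exactly one input BAM, so that BAM is always
        # the first (and only) file in the job's list: index 1.
        command = command_template.format(file_index=1, bam=shlex.quote(bam))
        jobs.append(shlex.split(command))
    return jobs


if __name__ == "__main__":
    # Stand-in template: swap in the real command line from the BRASS wiki.
    template = "echo FILE_INDEX={file_index} BAM={bam}"
    for argv in build_single_bam_jobs(["normal_001.bam", "normal_002.bam"], template):
        print(argv)
```

Each returned argv list can be handed to whatever scheduler or cloud runner is available as an independent job, and the resulting brm.bam files are merged in the next stage as before.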