Closed dpark01 closed 7 years ago
It might also be helpful in these scenarios to expose an option in the demux wrapper that lets you specify the instance type of the illumina_demux jobs it spawns (right now, the user can only change the instance type of the outer demux wrapper, which isn't really helpful).
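As a rough sketch of what such an option could feed into: DNAnexus job launches accept a `systemRequirements` mapping from entry point name to `{"instanceType": ...}`. The helper below only builds that mapping; the function name and the `child_instance_type` parameter are hypothetical and not part of the existing wrapper.

```python
def demux_system_requirements(child_instance_type, entry_point="*"):
    """Build a systemRequirements mapping for a spawned demux job.

    child_instance_type: hypothetical wrapper option, e.g. "mem3_ssd1_x8".
    entry_point: "*" applies the override to every entry point.
    Returns an empty dict when no override is requested, so the
    applet's own default instance type is left in place.
    """
    if not child_instance_type:
        return {}
    return {entry_point: {"instanceType": child_instance_type}}


if __name__ == "__main__":
    # The resulting dict would be passed along when launching the child job.
    print(demux_system_requirements("mem3_ssd1_x8"))
```

Returning an empty mapping when the option is unset keeps the current behavior as the default.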
Revised all the instance types during the dockerization. I see the demux jobs using a fair amount of disk, but haven't found examples where they use a lot of memory... is that only under certain circumstances? (Human depletion, in contrast, uses a lot of disk and also has spiky/high memory usage.)
Great. I think I saw high RAM use in a previous scenario where we had the Picard --threads parameter set too high... it's linearly proportional to the thread count. If you've seen hiseq lanes fit in a lower RAM footprint, that's great. High disk usage makes sense because of all the untarring and temp files.
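To make the "linearly proportional to the thread count" point concrete, here is a minimal sketch of a linear RAM model. The per-thread and baseline figures are placeholders chosen for illustration, not measurements from Picard or from any job in this thread.

```python
def estimate_demux_ram_gb(threads, gb_per_thread=2.0, base_gb=1.0):
    """Linear model: RAM = base + per-thread cost * thread count.

    gb_per_thread and base_gb are illustrative placeholders; real
    values would have to be measured on actual demux jobs.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return base_gb + gb_per_thread * threads


def max_threads_for_ram(ram_gb, gb_per_thread=2.0, base_gb=1.0):
    """Invert the model: the largest thread count that fits in ram_gb."""
    usable = ram_gb - base_gb
    return max(1, int(usable // gb_per_thread))
```

Under this model, capping the thread count is what keeps a lane inside a smaller instance's memory, which matches the observation above.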
Huh, looking at it, you're right that demux memory usage is actually quite minimal when the thread count is controlled. I wonder if runtime might benefit from a higher core count. Or LZ4 uploads instead of gzip.
Fixed a previous error in the instance type selection in f7fd31eb
Looking at job-BzVzv5j0jy12PPyv782QXZV3, it appears that hiseq lanes get dispatched on a mem1_ssd1_x4. In my own experience, I think we'd want a couple hundred GB of local instance storage at a minimum as well as at least 40-50GB RAM. I might suggest a mem3_ssd1_x8 or possibly a mem1_ssd2_x16 (if local storage is the limiting factor). Feel free to play with the associated flowcell of data as a test case.
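For comparing the candidates named above, the instance type names can be split into their parts. This sketch assumes the usual `mem<tier>_ssd<tier>_x<cores>` naming pattern; the tier numbers are relative labels, not gigabytes, so actual RAM and disk for each type still need to be looked up in the platform documentation.

```python
import re

# Assumed naming pattern: mem<tier>_<ssd|hdd><tier>_x<cores>, optional _v<N> suffix.
_INSTANCE_RE = re.compile(r"^mem(\d+)_(ssd|hdd)(\d+)(?:_v\d+)?_x(\d+)$")


def parse_instance_type(name):
    """Split an instance type name into memory tier, storage tier, and cores."""
    match = _INSTANCE_RE.match(name)
    if match is None:
        raise ValueError("unrecognized instance type: %r" % name)
    mem_tier, storage_kind, storage_tier, cores = match.groups()
    return {
        "mem_tier": int(mem_tier),
        "storage_kind": storage_kind,
        "storage_tier": int(storage_tier),
        "cores": int(cores),
    }


if __name__ == "__main__":
    for candidate in ("mem1_ssd1_x4", "mem3_ssd1_x8", "mem1_ssd2_x16"):
        print(candidate, parse_instance_type(candidate))
```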