Open Al-Murphy opened 6 months ago
Just as an update, I tried hard coding a different seed in subsample_ta_se in encode_lib_genomic.py
(my data is single-ended). However, the resulting pooled bigwig file data was the exact same as when only setting atac.subsample_reads
. So it appears that setting the seed is having no affect, not sure where else this is having an effect, I can't see it anywhere in the atac.wdl code?
Updated code:
# bash-only
cmd = 'zcat -f {} | '
if non_mito:
# cmd += 'awk \'{{if ($1!="'+mito_chr_name+'") print $0}}\' | '
cmd += 'grep -v \'^'+mito_chr_name+'\\b\' | '
if subsample > 0:
cmd += 'shuf -n {} --random-source=<(openssl enc -aes-256-ctr '
#CHANGE HERE
#cmd += '-pass pass:$(zcat -f {} | wc -c) -nosalt '
cmd += '-pass pass:101 -nosalt '
cmd += '</dev/zero 2>/dev/null) | '
cmd += 'gzip -nc > {}'
cmd = cmd.format(
ta,
subsample,
ta_subsampled)
else:
cmd += 'gzip -nc > {}'
cmd = cmd.format(
ta,
ta_subsampled)
I have been using
atac.subsample_reads
but I would like to a seed for this analysis so I can ensure different reads are subsampled across different runs, is there a way to do this? I can't see it listed in the parameters?Thanks!