Closed. lzamparo closed 8 years ago
Btw, $1 and $2 resolve to narrowPeak files. I found the problem. It's in the parsing code at IDR/load_bed.py:
def load_samples(args):
    # decide what aggregation function to use for peaks that need to be merged
    idr.log("Loading the peak files", 'VERBOSE')
    if args.input_file_type in ['narrowPeak', 'broadPeak']:
        if args.rank == None: signal_type = 'signal.value'
        else: signal_type = args.rank
        try:
            signal_index = {"score": 4, "signal.value": 6,
                            "p.value": 7, "q.value": 8}[signal_type]
        except KeyError:
            raise ValueError(
                "Unrecognized signal type for {} filetype: '{}'".format(
                    args.input_file_type, signal_type))
        if args.peak_merge_method != None:
            peak_merge_fn = {
                "sum": sum, "avg": mean, "min": min, "max": max}[
                    args.peak_merge_method]
        elif signal_index in (4,6):
            peak_merge_fn = sum
        else:
            peak_merge_fn = min
        if args.input_file_type == 'narrowPeak':
            summit_index = 9  ### <--- this was causing it for me, since I throw away
                              ### everything past column 8
I'll work on a PR that validates narrowPeak, broadPeak files.
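For anyone hitting the same thing, here is a minimal sketch of the kind of check such a validation could perform. It is not the actual PR and not part of idr: `validate_peak_lines` and `EXPECTED_COLUMNS` are hypothetical names, and splitting on whitespace is an assumption about how the files are read. The column counts come from the ENCODE format definitions (narrowPeak has 10 columns, broadPeak has 9).

```python
# Column counts from the ENCODE narrowPeak (BED6+4) and broadPeak (BED6+3) formats.
EXPECTED_COLUMNS = {"narrowPeak": 10, "broadPeak": 9}


def validate_peak_lines(lines, file_type):
    """Hypothetical helper: check that every record has enough columns.

    Returns the number of records checked and raises ValueError on the
    first record that is too short for the given file type.
    """
    expected = EXPECTED_COLUMNS[file_type]
    n_records = 0
    for line_number, line in enumerate(lines, start=1):
        # Skip blank lines and comment/track/browser header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        # Assumption: fields are whitespace-separated.
        n_fields = len(line.split())
        if n_fields < expected:
            raise ValueError(
                "line {}: expected at least {} columns for {} but found {}".format(
                    line_number, expected, file_type, n_fields))
        n_records += 1
    return n_records
```

With a check like this, a narrowPeak file trimmed to 8 columns would be rejected up front with a message that names the offending line and the column count, instead of failing later when the summit column (index 9) is looked up.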
When I call idr thusly:
idr --verbose --samples $1 $2 --input-file-type narrowPeak --rank p.value -o $outdir/$outfile 2>$outdir/idr-errors.txt
IDR raises a ValueError about the column I'm using to rank my peaks:
I'm trying to rank by p.value of narrowPeak files. The usage string (and other issues in this repo) suggest that --rank p.value is the proper way to indicate ranking by P value. Any ideas why I'm seeing this? My version: