kundajelab / atac_dnase_pipelines

ATAC-seq and DNase-seq processing pipeline
BSD 3-Clause "New" or "Revised" License
162 stars 81 forks source link

The default value of idr_thresh is wrong #6

Closed chrisprobert closed 8 years ago

chrisprobert commented 8 years ago

@leepc12

Summary: The default value of the parameter idr_thresh in atac.bds is wrong. Here in TF_chipseq_pipeline we report it is 0.02 which is exported as the default value, but here in bds_atac it is mysteriously reset to 0.1.

This is an undesirable behavior: we should not give a default parameter value and then overwrite it with another unreported value at runtime. Currently, there is no way to run with 0.02 even if set explicitly with a flag.

Symptom: running bds atac.bds gives:

== idr settings
    -idr_thresh <string>       : IDR threshold : -log_10(score) (default: 0.02).

But this value is overwritten with 0.1 at runtime.

leepc12 commented 8 years ago

You should be able to see the following warning when you start the pipeline.

Default IDR threshold is found (0.02). Setting it to 0.1 ...

Anshul wanted use 0.1 as a default IDR threshold for atac seq. So the pipeline checks if idr_thresh is set to default value (0.02) or not. If it's 0.02, pipeline changes it to 0.1.

Yes, it's kind of misleading that bds atac.bds shows 0.02 instead of 0.1. I will fix this in the next release.

Also, I need to deal with the case where a user EXPLICITLY define IDR threshold as 0.02. This will also be fixed in the next release.

chrisprobert commented 8 years ago

@leepc12: Could we factor out the idr_thresh parameter declaration from the idr module? We could either:

  1. Put the idr_thresh parameter definition and pipeline-specific default values in atac.bds and in chipseq.bds, and then have the idr module read this global parameter (perhaps fall back to 0.02 if it's not set, or just throw an error and exit if it's not set).
  2. Factor out idr_thresh as a global scope variable in the idr module, and just make it a required argument of _idr2. Calling methods can define the parameter however they want (using a pipeline-specific default value), and just pass their value to _idr2.
leepc12 commented 8 years ago

I took 2. and fixed it.