chapmanb / bcbb

Incubator for useful bioinformatics code, primarily in Python and R
http://bcbio.wordpress.com
604 stars, 243 forks

Memoize refactor #66

Closed: roryk closed this 11 years ago

roryk commented 11 years ago

Refactored memoize_outfile into two decorators, a memoization decorator and a filename transformation decorator, transform_to.

transform_to got a new capability: if out_dir is a kwarg in the decorated function, transform_to will use that as the output directory instead of the in_file directory.
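To illustrate the out_dir behaviour described above, here is a minimal sketch. It is not the code from this pull request: the wrapper signature, the out_file keyword and the align function are assumptions made for the example.

```python
import functools
import os


def transform_to(ext):
    """Hypothetical sketch: derive out_file from in_file by swapping the extension."""
    def decorator(f):
        @functools.wraps(f)
        def wrapper(in_file, *args, **kwargs):
            # Use out_dir if the caller passed it as a keyword,
            # otherwise fall back to the directory of in_file.
            out_dir = kwargs.get("out_dir") or os.path.dirname(in_file)
            base = os.path.splitext(os.path.basename(in_file))[0]
            # Assumption: the decorated function accepts an out_file keyword.
            kwargs["out_file"] = os.path.join(out_dir, base + ext)
            return f(in_file, *args, **kwargs)
        return wrapper
    return decorator


@transform_to(".bam")
def align(in_file, out_dir=None, out_file=None):
    # A real function would write out_file; this one only reports the name.
    return out_file
```

With this sketch, align("/data/sample.fastq") would target the input's directory, while align("/data/sample.fastq", out_dir="/tmp/work") would target /tmp/work.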

Added a filter_to decorator that appends a word onto the stem of a filename, so you can generate a new filename of the same type without knowing what that type is.
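A rough sketch of the filter_to idea, assuming the decorated function takes in_file first and receives the generated name through an out_file keyword (both are assumptions for the example, as is the trim function; the merged code may differ):

```python
import functools
import os


def filter_to(word):
    """Hypothetical sketch: append `word` to the filename stem, keeping the extension."""
    def decorator(f):
        @functools.wraps(f)
        def wrapper(in_file, *args, **kwargs):
            # Split off the extension so the file type never has to be known.
            stem, ext = os.path.splitext(in_file)
            kwargs["out_file"] = stem + word + ext
            return f(in_file, *args, **kwargs)
        return wrapper
    return decorator


@filter_to("_trimmed")
def trim(in_file, out_file=None):
    # A real function would write the filtered reads to out_file.
    return out_file
```

The same decorator works unchanged on a fastq or a bam input, since the extension is simply carried over.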

chapmanb commented 11 years ago

Rory; Thanks for looking at this. I started playing with memoize_outfile but was never really happy with what I came up with.

I like your idea of adding a stem in addition to the extension. Instead of having dependent decorators, what do you think about an implementation that accepts keywords like:

@memoize_outfile(ext="-toadd.bam")

@memoize_outfile(stem="toadd")

Then we could dispatch to the correct file generation based on the keyword passed.
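The keyword dispatch could look roughly like the sketch below. The ext and stem keywords come from the proposal above; everything else is an assumption for illustration, including the (in_file, out_file) signature, skipping the call when out_file already exists, and appending stem as-is with no separator.

```python
import functools
import os
import tempfile


def memoize_outfile(ext=None, stem=None):
    """Hypothetical sketch: build out_file from ext or stem; skip work if it exists."""
    if (ext is None) == (stem is None):
        raise ValueError("pass exactly one of ext or stem")

    def decorator(f):
        @functools.wraps(f)
        def wrapper(in_file, *args, **kwargs):
            base, in_ext = os.path.splitext(in_file)
            if ext is not None:
                out_file = base + ext            # replace the extension
            else:
                out_file = base + stem + in_ext  # extend the stem, keep the type
            # Memoization: only run the function if the output is missing.
            if not os.path.exists(out_file):
                f(in_file, out_file, *args, **kwargs)
            return out_file
        return wrapper
    return decorator


calls = []


@memoize_outfile(ext="-toadd.bam")
def convert(in_file, out_file):
    calls.append(in_file)
    with open(out_file, "w") as handle:
        handle.write("converted\n")


with tempfile.TemporaryDirectory() as tmp:
    in_file = os.path.join(tmp, "sample.sam")
    open(in_file, "w").close()
    first = convert(in_file)
    second = convert(in_file)  # out_file exists now, so the body is skipped
```

A single decorator like this avoids having to stack two decorators in a particular order, at the cost of one function doing both the naming and the memoization.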

roryk commented 11 years ago

Hi Brad,

I think that is a good idea. It is nice for each decorator to have a single, well-defined function, but having to stack the two separate decorators in the correct order to make it work stinks.

Rory

(Sent by email in reply to Brad Chapman's comment of Jan 16, 2013, 8:41 PM.)

roryk commented 11 years ago

Hi Brad,

I think this is all set to go. I am bad at figuring out how to test bcbio to make sure I am not breaking things, though. Maybe you could sit down with me for a little bit and get me going with a minimal setup?

chapmanb commented 11 years ago

Thanks Rory, I got this all merged. The only change was moving the fastq subdirectory into the poorly named 'bam' namespace, which also contains some other fastq processing utils. I used bcbio.fastq in this project, which has some useful utilities:

https://github.com/hbc/projects/tree/master/jl_hiv/bcbio/fastq

Terrible namespace management for sure. It might be worth merging that over into bcbio-nextgen at some point if the functionality is generally useful.