Closed: jmaggio14 closed this issue 4 years ago
special block that runs pipelines
(edit with details)
I propose a "node"-like syntax that combines the best features of the two classes. In this model, a Pipeline is a special subclass of Block whose process_strategy / train_strategy simply validates and calls a number of blocks.
These definitions show the primary methods/attributes that a Block and a Pipeline currently have. There is quite a bit of overlap, with notable exceptions in bold.
Block:
- name
- *IO Mapping* - shape, type
- train() - if implemented
- processing methods / hooks
  - before()
  - process()
  - label()
  - after()
Pipeline:
- name
- *list methods - add, del, copy, etc.*
- *validate*
- *_step*
- training methods / hooks
  - before()
  - process()
  - label()
  - after()
- processing methods / hooks
  - before()
  - process()
  - label()
  - after()
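The unified model could be sketched roughly like this (the class bodies and method names here are illustrative stand-ins, not the actual imagepypelines API):

```python
class Block:
    """Minimal stand-in for a processing unit (hypothetical API)."""
    def __init__(self, name):
        self.name = name

    def process(self, data):
        raise NotImplementedError


class Pipeline(Block):
    """A Pipeline is itself a Block: its process() simply calls its
    child blocks in order, so pipelines nest for free."""
    def __init__(self, blocks, name="pipeline"):
        super().__init__(name)
        self.blocks = list(blocks)  # list methods (add, del, copy) act here

    def process(self, data):
        # process_strategy: pass each block's output to the next block
        for block in self.blocks:
            data = block.process(data)
        return data
```

Because Pipeline subclasses Block, any Pipeline can appear wherever a Block is expected, which is exactly what enables nesting one pipeline inside another.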
""" Loads an image, resizes it to 512x512, and converts to grayscale. """
import imagepypelines as ip
test_image = ip.lenna() # :(
resizer = ip.blocks.Resizer(512,512)
color2gray = ip.blocks.Color2Gray()
grayscale_resize_pipeline = ip.Pipeline(blocks=[resizer,color2gray])
output = grayscale_resize_pipeline.run([test_image])
Now, let's use this pipeline as the first "block" in another pipeline.
```python
edgemap_by_freq = ip.blocks.Highpass(cutoff=100)
viewer = ip.blocks.BlockViewer()

nested_pipeline = ip.Pipeline(blocks=[grayscale_resize_pipeline, edgemap_by_freq, viewer])
nested_pipeline.run([test_image])
```
Fundamentally, this is the same as constructing one linear pipeline from all of the example blocks above. However, it allows more flexible pipeline creation (with unlimited nesting!) while keeping the same power as before (different training methods, etc.). Additionally, it simplifies the syntax the user has to memorize, because a Pipeline will behave very much like any other Block.
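To make the equivalence concrete, nested pipelines could be expanded into a single linear block list. A rough sketch with a hypothetical helper (not part of imagepypelines) that treats anything carrying a `blocks` attribute as a nested pipeline:

```python
class Node:
    """Stand-in for a block; a 'pipeline' is any node with child blocks."""
    def __init__(self, name, blocks=None):
        self.name = name
        self.blocks = blocks  # None for a leaf block


def flatten(pipeline):
    """Recursively expand nested pipelines into a flat list of leaf blocks."""
    flat = []
    for block in pipeline.blocks:
        if getattr(block, "blocks", None):  # nested pipeline
            flat.extend(flatten(block))
        else:
            flat.append(block)
    return flat
```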
Major downsides of this approach include debugging and validating block IO types. I think both downsides can realistically be mitigated if we assume blocks will always be used inside a pipeline. In other words, we assume a user will never want (or get) those features on an individual block, and implement them only in the Pipeline. That way, nothing really changes for the user, but we get nesting for free. This requires there always to be a pipeline container for serialization, debugging, and validation.
As per @natedileas' idea.
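If validation lives only in the pipeline container, it could, for example, walk the flattened block sequence and check pairwise IO compatibility. A minimal sketch, assuming each block exposes hypothetical `in_types` / `out_types` sets (these attribute names are illustrative, not the library's):

```python
from dataclasses import dataclass


@dataclass
class IOSpec:
    """Stand-in for a block's IO mapping (hypothetical attributes)."""
    name: str
    in_types: set
    out_types: set


def validate(blocks):
    """Raise if any block's outputs cannot feed the next block's inputs."""
    for upstream, downstream in zip(blocks, blocks[1:]):
        if not (upstream.out_types & downstream.in_types):
            raise TypeError(
                f"{upstream.name} outputs {upstream.out_types}, but "
                f"{downstream.name} accepts {downstream.in_types}"
            )
```

Running this check once at the pipeline level means individual blocks never need their own validation machinery, which is the trade-off proposed above.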