MLBazaar / MLBlocks

A library for composing end-to-end tunable machine learning pipelines.
https://mlbazaar.github.io/MLBlocks
MIT License

Ability to return intermediate context #110

Closed csala closed 5 years ago

csala commented 5 years ago

As a result of the work on issue #104, one can now get all the output variables from a single block by specifying either the block index or the block name. However, the ability to return the entire pipeline context right after a block has been executed has been lost.

Since the old behavior allowed easy step-by-step execution of the pipeline, and the new one can easily be achieved by specifying each of the wanted variables individually, the goal of this issue is to recover part of the old functionality.

The new behavior should be:

The outputs specification can be either a single element or a list of elements, and each element can be any of the following: a named output, a full variable name, a block index, or a block name.

Named outputs and full variable names will work exactly as they did in the issue #104 implementation.

Block indexes and block names will each be translated to a single variable specification, named after the block, whose output is a deepcopy of the context taken right after that block has been executed.
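The translation rules above can be sketched in plain Python. This is only an illustration and not the MLBlocks implementation: the helper `resolve_outputs`, its arguments `block_names` and `named_outputs`, and the sample data are all hypothetical, and the sketch assumes that full variable names are written as the block name followed by a dot and the variable name.

```python
def resolve_outputs(outputs, block_names, named_outputs):
    """Translate an outputs specification into a list of variable specs.

    Hypothetical helper for illustration; not the MLBlocks implementation.
    """
    # Accept either a single element or a list of elements.
    if not isinstance(outputs, (list, tuple)):
        outputs = [outputs]

    resolved = []
    for output in outputs:
        if isinstance(output, int):
            # Block index: translate it to the name of the block at that position.
            output = block_names[output]

        if output in named_outputs:
            # Named output: expand to the variables it was declared with.
            resolved.extend(dict(spec) for spec in named_outputs[output])
        elif output in block_names:
            # Block name: a single spec, named after the block, standing for
            # the whole context right after that block.
            resolved.append({'name': output, 'variable': output})
        elif output.rsplit('.', 1)[0] in block_names:
            # Full variable name, assumed to be "<block name>.<variable name>".
            resolved.append({'name': output, 'variable': output})
        else:
            raise ValueError('Unknown output specification: {!r}'.format(output))

    return resolved


# Hypothetical pipeline description used only to exercise the helper.
BLOCK_NAMES = ['first_primitive_name#1', 'a_primitive_name#1']
NAMED_OUTPUTS = {
    'debug': [
        {'name': 'a_name', 'variable': 'a_primitive_name#1.a_variable_name'},
    ],
}

by_index = resolve_outputs(1, BLOCK_NAMES, NAMED_OUTPUTS)
by_name = resolve_outputs('a_primitive_name#1', BLOCK_NAMES, NAMED_OUTPUTS)
```

In this sketch a block index is first mapped to its block name, so `by_index` and `by_name` end up identical, which is the consistency between indexes and names that the proposal asks for.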

Examples

Get the variables from an output called debug:

>>> pipeline.get_outputs('debug')
[{'name': 'a_name', 'variable': 'a_primitive_name#1.a_variable_name'}]
>>> a_name = pipeline.predict(..., output_='debug')

Get a single variable:

>>> pipeline.get_outputs('a_primitive_name#1.a_variable_name')
[{'name': 'a_primitive_name#1.a_variable_name',
  'variable': 'a_primitive_name#1.a_variable_name'}]
>>> a_variable_name = pipeline.predict(..., output_='a_primitive_name#1.a_variable_name')

Get the complete context after the block called a_primitive_name#1:

>>> pipeline.get_outputs('a_primitive_name#1')
[{'name': 'a_primitive_name#1',
  'variable': 'a_primitive_name#1'}]
>>> context_after_block_1 = pipeline.predict(..., output_='a_primitive_name#1')

Get the complete context after the second block:

>>> pipeline.get_outputs(1)
[{'name': 'a_primitive_name#1',
  'variable': 'a_primitive_name#1'}]
>>> context_after_block_1 = pipeline.predict(..., output_=1)

The behavior should be consistent with a list specification:

>>> pipeline.get_outputs([0, 1, 2])
[{'name': 'first_primitive_name#1',
  'variable': 'first_primitive_name#1'},
 {'name': 'second_primitive_name#1',
  'variable': 'second_primitive_name#1'},
 {'name': 'third_primitive_name#1',
  'variable': 'third_primitive_name#1'}]
>>> context_0, context_1, context_2 = pipeline.predict(..., output_=[0, 1, 2])
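
To illustrate why the contexts are returned as deepcopies, here is a minimal toy sketch of a pipeline that snapshots its context after the requested blocks. It does not use MLBlocks: the function `run_pipeline`, the block names and the lambda "primitives" are all hypothetical stand-ins chosen only to make the example runnable.

```python
import copy


def run_pipeline(blocks, context, output_indexes):
    """Run blocks in order and return context snapshots for the given indexes.

    Toy stand-in for illustration; not the MLBlocks implementation.
    """
    context = dict(context)
    snapshots = {}
    for index, (name, func) in enumerate(blocks):
        # Each block reads the context and contributes new or updated variables.
        context.update(func(context))
        if index in output_indexes:
            # Deepcopy so that later blocks, or the caller, cannot alter
            # what this snapshot contains.
            snapshots[index] = copy.deepcopy(context)

    return [snapshots[index] for index in output_indexes]


# Hypothetical three-block pipeline.
blocks = [
    ('scale#1', lambda ctx: {'X': [x / 10 for x in ctx['X']]}),
    ('sum#1', lambda ctx: {'total': sum(ctx['X'])}),
    ('flag#1', lambda ctx: {'y': ctx['total'] > 1}),
]

context_0, context_1, context_2 = run_pipeline(blocks, {'X': [10, 20, 30]}, [0, 1, 2])

# Mutating one snapshot leaves the others untouched.
context_0['X'].append(99)
```

Because every snapshot is an independent copy, each intermediate context can be inspected or modified while debugging a pipeline step by step without affecting the others.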