neurospin / pylearn-epac

Embarrassingly Parallel Array Computing: EPAC is a machine learning workflow builder.
BSD 3-Clause "New" or "Revised" License

Regular expression does not work in BaseNode.get_node #36

Closed: kribou closed this issue 11 years ago

kribou commented 11 years ago

Calling BaseNode.get_node with a regular expression (regexp=...) returns the wrong nodes: the same node comes back for every match. To reproduce:

>>> from epac import CV, Methods, Pipe
>>> from sklearn.svm import SVC
>>> from sklearn.feature_selection import SelectKBest
>>> y = [1, 1, 2, 2]
>>> wf = CV(Methods(*[Pipe(SelectKBest(k=k), SVC())
...     for k in [1, 5]]), n_folds=2, y=y)
>>> 
>>> for n in wf.walk_leaves(): print(n.get_key())
... 
CV/CV(nb=0)/Methods/SelectKBest(k=1)/SVC
CV/CV(nb=0)/Methods/SelectKBest(k=5)/SVC
CV/CV(nb=1)/Methods/SelectKBest(k=1)/SVC
CV/CV(nb=1)/Methods/SelectKBest(k=5)/SVC
>>> wf.get_node(key="CV/CV(nb=1)/Methods/SelectKBest(k=1)/SVC").get_key()
'CV/CV(nb=1)/Methods/SelectKBest(k=1)/SVC'
>>> for n in wf.get_node(regexp="CV/*"):
...     print(n.get_key())
... 
CV/CV(nb=1)
CV/CV(nb=1)

The expected results are:

CV/CV(nb=0)
CV/CV(nb=1)

JinpengLI commented 11 years ago

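Because get_node builds a VirtualList that reuses node memory, a node you already hold changes as the iteration moves on. This bug has been fixed by returning only key strings instead of nodes. The session below shows the memory reuse itself: node1 changes as soon as node2 is fetched.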

>>> from epac import CV, Methods, Pipe
>>> from sklearn.svm import SVC
>>> from sklearn.feature_selection import SelectKBest
>>> y = [1, 1, 2, 2]
>>> wf = CV(Methods(*[Pipe(SelectKBest(k=k), SVC())
...     for k in [1, 5]]), n_folds=2, y=y)
>>> node1 = wf.get_node("CV/CV(nb=0)")
>>> node1
CV/CV(nb=0)
>>> node2 = wf.get_node("CV/CV(nb=1)")
>>> node2
CV/CV(nb=1)
>>> node1
CV/CV(nb=1)
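
For anyone who hits this later, here is a minimal sketch of the aliasing effect described above. It does not use EPAC at all: Node and ReusingList are hypothetical stand-ins written for illustration, and the real VirtualList may be implemented differently. The only assumption is the one stated in this thread, namely that one node object is recycled for every item.

```python
class Node(object):
    """Hypothetical stand-in for a workflow node (not EPAC's BaseNode)."""

    def __init__(self, key):
        self.key = key

    def get_key(self):
        return self.key


class ReusingList(object):
    """Hypothetical lazy list that recycles ONE Node object for every item."""

    def __init__(self, keys):
        self._keys = list(keys)
        self._buffer = Node(None)  # shared object, overwritten on each access

    def __len__(self):
        return len(self._keys)

    def __getitem__(self, index):
        # Raises IndexError past the end, which also terminates iteration.
        self._buffer.key = self._keys[index]  # overwrite in place
        return self._buffer

    def keys(self):
        # Sketch of the fix: hand out immutable key strings, not shared nodes.
        return list(self._keys)


nodes = ReusingList(["CV/CV(nb=0)", "CV/CV(nb=1)"])

node1 = nodes[0]
key1_before = node1.get_key()   # 'CV/CV(nb=0)'
node2 = nodes[1]
key1_after = node1.get_key()    # 'CV/CV(nb=1)': node1 and node2 are one object

# Collecting the nodes first and reading them afterwards shows the duplicates.
materialized = [n.get_key() for n in list(nodes)]

# Strings cannot be overwritten behind the caller's back.
safe_keys = nodes.keys()

print(key1_before, key1_after)
print(materialized)
print(safe_keys)
```

Under this assumption every reference handed out points at the same object, so whichever key was loaded last is the one all of them report. Returning key strings avoids that because a string, once returned, is never modified; a node can then be looked up again by its key when it is actually needed.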