IndicoDataSolutions / Passage

A little library for text analysis with RNNs.
MIT License

Trying to apply Passage to regression problem #24

Closed agogolev closed 9 years ago

agogolev commented 9 years ago

Hello,

It's more a question than a bug report, because I'm not sure Passage is meant to work for my problem. It's definitely a regression problem, but the values are very chaotic and a regular regression approach doesn't work well. So I hoped I could apply Passage to a regression problem just by passing 'linear' as the activation function of each layer.

I have a time series, and the data looks like this (simplified to make the example clearer):

step 1: [34, 53, 10]
step 2: [23, 14, 77]
step 3: [12, 43, 90]
step 4: [93, 22, 31]
step 5: [1, 10, 53]

I'm trying to predict the next step's values using the data from the previous two steps. So X and Y look like this:

X = [
    [34, 53, 10] + [23, 14, 77],
    [23, 14, 77] + [12, 43, 90],
    [12, 43, 90] + [93, 22, 31]
]

Y = [
    [12, 43, 90],
    [93, 22, 31],
    [1, 10, 53]
]
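The X/Y layout above can be built programmatically. This is a minimal NumPy sketch of that windowing, not something from Passage itself; the helper name `make_windows` and its `n_prev` parameter are made up for illustration.

```python
import numpy as np

# Toy series from the question: 5 time steps, 3 values per step.
series = np.array([
    [34, 53, 10],
    [23, 14, 77],
    [12, 43, 90],
    [93, 22, 31],
    [1, 10, 53],
], dtype=float)


def make_windows(series, n_prev=2):
    """Pair each run of n_prev consecutive steps with the step that follows it.

    Hypothetical helper: each row of X is n_prev steps concatenated,
    each row of Y is the next step.
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_prev
    X = np.stack([series[i:i + n_prev].ravel() for i in range(n_samples)])
    Y = series[n_prev:]
    return X, Y


X, Y = make_windows(series)
print(X.shape, Y.shape)
```

With five steps and a window of two, this yields three training pairs, matching the X and Y written out by hand.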

When I call .fit(), I see that Passage tries to treat the vectors in X as sparse vectors. It assumes, for example, that the number 34 (the value of X[0][0]) is an index rather than a value.
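To make the index-versus-value distinction concrete, here is a small NumPy sketch. It only illustrates the two interpretations and is not Passage's actual code; the lookup table `embedding` and its sizes are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical lookup table: one learned vector per possible index.
vocab_size, embed_dim = 100, 4
embedding = rng.normal(size=(vocab_size, embed_dim))

row = np.array([34, 53, 10, 23, 14, 77])

# Index interpretation: each number selects a row of the table,
# so 34 means "item number 34", not the quantity 34.
as_indices = embedding[row]       # shape (6, 4)

# Value interpretation: the numbers themselves are the features.
as_values = row.astype(float)     # shape (6,)

print(as_indices.shape, as_values.shape)
```

Under the index interpretation, 34 and 35 are unrelated items; under the value interpretation they are nearly the same input, which is what a regression on measurements needs.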

So I have two questions.

  1. Is it possible to apply Passage to regression problems?
  2. How do I tell Passage to treat the data as ordinary dense vectors rather than sparse vectors?

Thank you

Newmu commented 9 years ago

For real-valued sequences, use the Generic layer; see the MNIST RNN in examples for usage. Yes, you should be able to do regression with Passage. You will want to change cost to 'mse' or 'mae' for mean squared or mean absolute error.
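For reference, the two costs mentioned here can be written out in plain NumPy. This is a sketch of the standard definitions, not Passage's implementation; the prediction values are made up.

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean((y_true - y_pred) ** 2)


def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs(y_true - y_pred))


y_true = [[12.0, 43.0, 90.0]]   # a target row from the question
y_pred = [[10.0, 45.0, 80.0]]   # invented prediction for illustration

print(mse(y_true, y_pred), mae(y_true, y_pred))
```

Squared error penalizes the single large miss (10) much more heavily than absolute error does, which matters for noisy targets like the ones described in the question.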

Slater-Victoroff commented 9 years ago

Going to close this issue as it seems resolved by @Newmu's answer, and is closer to a question.

Feel free to re-open or open a new issue if you have more questions.

agogolev commented 9 years ago

@Newmu @Slater-Victoroff

Thank you for the quick response.

One more question, though. The mnist.py example fails for me locally (Ubuntu 14.04). Can you please confirm that the mnist.py example works for you? I just want to be sure that it's a problem with my local Theano setup.

Here is the error trace I've got:

Traceback (most recent call last):
  File "/home/alex/w/passage/examples/mnist.py", line 23, in <module>
    model = RNN(layers=layers, updater=updater, iterator='linear', cost='cce')
  File "/home/alex/w/passage/passage/models.py", line 51, in __init__
    self._train = theano.function([self.X, self.Y], cost, updates=self.updates)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/compile/function.py", line 223, in function
    profile=profile)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/compile/pfunc.py", line 512, in pfunc
    on_unused_input=on_unused_input)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/compile/function_module.py", line 1312, in orig_function
    defaults)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/compile/function_module.py", line 1181, in create
    _fn, _i, _o = self.linker.make_thunk(input_storage=input_storage_lists)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/gof/link.py", line 434, in make_thunk
    output_storage=output_storage)[:3]
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/gof/vm.py", line 847, in make_all
    no_recycling))
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/gof/op.py", line 606, in make_thunk
    output_storage=node_output_storage)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/gof/cc.py", line 948, in make_thunk
    keep_lock=keep_lock)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/gof/cc.py", line 891, in __compile__
    keep_lock=keep_lock)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/gof/cc.py", line 1322, in cthunk_factory
    key=key, fn=self.compile_cmodule_by_step, keep_lock=keep_lock)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/gof/cmodule.py", line 996, in module_from_key
    module = next(compile_steps)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/gof/cc.py", line 1237, in compile_cmodule_by_step
    preargs=preargs)
  File "/home/alex/v/tango/local/lib/python2.7/site-packages/theano/gof/cmodule.py", line 1971, in compile_str
    (status, compile_stderr.replace('\n', '. ')))
Exception: ('The following error happened while compiling the node', Gemm{no_inplace}(<TensorType(float64, matrix)>, TensorConstant{0.01}, InplaceDimShuffle{1,0}.0, Elemwise{Composite{[Composite{[Composite{[add(i0, neg(i1))]}(i0, mul(i1, i2))]}(i0, EQ(i1, i2), i3)]}}[(0, 0)].0, TensorConstant{1e-06}), '\n', 'Compilation failed (return status=1): /usr/bin/ld: cannot find -lblas. collect2: error: ld returned 1 exit status. ', '[Gemm{no_inplace}(<TensorType(float64, matrix)>, TensorConstant{0.01}, <TensorType(float64, matrix)>, <TensorType(float64, matrix)>, TensorConstant{1e-06})]')
agogolev commented 9 years ago

@Slater-Victoroff Can't reopen the issue, no 'reopen' button on my screen.

Newmu commented 9 years ago

/usr/bin/ld: cannot find -lblas is a Theano setup issue. See the config guide at http://deeplearning.net/software/theano/library/config.html, and specifically check which BLAS library is installed and which one is linked in your .theanorc.
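As a rough first check of which BLAS libraries are visible on the machine, something like the following standard-library sketch may help. The helper name and the list of candidate names are assumptions. Note also that `ctypes.util.find_library` looks for shared libraries the way the runtime loader does, so a hit does not guarantee that the compile-time linker can resolve `-lblas` (that usually also needs the development package). Whatever you find is then what Theano's `blas.ldflags` setting should point at.

```python
import ctypes.util


def find_blas_candidates(names=("blas", "openblas", "atlas")):
    """Map each candidate library name to what the loader finds, or None.

    Hypothetical helper; the candidate names are common BLAS
    implementations, not an exhaustive list.
    """
    return {name: ctypes.util.find_library(name) for name in names}


candidates = find_blas_candidates()
for name, found in candidates.items():
    print(name, "->", found)
```

If every entry comes back as None, no BLAS shared library is installed in a standard location, which would be consistent with the linker error in the trace.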

agogolev commented 9 years ago

@Newmu Thanks, that helped.