kzahedi / RNN.jl

RNN package for Julia

Completeness #1

Open hpoit opened 8 years ago

hpoit commented 8 years ago

Hi Keyan,

Would you mind if I did some heavy work on this package to make it as complete as possible?

That should include an example for each of the five cases Karpathy describes in http://karpathy.github.io/2015/05/21/rnn-effectiveness/

Kevin

kzahedi commented 8 years ago

Hi Kevin,

I am very happy to hear that you are interested in working on this package. I use RNNs to control robots, so that is a hard requirement for my package, and it also explains the input-output wrapper functions. I am not sure how this interferes with your goals. Could you tell me a bit more about your plans, so that we can decide on the best procedure? What types of applications are you looking at?

The best solution could be to add you as a contributor to RNN.jl; it could also be that simply including my package (maybe with some additions by you) in a new package is better suited for your purpose. That's why I would like to understand a bit better what you are interested in.

Cheers, Keyan

hpoit commented 8 years ago

Hi Keyan. I am not quite sure what the application is (it is an artificial dataset), but I think the model is many-to-one.

This is what the file looks like:

julia> showcols(train)
24976x108 DataFrames.DataFrame
| Col # | Name        | Eltype     | Missing |
|-------|-------------|------------|---------|
| 1     | x016399044a | Int64      | 0       |
| 2     | x023c68873b | UTF8String | 0       |
| 3     | x0342faceb5 | Int64      | 0       |
| 4     | x04e7268385 | Int64      | 0       |
| 5     | x06888ceac9 | Int64      | 0       |
| 6     | x072b7e8f27 | Float64    | 0       |
| 7     | x087235d61e | Int64      | 0       |
| 8     | x0b846350ef | Float64    | 0       |
| 9     | x0e2ab0831c | Float64    | 0       |
| 10    | x12eda2d982 | Float64    | 0       |
| 11    | x136c1727c3 | Float64    | 0       |
| 12    | x173b6590ae | Float64    | 0       |
| 13    | x174825d438 | Int64      | 0       |
| 14    | x1f222e3669 | Float64    | 0       |
| 15    | x1f3058af83 | Int64      | 0       |
| 16    | x1fa099bb01 | Int64      | 0       |
| 17    | x20f1afc5c7 | Float64    | 0       |
| 18    | x253eb5ef11 | Float64    | 0       |
| 19    | x25bbf0e7e7 | Int64      | 0       |
| 20    | x2719b72c0d | Float64    | 0       |
| 21    | x298ed82b22 | Float64    | 0       |
| 22    | x29bbd86997 | Float64    | 0       |
| 23    | x2a457d15d9 | Int64      | 0       |
| 24    | x2bc6ab42f7 | Float64    | 0       |
| 25    | x2d7fe4693a | Float64    | 0       |
| 26    | x2e874bc151 | Float64    | 0       |
| 27    | x361f93f4d1 | UTF8String | 0       |
| 28    | x384bec5dd1 | Int64      | 0       |
| 29    | x3df2300fa2 | Float64    | 0       |
| 30    | x3e200bf766 | Int64      | 0       |
| 31    | x3eb53ae932 | Float64    | 0       |
| 32    | x435dec85e2 | Float64    | 0       |
| 33    | x4468394575 | Float64    | 0       |
| 34    | x49756d8e0f | Float64    | 0       |
| 35    | x4fc17427c8 | Float64    | 0       |
| 36    | x55907cc1de | Float64    | 0       |
| 37    | x55cf3f7627 | Float64    | 0       |
| 38    | x56371466d7 | Int64      | 0       |
| 39    | x5b862c0a8f | Int64      | 0       |
| 40    | x5f360995ef | Int64      | 0       |
| 41    | x60ec1426ce | Float64    | 0       |
| 42    | x63bcf89b1d | Float64    | 0       |
| 43    | x6516422788 | Float64    | 0       |
| 44    | x65aed7dc1f | Int64      | 0       |
| 45    | x6db53d265a | Int64      | 0       |
| 46    | x7734c0c22f | Float64    | 0       |
| 47    | x7743f273c2 | Float64    | 0       |
| 48    | x779d13189e | Float64    | 0       |
| 49    | x77b3b41efa | Float64    | 0       |
| 50    | x7841b6a5b1 | Float64    | 0       |
| 51    | x789b5244a9 | Float64    | 0       |
| 52    | x7925993f42 | Float64    | 0       |
| 53    | x7cb7913148 | Int64      | 0       |
| 54    | x7fe6cb4c98 | Float64    | 0       |
| 55    | x8311343404 | Float64    | 0       |
| 56    | x87b982928b | Float64    | 0       |
| 57    | x8a21502326 | Float64    | 0       |
| 58    | x8c2e088a3d | Int64      | 0       |
| 59    | x8d0606b150 | UTF8String | 0       |
| 60    | x8de0382f02 | Float64    | 0       |
| 61    | x8f5f7c556a | Int64      | 0       |
| 62    | x91145d159d | UTF8String | 0       |
| 63    | x96c30c7eef | Float64    | 0       |
| 64    | x96e6f0be58 | Float64    | 0       |
| 65    | x98475257f7 | Float64    | 0       |
| 66    | x99d44111c9 | Float64    | 0       |
| 67    | x9a575e82a4 | Int64      | 0       |
| 68    | x9b6e0b36c2 | Float64    | 0       |
| 69    | a14fd026ce  | Int64      | 0       |
| 70    | a24802caa5  | Float64    | 0       |
| 71    | aa69c802b6  | Float64    | 0       |
| 72    | abca7a848f  | Int64      | 0       |
| 73    | ac826f0013  | Float64    | 0       |
| 74    | ae08d2297e  | Int64      | 0       |
| 75    | aee1e4fc85  | Float64    | 0       |
| 76    | b4112a94a6  | Float64    | 0       |
| 77    | b709f75447  | Float64    | 0       |
| 78    | b835dfe10f  | UTF8String | 0       |
| 79    | b9a487ac3c  | Float64    | 0       |
| 80    | ba54a2a637  | Float64    | 0       |
| 81    | bdf934caa7  | Float64    | 0       |
| 82    | beb6e17af1  | Int64      | 0       |
| 83    | c0c3df65b1  | Int64      | 0       |
| 84    | c1b8ce2354  | Float64    | 0       |
| 85    | c58f611921  | Float64    | 0       |
| 86    | d035af6ffa  | Float64    | 0       |
| 87    | d2c775fa99  | Int64      | 0       |
| 88    | d4d6566f9c  | Float64    | 0       |
| 89    | dcfcbc2ea1  | Int64      | 0       |
| 90    | e0a0772df0  | Float64    | 0       |
| 91    | e16e640635  | UTF8String | 0       |
| 92    | e5efa4d39a  | Float64    | 0       |
| 93    | e7ee22effb  | Int64      | 0       |
| 94    | e86a2190c1  | Int64      | 0       |
| 95    | ea0f4a32e3  | Float64    | 0       |
| 96    | ed7e658a27  | Float64    | 0       |
| 97    | ee2ac696ff  | Float64    | 0       |
| 98    | f013b60e50  | Float64    | 0       |
| 99    | f0a0febd35  | Float64    | 0       |
| 100   | f1f0984934  | UTF8String | 0       |
| 101   | f66b98dd69  | Float64    | 0       |
| 102   | fbf66c8021  | Float64    | 0       |
| 103   | fdf8628ca7  | Int64      | 0       |
| 104   | fe0318e273  | Float64    | 0       |
| 105   | fe8cdd80ba  | Float64    | 0       |
| 106   | ffd1cdcfc1  | Float64    | 0       |
| 107   | id          | Int64      | 0       |
| 108   | target      | Float64    | 0       |

The UTF8Strings are hashed categorical values (probably name, address, e-mail, phone, etc.); I won't worry about them.

What the target looks like:

julia> train[108]
24976-element DataArrays.DataArray{Float64,1}:
  0.536383
  0.426536
  0.498649
  1.01357
  4.00876
  1.77013
  0.895514
  0.348059
  0.547179
  2.38998
  3.36091
  1.4124
 -0.0996988
  0.40257
  0.896381
  1.97511
  0.315973
  1.35555
  5.32482
  0.591043
  0.610452
  1.45285
  0.179439
  3.16545
  ⋮
  7.6247
  0.340788
  0.205181
  2.04826
  0.150869
  0.775302
  0.214532
  0.1074
  0.46036
  0.241159
  1.73751
  0.390916
  3.03904
  2.71701
  2.94019
  0.309083
  0.998051
  0.762129
  0.377102
  3.88774
  2.39431
  0.50462
 23.0103
  2.53425

I would like to code an RNN that fits this training set and generalizes well to a test dataset.

I am also checking to see if echo state nets are better suited.

Kevin

kzahedi commented 8 years ago

I will add you as a contributor. Please create a branch first, in which you make your changes to the code. We will merge the code every now and then.

hpoit commented 8 years ago

Thanks Keyan. I am confirming whether the inputs and targets per row will work on it; then I will fork it and ask you to pull, say, into a copied branch of your master, perhaps named 'toward JuliaLang'. Once I find the application of the problem I'm solving, I intend to state it and make a notebook of it like the MNIST tutorial. Hopefully many other cases and applications of RNNs will be added here.

Perhaps @pluskid from Mocha.jl could guide us and/or be a co-owner of http://rnnjl.readthedocs.io/en/latest/index.html?

hpoit commented 8 years ago

Hi Keyan! I'm still analyzing the training of an ESN (which you can interpret as a special case of an RNN). I'll post a question here to which you might know the answer.

The output of an ESN is y(n+1) = f^out(W^out (u(n+1); x(n+1))), where (u(n+1); x(n+1)) denotes the concatenated vector made from the input (u(n+1)) and internal/hidden (x(n+1)) activation vectors.
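For concreteness, the readout equation can be sketched in NumPy (just an illustration of the formula; the sizes K, N, L are hypothetical, and f^out is taken to be the identity):

```python
import numpy as np

# Hypothetical sizes: K inputs, N reservoir units, L outputs
K, N, L = 99, 200, 1
rng = np.random.default_rng(0)

W_out = rng.standard_normal((L, K + N))   # readout weights (these would be trained)
u = rng.standard_normal(K)                # input activation vector u(n+1)
x = rng.standard_normal(N)                # internal activation vector x(n+1)

# y(n+1) = f_out(W_out @ [u(n+1); x(n+1)]); here f_out is the identity
y = W_out @ np.concatenate([u, x])
print(y.shape)  # (1,)
```

Note that nothing forces K, N, and L to be equal: W^out has shape L x (K + N), so the number of input, reservoir, and output units can all differ.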

From the artificial training dataset below, I have:

- 7 UTF8String categorical columns (not used as input)
- 1 id numerical column (not used at all)
- 1 target numerical column (used as y)
- 99 numerical columns (used as x)

Can I consider all 99 numerical columns as inputs, i.e., 99 input units? If so, do they have to be connected to 99 internal units, and those to 99 output units? As a reference I am using the formal description of an RNN, formulas 1.1-1.7 starting on page 6/46 of Jaeger's ESN tutorial.

I've searched for help on forums and by email, but am having a really hard time finding it (perhaps because, as Herbert Jaeger estimates, '95% of publications on neural nets concern feedforward nets'). From my understanding the answer is yes, but I would be more assured with your confirmation.

1x108 DataFrames.DataFrame

Row x016399044a x023c68873b x0342faceb5 x04e7268385 x06888ceac9 x072b7e8f27 x087235d61e x0b846350ef x0e2ab0831c x12eda2d982
1 6447 "d19e3b17239b50f7055ea4ea09f15e5a" 5372 35812 1 171.464 14 -0.318226 0.207681 76.3025
Row x136c1727c3 x173b6590ae x174825d438 x1f222e3669 x1f3058af83 x1fa099bb01 x20f1afc5c7 x253eb5ef11 x25bbf0e7e7 x2719b72c0d x298ed82b22
1 4.65365 0.966106 614 22.5165 35363 31 7.279 0.820827 66 1.03398e5 0.766197
Row x29bbd86997 x2a457d15d9 x2bc6ab42f7 x2d7fe4693a x2e874bc151 x361f93f4d1 x384bec5dd1 x3df2300fa2 x3e200bf766 x3eb53ae932
1 0.554 94 140.654 1.40322 0.536931 "4c403ff2b026fdea3583064ddca221bd" 1 231.405 37936 2.21241
Row x435dec85e2 x4468394575 x49756d8e0f x4fc17427c8 x55907cc1de x55cf3f7627 x56371466d7 x5b862c0a8f x5f360995ef x60ec1426ce x63bcf89b1d
1 0.156108 9.21599 0.872587 -0.35723 0.743164 1.00093 13821 5107 481 0.591463 0.357978
Row x6516422788 x65aed7dc1f x6db53d265a x7734c0c22f x7743f273c2 x779d13189e x77b3b41efa x7841b6a5b1 x789b5244a9 x7925993f42 x7cb7913148
1 256.309 19 2238 106.943 9.95844 0.0408489 -59.4766 1788.05 0.758338 11.5362 0
Row x7fe6cb4c98 x8311343404 x87b982928b x8a21502326 x8c2e088a3d x8d0606b150 x8de0382f02 x8f5f7c556a
1 9.98577 7.51776 0.868664 1281.12 47477 "2912e57d5df15a32301c3440c0b2d326" 28859.3 13
Row x91145d159d x96c30c7eef x96e6f0be58 x98475257f7 x99d44111c9 x9a575e82a4 x9b6e0b36c2 a14fd026ce a24802caa5 aa69c802b6
1 "6220f63c720e759cebac416cd296db92" 0.735624 2.99288 1.35842 42252.3 1 2.41066 8 -3.18487 -1.24541
Row abca7a848f ac826f0013 ae08d2297e aee1e4fc85 b4112a94a6 b709f75447 b835dfe10f b9a487ac3c ba54a2a637 bdf934caa7
1 39713 4.30032 41166 0.857421 0.42577 -0.341316 "a5f9a2e7ace0c54439d5905c0447241e" 0.653498 0.446879 1.05966
Row beb6e17af1 c0c3df65b1 c1b8ce2354 c58f611921 d035af6ffa d2c775fa99 d4d6566f9c dcfcbc2ea1 e0a0772df0 e16e640635
1 12684 33489 6.58226e7 0.33991 1576.54 2996 3.4575e5 39365 0.41533 "c2f8ba948306c79f2c21b3c93408c6c5"
Row e5efa4d39a e7ee22effb e86a2190c1 ea0f4a32e3 ed7e658a27 ee2ac696ff f013b60e50 f0a0febd35 f1f0984934 f66b98dd69
1 24454.5 65 363 0.80537 11953.5 1.00961 0.970187 -5.99427 "abe90d9881c0790f47e139cf3e915f2f" 0.95833
Row fbf66c8021 fdf8628ca7 fe0318e273 fe8cdd80ba ffd1cdcfc1 id target
1 91.7786 1058 1.0005 40.2978 1.74053 0 0.536383

kzahedi commented 8 years ago

ESN = Echo State Network?

hpoit commented 8 years ago

Correct. I'm sorry.

kzahedi commented 8 years ago

Coincidentally, I used to work in Herbert Jaeger's group when he discovered ESNs :) But that was some time ago.

The idea is that you have a recurrent structure with random connectivity. I believe there are restrictions on the weights of the recurrent part of the network; if I am not mistaken, the weights must be positive and smaller than one. You then have two additional layers, one input and one output layer. I don't remember how the input layer was connected, but there is a ton of literature out there. You could google Jochen Triesch (Frankfurt), who has used them, or Oliver Obst (Sydney), who used them extensively. The magic of ESNs is that only the weights from the reservoir to the output layer need to be trained, by something as simple as backprop.
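A rough sketch of that recipe (NumPy for illustration since the thread has no code; the sizes and the toy task are made up, and the readout is fitted here by ridge regression, the common ESN training method, rather than backprop):

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, L, T = 3, 50, 1, 500              # hypothetical: inputs, reservoir size, outputs, timesteps

W_in = rng.uniform(-0.5, 0.5, (N, K))    # input-to-reservoir weights (fixed, random)
W = rng.standard_normal((N, N))          # reservoir weights (fixed, random)
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # scale spectral radius below one

U = rng.standard_normal((T, K))          # toy input sequence
Y = 2.0 * U[:, :1]                       # toy target: a linear function of the input

# Run the reservoir once and collect the concatenated states [u(n); x(n)]
x = np.zeros(N)
states = np.zeros((T, K + N))
for n in range(T):
    x = np.tanh(W_in @ U[n] + W @ x)
    states[n] = np.concatenate([U[n], x])

# Train ONLY the readout weights, via ridge regression (linear least squares)
lam = 1e-6
W_out = np.linalg.solve(states.T @ states + lam * np.eye(K + N),
                        states.T @ Y).T  # shape (L, K + N)

pred = states @ W_out.T
print(np.mean((pred - Y) ** 2))          # training error; near zero for this linear target
```

Only `W_out` is learned; `W_in` and `W` stay at their random initial values, which is what makes ESN training so cheap.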

I hope that helps.

hpoit commented 8 years ago

That is exciting! I rewrote the question while you replied, and you referred exactly to where I'm stuck: given 99 input types, how many neurons for each layer? I will look up Jochen and Oliver, thank you! It makes me even happier that you have a foot in Julia!