karpathy / convnetjs

Deep Learning in Javascript. Train Convolutional Neural Networks (or ordinary ones) in your browser.
MIT License

Is there a thorough explanation of the inner workings of convnet.js & deepqlearn.js? #42

Open marcoippolito opened 9 years ago

marcoippolito commented 9 years ago

Hi Andrej, is there a thorough explanation of the inner workings of convnet.js & deepqlearn.js? I have some questions to clarify.

Marco

karpathy commented 9 years ago

I'm afraid not, beyond what's in the code. Or you can ask here.

EDIT: there are also some docs on the main convnetjs.com page.

marcoippolito commented 9 years ago

Ok. I'm going to sum up all my remaining doubts in a list (hopefully not that long). (I've been reading https://github.com/cs231n/cs231n.github.io/blob/master/convolutional-networks.md and some doubts were already resolved.) Thanks for the support, Andrej. Marco

marcoippolito commented 9 years ago

Hi Andrej, my objective is to use the deep reinforcement learning approach and algorithms for intelligent web crawling, as briefly shown in the paper "Learning to Surface Deep Web Content" by Zhaohui Wu, Lu Jiang, Qinghua Zheng, Jun Liu: www.aaai.org/ocs/index.php/AAAI/AAAI10/paper/viewFile/1652/2326

I have put together a starting list (please forgive me if I have a few other doubts to clarify in the future) of questions, so as to properly understand the inner workings of your great work and to use and adapt it to what I have to do:

1) https://github.com/karpathy/convnetjs/blob/master/src/convnet_vol.js#L51
"var ix=((this.sx * y)+x)*this.depth+d": since var Vol = function(sx, sy, depth, c) and var n = sx*sy*depth, where sx is the width and sy the height,

I do not understand why we add +x, and then add d (after multiplying by this.depth). Is it related to x = x + step_size * x_derivative (http://karpathy.github.io/neuralnets/)? (See the indexing sketch after this list.)

2) https://github.com/karpathy/convnetjs/blob/master/src/convnet_layers_dotproducts.js#L70 : a += f.w[((f.sx * fy)+fx)*f.depth+fd] * V.w[((V_sx * oy)+ox)*V.depth+fd] : I guess the dot product between the entries of the filter and the input is done through f.w[...] * V.w[...] : am I right or completely off track?

3) https://github.com/karpathy/convnetjs/blob/master/src/convnet_vol_util.js#L38 : I do not understand W.get(W.sx - x - 1, y, d): could you please give me some hints about the -x - 1?

4) https://github.com/karpathy/convnetjs/blob/master/src/convnet_layers_dropout.js#L36 : for(var i=0;i<N;i++) { V2.w[i]*=this.drop_prob; } I do not understand the rationale: why multiply the V2.w[i] value by the drop probability, which is 0.5 by default (if not specified)?

5) https://github.com/karpathy/convnetjs/blob/master/src/convnet_layers_normalization.js#L76 :
var g = -aj*this.beta*Math.pow(S,this.beta-1)*this.alpha/this.n*2*aj;
var SB = Math.pow(S, this.beta);
if(j===i) g += SB;
g /= SB2;
g *= chain_grad;
V.add_grad(x,y,j,g);

I do not understand the rationale behind these computations. Would you please give me a hint?

6) https://github.com/karpathy/convnetjs/blob/master/src/convnet_trainers.js#L92 : var dx = - this.learning_rate * biasCorr1 / (Math.sqrt(biasCorr2) + this.eps); I found the paper "Adam: A Method for Stochastic Optimization" by Diederik P. Kingma and Jimmy Ba, but I didn't find the above formula there. The same for "nesterov". Would you please give me references for "adam" and "nesterov"?

7) https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js#L61

this.net_inputs = num_states * this.temporal_window + num_actions * this.temporal_window + num_states; Why add num_states again, after multiplying it by this.temporal_window?

8) https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js#L166 : for(var k=0;k<this.temporal_window;k++) { w = w.concat(this.state_window[n-1-k]); action1ofk[this.action_window[n-1-k]] = 1.0*this.num_states; w = w.concat(action1ofk); } I do not grasp the size of the window: why this.action_window[n-1-k]? Why does the window start from position [n-1-k] and not just from [n-k]?

9) https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js#L183 : this.epsilon = Math.min(1.0, Math.max(this.epsilon_min, 1.0-(this.age - this.learning_steps_burnin)/(this.learning_steps_total - this.learning_steps_burnin))); Can you please give me a reference (a paper, a book) for how epsilon is calculated?

10) https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js#L229 : e.state0 = this.net_window[n-2]; again, the window size is not clear to me: could you please explain why [n-2]?

11) As anticipated, my objective is to apply deep reinforcement learning to web crawling, creating a sort of "adaptive crawling": the crawling agent learns a promising crawling strategy from its own experience and from diverse features of the query keywords.

I still have to figure out how to devise the whole mechanism, but I would like to ask you what, in your experience, are the key aspects and elements to consider in order to use and properly adapt your deep reinforcement learning implementation to build an adaptive, "intelligent" web crawling engine.

Looking forward to your kind help (I'm eager to learn as much as possible from you and, possibly, from other experts like you). Marco
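For questions 1 and 2, a minimal sketch of the indexing may help. This is illustrative code, not the convnetjs source: a Vol of size sx × sy × depth is stored in one flat array, and ((this.sx * y) + x) * this.depth + d is the standard row-major, depth-interleaved address. It has nothing to do with the update rule x = x + step_size * x_derivative.

```js
// Minimal sketch (assumed helper, not the convnetjs source) of how an
// sx (width) x sy (height) x depth volume is flattened into one array.
function Vol(sx, sy, depth) {
  this.sx = sx; this.sy = sy; this.depth = depth;
  this.w = new Float64Array(sx * sy * depth); // n = sx*sy*depth weights
}
Vol.prototype.get = function(x, y, d) {
  // sx*y skips y full rows of pixels, +x steps into that row,
  // *depth converts pixels to array slots, +d picks the channel
  var ix = ((this.sx * y) + x) * this.depth + d;
  return this.w[ix];
};

// Question 2: yes — the filter/input dot product is the sum of
// filter.get(fx,fy,fd) * input.get(ox,oy,fd) over the filter's extent,
// which the source writes against the flat arrays as f.w[...] * V.w[...].
```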

karpathy commented 9 years ago
  1. those two x's are not the same and have nothing to do with each other
  2. yes
  3. it's a loop going backwards, I assume
  4. see the cs231n notes under Inverted Dropout (sketch below)
  5. that's the backward pass for a normalization layer: gradient computation
  6. that adam formula is some kind of simplified version of Adam that works similarly, I think. It isn't my code (sketch below)
  7, 8. can't remember
  9. no reference, it's just decayed slowly from 1 to a value of 0.1
  10. you'll have to step through these carefully. I can't remember
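On point 4: in convnetjs, drop_prob is the probability of dropping a unit, 0.5 by default, so with the default each unit survives training half the time and its expected train-time activation is half its raw value; multiplying by 0.5 at prediction time keeps the two regimes consistent. A minimal illustrative sketch of the two conventions, not the convnetjs source (here p is the probability of keeping a unit):

```js
// Classic dropout: randomly zero units at training time...
function dropoutTrain(w, p) {
  return w.map(function(x) { return Math.random() < p ? x : 0; });
}
// ...then scale at prediction time so expected activations match.
function dropoutPredict(w, p) {
  return w.map(function(x) { return x * p; });
}

// Inverted dropout (the cs231n variant): scale by 1/p during training
// instead, so prediction needs no special casing at all.
function invertedDropoutTrain(w, p) {
  return w.map(function(x) { return Math.random() < p ? x / p : 0; });
}
```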
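On point 6: the quoted line appears to match the parameter update from Algorithm 1 of Kingma & Ba, "Adam: A Method for Stochastic Optimization", if biasCorr1 and biasCorr2 are read as the bias-corrected moment estimates m̂ and v̂. For "nesterov", a standard reference is Sutskever et al., "On the importance of initialization and momentum in deep learning" (2013). A single-parameter sketch with illustrative names:

```js
// One Adam step for a scalar parameter x, following the paper.
// m and v are running moment estimates carried across calls; t is the
// 1-based step count. All names here are illustrative.
function adamStep(x, grad, m, v, t, lr, beta1, beta2, eps) {
  m = beta1 * m + (1 - beta1) * grad;           // update 1st moment
  v = beta2 * v + (1 - beta2) * grad * grad;    // update 2nd moment
  var biasCorr1 = m / (1 - Math.pow(beta1, t)); // bias-corrected m-hat
  var biasCorr2 = v / (1 - Math.pow(beta2, t)); // bias-corrected v-hat
  var dx = -lr * biasCorr1 / (Math.sqrt(biasCorr2) + eps); // the quoted line
  return { x: x + dx, m: m, v: v };
}
```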

It doesn't seem to me that DQN is a good match for the problem you're describing, especially given that you do not seem to have a strong background in applying neural networks in general. DQN is, for now, an advanced kind of algorithm and isn't a plug-and-play thing.

marcoippolito commented 9 years ago

Hi Andrej, thank you very much for kindly answering most of my questions.

Sorry for candidly asking you again, but I still have 2 doubts left from my previous long list.

1) https://github.com/karpathy/convnetjs/blob/master/src/convnet_vol.js#L51 "var ix=((this.sx * y)+x)*this.depth+d": since var Vol = function(sx, sy, depth, c) and var n = sx*sy*depth, where sx is the width and sy the height. Having confirmed my hypothesis that the equation has nothing to do with x = x + step_size * x_derivative, I still have doubts about why we add x and then d. That's because, to get the "infinitesimal" element, I would have written var ix = (this.sx * y) * this.depth, without adding x and then d, instead of "var ix=((this.sx * y)+x)*this.depth+d".

2) https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js#L61 : this.net_inputs = num_states * this.temporal_window + num_actions * this.temporal_window + num_states; I still do not understand why we add num_states again after multiplying it by this.temporal_window.

I thank you in advance for your kind further explanations; I hope to learn from you with a humble but confident approach. Marco
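A quick arithmetic check of why the +x and +d are needed, continuing the illustrative Vol sketch above: without them, every pixel in row y (and every channel of each pixel) would map to the same array slot.

```js
// Two distinct elements of a 4x4x3 volume (sx=4, depth=3):
var ixA = ((4 * 1) + 0) * 3 + 0; // (x=0, y=1, d=0) -> slot 12
var ixB = ((4 * 1) + 2) * 3 + 1; // (x=2, y=1, d=1) -> slot 19
// Dropping "+x" and "+d" would send both to (4*1)*3 = 12 — a collision.
```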

soswow commented 7 years ago

I see this thread is quite old, but who knows, maybe it will be useful to someone. Read "Playing Atari with Deep Reinforcement Learning" first; deepqlearn.js is basically an implementation of what is written there. You might then better understand what's going on with 2) in particular. Here is my short explanation. You have three things: the state of your system, the action taken upon some state, and a NN with an input. The input to the NN is a combination of the current state (the one you want to find an action for) plus the sequence of previous states and the actions that were taken. In this context, this sequence of previous states and actions is called temporal memory, or the temporal window. So your network input is temporal_window copies of the pair (state + action), plus the current state.
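A quick worked example of that arithmetic, with made-up sizes (these are illustrative numbers, not deepqlearn.js defaults):

```js
// Each remembered time step contributes one state vector plus a 1-of-k
// action encoding; the current (actionless) state is appended at the end.
var num_states = 3, num_actions = 4, temporal_window = 2;
var net_inputs = num_states * temporal_window
               + num_actions * temporal_window
               + num_states;
// = 3*2 + 4*2 + 3 = 17 network inputs
```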