marcoippolito opened this issue 9 years ago
I'm afraid not, beyond what's in the code. Or you can ask here.
EDIT: there are also some docs on the main convnetjs.com page.
Ok. I'm going to sum up all my remaining doubts in a list (hopefully not too long). (I've been reading https://github.com/cs231n/cs231n.github.io/blob/master/convolutional-networks.md and some doubts were already resolved there.) Thanks for the support, Andrej. Marco
Hi Andrej, my objective is to use the Deep Reinforcement Learning approach and algorithms for intelligent web crawling, as briefly described in the paper "Learning to Surface Deep Web Content" - Zhaohui Wu, Lu Jiang, Qinghua Zheng, Jun Liu: www.aaai.org/ocs/index.php/AAAI/AAAI10/paper/viewFile/1652/2326
Here is a starting list (please forgive me if I have a few more doubts to clarify in the future) of questions I have, so I can properly understand the inner workings of your great work and adapt it to what I have to do:
1) https://github.com/karpathy/convnetjs/blob/master/src/convnet_vol.js#L51
"var ix = ((this.sx * y) + x) * this.depth + d":
since var Vol = function(sx, sy, depth, c) and var n = sx * sy * depth
sx: dimension of width
sy: dimension of height
I do not understand why x is added, and then d (after multiplying by this.depth). Is it related to x = x + step_size * x_derivative ( http://karpathy.github.io/neuralnets/ )?
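To make my question concrete, here is how I currently read that indexing, as a small standalone sketch of my own (makeVol and index are my names, not the library's):

```javascript
// My own sketch, not convnetjs code: flattening a (x, y, d) coordinate
// into a flat array of length sx*sy*depth, with depth varying fastest.
function makeVol(sx, sy, depth) {
  return { sx: sx, sy: sy, depth: depth, w: new Float64Array(sx * sy * depth) };
}
function index(vol, x, y, d) {
  // the same formula as in convnet_vol.js
  return ((vol.sx * y) + x) * vol.depth + d;
}

var v = makeVol(4, 3, 2); // width 4, height 3, depth 2
// index(v, 0, 0, 0) is 0, index(v, 0, 0, 1) is 1,
// index(v, 1, 0, 0) is 2, index(v, 0, 1, 0) is 8
```

If this reading is right, moving one step along d changes the flat index by 1, one step along x by depth, and one step along y by sx * depth.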
2) https://github.com/karpathy/convnetjs/blob/master/src/convnet_layers_dotproducts.js#L70 : a += f.w[((f.sx * fy)+fx)*f.depth+fd] * V.w[((V_sx * oy)+ox)*V.depth+fd] : I guess the dot product between the entries of the filter and the input is done through f.w[...] * V.w[...]: am I right, or completely off track?
3) https://github.com/karpathy/convnetjs/blob/master/src/convnet_vol_util.js#L38 : I do not understand W.get(W.sx - x - 1, y, d): could you please give me some hints about the -x - 1?
4) https://github.com/karpathy/convnetjs/blob/master/src/convnet_layers_dropout.js#L36 : for(var i=0;i<N;i++) { V2.w[i]*=this.drop_prob; } I do not understand the rationale: why multiply the V2.w[i] value by the drop probability, which is 0.5 by default (if not specified)?
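For 3), to show what I mean, here is my own small sketch of what I think W.sx - x - 1 does (flipRow is my name, not the library's): reading from position sx - x - 1 while writing to position x mirrors a row horizontally.

```javascript
// My own sketch, not convnetjs code: horizontally flip one row of width sx.
function flipRow(row) {
  var sx = row.length;
  var out = new Array(sx);
  for (var x = 0; x < sx; x++) {
    // x = 0 reads the last entry, x = sx - 1 reads the first
    out[x] = row[sx - x - 1];
  }
  return out;
}
// flipRow([1, 2, 3, 4]) gives [4, 3, 2, 1]
```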
5) https://github.com/karpathy/convnetjs/blob/master/src/convnet_layers_normalization.js#L76 : var g = -aj*this.beta*Math.pow(S,this.beta-1)*this.alpha/this.n*2*aj; var SB = Math.pow(S, this.beta); if(j===i) g+= SB; g /= SB2; g *= chain_grad; V.add_grad(x,y,j,g);
I do not understand the rationale behind these computations. Would you please give me a hint?
6) https://github.com/karpathy/convnetjs/blob/master/src/convnet_trainers.js#L92 : var dx = - this.learning_rate * biasCorr1 / (Math.sqrt(biasCorr2) + this.eps); I found the paper "Adam: A Method for Stochastic Optimization" by Diederik P. Kingma and Jimmy Ba, but I didn't find the above formula in it. The same for "nesterov". Would you please give me references for "adam" and "nesterov"?
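For context, here is my reading of that trainer line as a standalone sketch (adamStep and the state object are my own names; the paper writes the update as theta <- theta - alpha * m_hat / (sqrt(v_hat) + eps), with m_hat and v_hat the bias-corrected first and second moment estimates):

```javascript
// My own sketch of one Adam step for a single scalar parameter,
// following the bias-corrected update from Kingma & Ba.
function adamStep(state, grad) {
  state.t += 1;
  state.m = state.beta1 * state.m + (1 - state.beta1) * grad;        // 1st moment
  state.v = state.beta2 * state.v + (1 - state.beta2) * grad * grad; // 2nd moment
  var biasCorr1 = state.m / (1 - Math.pow(state.beta1, state.t));    // m-hat
  var biasCorr2 = state.v / (1 - Math.pow(state.beta2, state.t));    // v-hat
  var dx = -state.lr * biasCorr1 / (Math.sqrt(biasCorr2) + state.eps);
  state.x += dx;
  return state.x;
}

var s = { x: 1.0, m: 0, v: 0, t: 0, lr: 0.1, beta1: 0.9, beta2: 0.999, eps: 1e-8 };
adamStep(s, 2.0); // the first step moves x by roughly -lr, whatever the gradient scale
```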
7) https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js#L61
this.net_inputs = num_states * this.temporal_window + num_actions * this.temporal_window + num_states; Why is num_states added again, after multiplying it by this.temporal_window?
8) https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js#L166 : for(var k=0;k<this.temporal_window;k++) { w = w.concat(this.state_window[n-1-k]); action1ofk[this.action_window[n-1-k]] = 1.0*this.num_states; w = w.concat(action1ofk); } I do not grasp the size of the window: why this.action_window[n-1-k]? Why is the window indexed starting from the [n-1-k] position and not just from the [n-k] position?
9) https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js#L183 : this.epsilon = Math.min(1.0, Math.max(this.epsilon_min, 1.0-(this.age - this.learning_steps_burnin)/(this.learning_steps_total - this.learning_steps_burnin))); Can you please give me a reference (a paper or a book) where I can find how epsilon is calculated?
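To state the formula I am asking about in isolation (epsilonAt is my own name, not the library's), my understanding is that epsilon is annealed linearly from 1.0 down to epsilon_min between the burn-in step and the last learning step, clamped on both sides:

```javascript
// My own sketch of the epsilon schedule in deepqlearn.js, written as a
// pure function of the agent's age (number of steps taken so far).
function epsilonAt(age, epsMin, burnin, total) {
  var frac = 1.0 - (age - burnin) / (total - burnin);
  // before burnin, frac > 1, so epsilon is clamped to 1.0 (pure exploration);
  // after total steps, frac <= 0, so epsilon is clamped to epsMin.
  return Math.min(1.0, Math.max(epsMin, frac));
}
```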
10) https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js#L229 : e.state0 = this.net_window[n-2]; again, the window indexing is not clear to me: could you please explain why [n-2]?
11) As anticipated, my objective is to apply deep reinforcement learning to web crawling, creating a sort of "adaptive crawling": the crawling agent learns a promising crawling strategy from its own experience and from diverse features of the query keywords.
I still have to figure out how to devise the whole mechanism, but I would like to ask you what, in your experience, are the key aspects and elements to consider in order to properly adapt your deep reinforcement learning implementation into a successful adaptive "intelligent" web crawling engine.
Looking forward to your kind help (I'm eager to learn as much as possible from you and, possibly, from other experts like you). Marco
It doesn't seem to me that DQN is a good match for the problem you're describing, especially given that you do not seem to have a strong background in applying neural networks in general. DQN is, for now, an advanced kind of algorithm and isn't a plug-and-play thing.
Hi Andrej, thank you very much for kindly answering most of my questions.
Sorry for candidly asking you again, but I still have 2 doubts left from my previous long list. 1) https://github.com/karpathy/convnetjs/blob/master/src/convnet_vol.js#L51 "var ix = ((this.sx * y) + x) * this.depth + d": since var Vol = function(sx, sy, depth, c) and var n = sx * sy * depth (sx: dimension of width, sy: dimension of height), and having confirmed my hypothesis that the equation has nothing to do with x = x + step_size * x_derivative, I still have doubts about why x and then d are added. To get the "infinitesimal" element I would have written var ix = (this.sx * y) * this.depth, without adding x and then d, instead of "var ix = ((this.sx * y) + x) * this.depth + d".
2) https://github.com/karpathy/convnetjs/blob/master/build/deepqlearn.js#L61 this.net_inputs = num_states * this.temporal_window + num_actions * this.temporal_window + num_states; I still do not understand why num_states is added again, after multiplying it by this.temporal_window.
I thank you in advance for your kind further explanations; I want to learn from you with a humble but confident approach. Marco
I see this thread is quite old, but who knows, maybe it will be useful to someone. Read "Playing Atari with Deep Reinforcement Learning" first; deepqlearn.js is basically an implementation of what is written there. You might then understand better what's going on with 2) in particular. Here is my short explanation. You have three things: the state of your system, the action taken upon some state, and a NN with an input. The input to the NN is a combination of the current state (that you want to find an action for) plus the sequence of previous states and actions that have been taken. In this context this sequence of previous states and actions is called the temporal memory or temporal window. So, given that, your network input is temporal_window times a (state size + action size) pair, plus the current state.
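Concretely, the counting can be sketched like this (netInputs is just an illustrative name, not the library's):

```javascript
// Sketch of how deepqlearn.js sizes the Q-network input: temporal_window
// past (state, one-hot action) pairs, plus the current state at the end.
function netInputs(numStates, numActions, temporalWindow) {
  return numStates * temporalWindow   // past states
       + numActions * temporalWindow  // past actions, one-hot encoded
       + numStates;                   // the current state itself
}
// e.g. netInputs(3, 2, 1) is 3 + 2 + 3 = 8
```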
Hi Andrej, is there a thorough explanation of the inner workings of convnet.js & deepqlearn.js? I have some question marks to clarify.
Marco