MadanMl opened this issue 3 years ago
In the code, the convolution operation is also applied to future time steps during training. Example: for the second iteration (t = 1), the stack of three time steps (0, 1, 2) is used by the local context extractor (i.e. the convolutional block with the gating unit). But shouldn't the gated convolution be limited to the current and previous positions only (i.e. only applying the convolution on the stacked vector of steps 0 and 1)?
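To illustrate the point, here is a minimal sketch (assuming a PyTorch-style `Conv1d` with kernel size 3 and symmetric "same" padding, which may differ from the repo's exact code) of how the standard padding leaks the future step, and how left-only padding avoids it:

```python
import torch
import torch.nn as nn

# Toy sequence: batch=1, channel=1, T=5 time steps
x = torch.arange(5, dtype=torch.float32).view(1, 1, 5)

# Non-causal: kernel_size=3 with symmetric padding=1 keeps the length,
# but the output at step t depends on steps t-1, t AND t+1 (the future).
noncausal = nn.Conv1d(1, 1, kernel_size=3, padding=1, bias=False)
y_noncausal = noncausal(x)  # shape (1, 1, 5), uses future steps

# Causal alternative: pad only on the left by (kernel_size - 1),
# so the output at step t depends on steps t-2, t-1 and t only.
causal_pad = nn.ConstantPad1d((2, 0), 0.0)
causal = nn.Conv1d(1, 1, kernel_size=3, padding=0, bias=False)
y_causal = causal(causal_pad(x))  # shape (1, 1, 5), no future leakage
```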
Hi Madan (if that's your name)! Many thanks for bringing up this interesting point. I think it could be a potential problem for the functioning. Actually, when developing the model, I also tried taking only the previous states as the input for feature extraction; the difference was not so obvious. If we include time step 2 for t = 1, this can also work, because during the testing stage we can perform prediction up to tmax - 1 instead of tmax. I tried it this way because I was thinking it could better extract the local features. But of course you can simply modify this yourself as well :)
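If you want to try the causal variant yourself, a minimal sketch of a left-padded gated convolution could look like the following (a hypothetical GLU-style block in PyTorch; the class name, channel sizes, and layout are illustrative, not the repo's actual API):

```python
import torch
import torch.nn as nn

class CausalGatedConv1d(nn.Module):
    """Gated convolution restricted to the current and previous steps.

    Hypothetical sketch: names and sizes are illustrative, not the repo's code.
    """
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        # Left-pad only, so no future time step enters the receptive field.
        self.pad = nn.ConstantPad1d((kernel_size - 1, 0), 0.0)
        # Double the output channels: one half is content, the other the gate.
        self.conv = nn.Conv1d(channels, 2 * channels, kernel_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, time)
        a, b = self.conv(self.pad(x)).chunk(2, dim=1)
        return a * torch.sigmoid(b)  # gated linear unit

x = torch.randn(4, 14, 30)  # e.g. 14 sensor channels, 30 time steps
y = CausalGatedConv1d(14)(x)
assert y.shape == x.shape
```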
Yes, thank you Cheng for your reply. When you tried taking only the previous states (i.e. two blocks at a time), did you get better or worse results?
Could you share the hyperparameters with which we can reproduce the results mentioned in the paper (Remaining useful life estimation via transformer encoder enhanced by a gated convolutional unit. Journal of Intelligent Manufacturing, 1-10)?
Hi Madan! Please note that I'm not the author of the paper hahaha. I'm also trying to reproduce the experimental results covered in the paper. But anyway, to some extent, the current architecture is the best I have achieved so far (apart from some improvements of my own that I haven't made public). I guess if you are interested, the changes you mentioned can be simply made and tested by yourself as well.
btw, please feel free to collaborate on this work if you have any interest. Thank you!
Hi Mr. Cheng, ("Hi Madan! Please note that I'm not the author of the paper hahaha") :) then I will try to reproduce the results myself. ("btw, please feel free to collaborate on this work if you have any interest. Thank you!") Perfect, I will do that, and I will also close this issue. Thank you!
Hi Madan sir, were you able to reproduce the paper? If yes, can you share the hyperparameters? Thanks in advance, sir.