vanduc103 / air_prediction

Air Pollution Interpolation and Prediction

Inquiry about source code air_prediction/seq2seq_forecast.py #1

Closed. WonmoKoo closed this issue 2 years ago.

WonmoKoo commented 2 years ago

It seems to me that this code implements the model from the article "Spatiotemporal Deep Learning Model for Citywide Air Pollution Interpolation (2020, BigComp)" (if not, please tell me).

I have questions about the code starting from line 162.

  1. In lines 164-165, you used the "outputs" of the encoder (line 136) as an input to the decoder (input_decov in line 165). However, as I understand it, in the original ConvLSTM paper (Shi et al., 2015) the "encoding-forecasting structure" was similar to the "unconditional future predictor model" of Srivastava et al. (2015), so there is no input to the decoder (forecasting network).

Q1. I wonder if your encoding-forecasting structure is different from that of the original paper, or if there is anything I misunderstood.

2. In line 168, you used only the last hidden state of decov_cell3 to predict air pollution at future time stamps. To generate the "output" (line 190), you employed a fully connected layer, a dropout layer, and an output layer. The sizes of the three layers are 8192 (grid_size * out_channel[2]), 1000 (fc_size), and 24576 (output_size = grid_size * pred_timesteps).

Q2. In your paper, you mentioned the following:

"The output of the forecasting network is then fed into a 1x1 convolution layer to produce the final output. 1x1 convolution is called a “feature pooling” technique where it allows to sum pooling the features across the depth channel while still keeps the spatial characteristic of the feature map."

Then, when did you apply the 1x1 convolution layer in this code?

It would be very helpful if you could reply.

Thanks

[References]

Shi, Xingjian, et al. "Convolutional LSTM network: A machine learning approach for precipitation nowcasting." Advances in Neural Information Processing Systems. 2015.

Srivastava, Nitish, Elman Mansimov, and Ruslan Salakhutdinov. "Unsupervised learning of video representations using LSTMs." International Conference on Machine Learning. PMLR, 2015.

vanduc103 commented 2 years ago

Hello, thank you for your interest in my paper and sorry for my delayed response.

For Q1: As you can revisit in the ConvLSTM paper (Shi et al.), "The encoding LSTM compresses the whole input sequence into a hidden state tensor and the forecasting LSTM unfolds this hidden state to give the final prediction", so we still need an input to the decoder network (or forecasting network). In my case, I used the output of the encoder network and reshaped it to feed as input to the decoder network (see the sketch after the code below).

For Q2: This is how I apply the 1x1 conv to the output of the decoder network:

# 1x1 convolutional layer
# Reshape the last decoder state into an NHWC feature map.
conv_input = tf.reshape(states[-1], [-1, image_size, image_size, out_channel[0]])
# 1x1 kernel: maps out_channel[0] input channels to pred_timesteps output channels.
W_output = tf.get_variable(name='W_output', shape=[1, 1, out_channel[0], pred_timesteps],
                           initializer=tf.contrib.layers.xavier_initializer())
b_output = tf.Variable(tf.zeros(pred_timesteps))
# 1x1 convolution ("feature pooling" across the depth channel) with sigmoid output.
output = tf.nn.sigmoid(tf.nn.conv2d(conv_input, W_output, strides=[1,1,1,1], padding='SAME') + b_output)
# Flatten to [batch, output_size] for the loss computation.
output = tf.reshape(output, [-1, output_size])
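For Q1, the reshape of the encoder output looks roughly like this (a minimal sketch; dec_timesteps is an illustrative name, and the exact tensors are defined around lines 136 and 162-165 of seq2seq_forecast.py):

# Sketch: feed the encoder's output sequence to the decoder instead of no input.
# dec_timesteps is an illustrative name, not the exact one in the repo.
input_decov = tf.reshape(outputs, [-1, dec_timesteps, image_size, image_size, out_channel[0]])
# input_decov is then unrolled through the decoder ConvLSTM cells.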

Best!

WonmoKoo commented 2 years ago

Thank you for your kind reply!

However, I would like a more specific answer to Q2.

I cannot find the 1x1 convolutional layer that you mentioned in your uploaded code. Instead, in the code, a fully connected layer, a dropout layer, and an output layer (fully connected layer) were applied to the output of the decoder network (see lines 167-188 in seq2seq_forecast.py; a rough sketch is below).
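To be concrete, the structure I see there is roughly the following (my paraphrase with illustrative names and an assumed ReLU activation, not the exact repo code):

# My reading of lines 167-188 (variable names and activation are my assumptions):
flat = tf.reshape(states[-1], [-1, 8192])               # 8192 = grid_size * out_channel[2]
W_fc = tf.get_variable('W_fc', shape=[8192, 1000])      # fc_size = 1000
fc = tf.nn.relu(tf.matmul(flat, W_fc))                  # fully connected layer
fc_drop = tf.nn.dropout(fc, keep_prob=0.5)              # dropout layer
W_out = tf.get_variable('W_out', shape=[1000, 24576])   # 24576 = grid_size * pred_timesteps
output = tf.matmul(fc_drop, W_out)                      # output layer (fully connected)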

Q1. Please check if the uploaded code is the final version.

Besides, on page 59 of your paper, you said: "If not explicitly stated, all experiments in this paper use the learning rate of 0.001, the batch size 128, training steps 200, L2 regularization with beta value 0.01 and the dropout ratio 0.5".

Q2. I would like to ask how the dropout layer was applied when the 1x1 convolution layer was applied.

Thank you!

vanduc103 commented 2 years ago

Hello. For the 1x1 conv layer, you can check my code piece again:

W_output = tf.get_variable(name='W_output', shape=[1, 1, out_channel[0], pred_timesteps], ...) => this is for the 1x1 kernel

tf.nn.conv2d(conv_input, W_output, strides=[1,1,1,1], padding='SAME') => this is how the 1x1 conv was applied

(P/S: I have checked the code again; it only applied the 1x1 conv layer in the interpolation part, while the forecasting part still used the fully connected layer. Thank you for pointing that out! I will update the final code later, but you can use my code above to implement yours.)

For the dropout layer, you can check the file 'seq2seq_forecast_all.py'. I applied the dropout in the output layer. Note that I did not use dropout in all layers, but where I did use it, I used a dropout ratio of 0.5. That is what I meant in my paper; a rough sketch is below. Thank you!
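For illustration, combining dropout with the 1x1 conv could look roughly like this (a minimal sketch with the TF1 API; where exactly the dropout sits relative to the conv is an assumption here, the actual placement is in seq2seq_forecast_all.py):

# Sketch: dropout at the output layer, applied before the 1x1 conv.
# keep_prob=0.5 matches the dropout ratio 0.5 from the paper.
conv_input = tf.nn.dropout(conv_input, keep_prob=0.5)
output = tf.nn.sigmoid(tf.nn.conv2d(conv_input, W_output, strides=[1,1,1,1], padding='SAME') + b_output)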

WonmoKoo commented 2 years ago

Thank you!