gyla1993 / LightNet

LightNet: A Dual Spatiotemporal Encoder Network Model for Lightning Prediction (KDD 2019)

Datasets #1

Open Asishinuzuka opened 4 years ago

Asishinuzuka commented 4 years ago

Could you please tell me where to get the data to run your project? Thank you.

gyla1993 commented 4 years ago

> Could you please tell me where to get the data to run your project? Thank you.

Sorry for the very late reply. Unfortunately, the dataset is temporarily unavailable due to a confidentiality agreement with the government, but we are actively communicating with them in the hope of releasing a publicly available version. Once that succeeds, we will upload it to this repository. Thanks for your interest in our work.

YMui9527 commented 3 weeks ago

Hello, how did you preprocess the dataset?

gyla1993 commented 3 weeks ago

> Hello, how did you preprocess the dataset?

For the lightning data, we processed it into a binary 0/1 grid, where each grid cell indicates whether a lightning event occurred within a specific time period. For the WRF data, we applied Z-score normalization to different variables across different height layers.
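The lightning rasterization step described above can be sketched as follows. This is a minimal illustration, not the authors' code; the grid origin, cell size, and 159x159 grid shape are assumptions chosen for the example:

```python
import numpy as np

def rasterize_lightning(strikes, lat0, lon0, cell_deg, grid_h, grid_w):
    """Turn a list of (lat, lon) lightning strikes observed within one time
    window into a binary 0/1 occurrence grid."""
    grid = np.zeros((grid_h, grid_w), dtype=np.uint8)
    for lat, lon in strikes:
        row = int(np.floor((lat - lat0) / cell_deg))
        col = int(np.floor((lon - lon0) / cell_deg))
        if 0 <= row < grid_h and 0 <= col < grid_w:
            grid[row, col] = 1  # 1 = at least one strike in this cell
    return grid

# Two strikes falling in the same cell still yield a single 1.
g = rasterize_lightning([(30.01, 110.02), (30.015, 110.025)],
                        lat0=30.0, lon0=110.0, cell_deg=0.05,
                        grid_h=159, grid_w=159)
print(g.sum())  # prints 1
```

Note the grid records occurrence only, not strike counts, which matches the binary 0/1 description above.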

YMui9527 commented 3 weeks ago

Thank you very much for your answer. Could you explain the WRF data processing in more detail? I tried to use LightNet+ with 23 years of data, but the evaluation metric came out as 0. While looking for similar models on GitHub I found yours, and I have also read your article, which feels very relevant to my current work.

gyla1993 commented 3 weeks ago

> Thank you very much for your answer. Could you explain the WRF data processing in more detail? I tried to use LightNet+ with 23 years of data, but the evaluation metric came out as 0.

Thank you for your interest in our work. Assuming the WRF data consists of h height levels, with s variables at each level, we first compute the mean and standard deviation for each variable at each height level from the WRF training data. This results in h * s mean and standard deviation values. Before feeding the WRF data into the model, we normalize each variable at every height level by subtracting its corresponding mean and dividing by its standard deviation.

Additionally, it is important to note that you may need to filter out cases where lightning events are too sparse. For example, you might consider removing samples where fewer than a certain number of lightning events occur within a 6-hour window. We applied similar steps in both LightNet and LightNet+. I hope this helps with your work!
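The normalization and filtering steps above can be sketched like this. The array layout `(N, h, s, H, W)` and the sparsity threshold of 10 events are illustrative assumptions, not the values used in the paper; the h * s statistics are computed on the training split only:

```python
import numpy as np

def fit_zscore_stats(train_wrf):
    """train_wrf: (N, h, s, H, W) -- N training samples, h height levels,
    s variables per level, H x W horizontal grid.
    Returns per-(level, variable) mean and std: h * s values each."""
    mean = train_wrf.mean(axis=(0, 3, 4), keepdims=True)  # shape (1, h, s, 1, 1)
    std = train_wrf.std(axis=(0, 3, 4), keepdims=True)
    std[std == 0] = 1.0  # guard against constant fields
    return mean, std

def normalize(wrf, mean, std):
    # Broadcasting applies each (level, variable) statistic at every grid point.
    return (wrf - mean) / std

def keep_sample(lightning_grids, min_events=10):
    """lightning_grids: (T, H, W) binary 0/1 grids covering one 6-hour window.
    Drop samples whose window has fewer than `min_events` active cells
    (the threshold 10 is an assumed value for illustration)."""
    return int(lightning_grids.sum()) >= min_events
```

Fitting the statistics once on the training split and reusing them for validation and test data avoids leaking test-set information into the normalization.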

YMui9527 commented 3 weeks ago

Thank you for your answer. I created a dataset with similar dimensions based on the data you provided in the LightnetPlus project, but I still have a few questions about the composition of the WRF data.

1. Dimensionality: in the WRF data processed by your open-source code, the arrays have shape (18, 9, 159, 159). My understanding is that the last two dimensions correspond to the horizontal grid points, but I am not sure about the first two. Do they represent time steps and vertical levels, respectively?
2. File naming: for a file such as QGRAUP_ave3.npy, does 'ave3' refer to a 3-hour average, or does it have some other specific meaning?

If it's convenient, could you share some code snippets for handling the WRF data? I would like to deepen my understanding of the logic and implementation by reading the code. Thank you very much for your help!
