openclimatefix / skillful_nowcasting

Implementation of DeepMind's Deep Generative Model of Radar (DGMR) https://arxiv.org/abs/2104.00954
MIT License

Training on other dataset + Error on using run.py #33

Open ZHANGZ1YUE opened 2 years ago

ZHANGZ1YUE commented 2 years ago

Describe the bug Hi! Thank you very much for providing this implementation of DGMR. The model and blocks look very organized and straightforward! However, I have run into some issues running your code, mostly because I find the logic of run.py hard to follow (most likely because I do not understand what the dataset looks like).

1: Could you please explain a little about how you preprocess the dataset? I hope to run the model on my own dataset, so I need to prepare it in a way that matches your preprocessing. (By the way, if I want to visualize any of the rainfall data frames, what should I do?)

2: I have encountered an error when using run.py. The problem is exactly the same as in the following issue (https://github.com/openclimatefix/skillful_nowcasting/issues/32#issue-1312789322). I have changed the number of GPUs to 1, and the problem remains. I could not find a solution in the previous issue, as the conversation there is a bit confusing. Do I have to manually download something from the GCP bucket to my machine? If so, what should I download and how should I use it?

To Reproduce python run.py

Expected behavior The same error as in https://github.com/openclimatefix/skillful_nowcasting/issues/32#issue-1312789322 is raised.

Thank you again for your great work.

jacobbieker commented 2 years ago

Hi,

  1. It's preprocessed the same way as in the DGMR code and paper: the preprocessing is copied from their open-sourced code, with the only changes being those needed to turn it into PyTorch format. So I would refer to the paper for that.
  2. What is the error exactly? If it's the first error in that issue, it seems that the dataset script cannot download the data from GCP for some reason. The original reporter got past it by moving to a GPU machine, though it is not clear why that helped. But if you are planning on using the model with your own dataset, then you can ignore this: just swap out the dataset loader in run.py with your own dataset loader and the error should go away.
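For readers following the suggestion to swap in their own dataset loader, here is a minimal sketch of what such a loader could look like. It is not taken from the repository: the class name `RadarWindowDataset` is hypothetical, it assumes your radar data is already an array of shape (time, height, width), and the defaults of 4 context frames and 18 target frames follow the DGMR paper. The (time, channel, height, width) output layout is also an assumption, so check it against what the model in run.py expects.

```python
import numpy as np


class RadarWindowDataset:
    """Map-style dataset that slices a radar sequence into (context, target) pairs.

    Hypothetical helper, not part of the repository. Assumes `frames` has shape
    (time, height, width) and returns arrays shaped (time, channel, height, width).
    """

    def __init__(self, frames, num_input_frames=4, num_target_frames=18):
        frames = np.asarray(frames, dtype=np.float32)
        if frames.ndim != 3:
            raise ValueError("expected frames with shape (time, height, width)")
        self.frames = frames
        self.num_input_frames = num_input_frames
        self.num_target_frames = num_target_frames
        self.window = num_input_frames + num_target_frames
        if frames.shape[0] < self.window:
            raise ValueError("sequence is shorter than one full window")

    def __len__(self):
        # One sample per possible start index of a full window.
        return self.frames.shape[0] - self.window + 1

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        clip = self.frames[index : index + self.window]
        # Add a singleton channel axis: (time, 1, height, width).
        clip = clip[:, np.newaxis, :, :]
        context = clip[: self.num_input_frames]
        target = clip[self.num_input_frames :]
        return context, target
```

An instance of this class can be passed to `torch.utils.data.DataLoader`, which accepts any object implementing `__len__` and `__getitem__`; the default collate function converts the NumPy arrays to tensors and adds the batch dimension.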

ZHANGZ1YUE commented 2 years ago

Thank you for the quick response!

1: Understood, I will look into their code again.

2: It is indeed the first error in that issue.

Before training on my own dataset, I would like to get some results with the dataset provided in the original code. I am using an NVIDIA GPU machine with CUDA available, but the error persists. I did not change anything (except the number of GPUs, from 6 to 1) and ran python run.py directly.

jacobbieker commented 2 years ago

Ah okay. I probably won't have much time for the next month or so to add this, but if you are familiar with HuggingFace datasets, I've converted and uploaded the validation and test sets of the full dataset here: https://huggingface.co/datasets/openclimatefix/nimrod-uk-1km-validation and https://huggingface.co/datasets/openclimatefix/nimrod-uk-1km-test. The training set is a lot larger, so it has been taking a lot longer. But you should be able to use those, and possibly train on the validation set, to see how well it works for you.
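As a rough sketch of how those uploads might be opened with the HuggingFace `datasets` library: the helper names `open_streaming` and `peek` are hypothetical, and the split names and row fields are not stated in this thread, so the code avoids guessing them. Depending on your `datasets` version and how the repositories are laid out, a configuration name or extra arguments may be required.

```python
from itertools import islice

# Repository IDs taken from the URLs in the comment above.
VALIDATION_REPO = "openclimatefix/nimrod-uk-1km-validation"
TEST_REPO = "openclimatefix/nimrod-uk-1km-test"


def open_streaming(repo_id):
    """Open a Hub dataset lazily instead of downloading it in full first."""
    # Imported here so the rest of the module works without `datasets` installed.
    from datasets import load_dataset

    # With no `split` argument this returns a dict-like object keyed by split name.
    return load_dataset(repo_id, streaming=True)


def peek(rows, n=2):
    """Return the first `n` items of any iterable, e.g. one streaming split."""
    return list(islice(rows, n))


# Example usage (requires network access and the `datasets` package):
# splits = open_streaming(VALIDATION_REPO)
# print(list(splits.keys()))              # discover the available split names
# first_split = next(iter(splits.values()))
# print(peek(first_split, 1)[0].keys())   # discover the fields in each row
```

Inspecting the split names and row fields first makes it easier to write a loader that matches what the model code expects, rather than assuming a layout.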

peterdudfield commented 2 years ago

@all-contributors please add @jacobbieker for code

allcontributors[bot] commented 2 years ago

@peterdudfield

I've put up a pull request to add @jacobbieker! :tada:

peterdudfield commented 2 years ago

@all-contributors please add @ZHANGZ1YUE for bug

allcontributors[bot] commented 2 years ago

@peterdudfield

I've put up a pull request to add @ZHANGZ1YUE! :tada:

ZHANGZ1YUE commented 2 years ago

Thank you very much for the new information about the dataset! I will check it out later when I have time; I'm currently building my own data class to use with your model code. (Just to say, the code for the model structure is brilliant!)