patrick-kidger / NeuralCDE

Code for "Neural Controlled Differential Equations for Irregular Time Series" (NeurIPS 2020 Spotlight)
Apache License 2.0

Low performance on various GPUs #10

Closed dungxibo123 closed 2 years ago

dungxibo123 commented 2 years ago

Dear Professor,

I'm a third-year student who is interested in your work. I'm starting with "Neural Controlled Differential Equations for Irregular Time Series".

I have a question and I hope you have time to respond to me.

I wonder what GPUs you used for your experiments and how many days you had to wait to get the results in the paper. I tried to run your code in Python with some updates for compatibility with newer versions of the libraries, since after a few days I found that some of the methods used to preprocess the UEA datasets had been deprecated. I started with the Speech Commands dataset and ran the code on three types of GPUs:

1. 1x 1050Ti 4GB (70s-90s / epoch)
2. 1x A100 40GB (120s-144s / epoch)
3. 1x 3090Ti 12GB (40s-50s / epoch)

With the above GPUs, I got quite bad results. To be specific: around 0.57-0.58 accuracy on the validation set (and also on the training set) the first time. The second time, I ran on the A100, and after 70 epochs the validation accuracy returned was 0.213. I just want to know which GPUs you used when you ran the code; maybe I went wrong somewhere.

Best regards, Tien Dung

patrick-kidger commented 2 years ago

Hi there. (In passing I'd note that I'm not a professor!)

So the GPUs we used were some mid-level ones -- to quote from the paper, two GeForce RTX 2080 Ti and two Quadro GP100. I doubt that the GPU choice is responsible for the poor performance, though.

On the topic of model performance -- I am (obviously) quite surprised by the poor results you seem to be getting. IIRC with Speech Commands you should get to about 85% accuracy after just a few epochs, and then the rest of the training time slowly improves things a little further.

Are you definitely running the neural CDE models, and not one of the benchmark-for-comparison RNN models? The RNN models were all flaky on this dataset: sometimes they would produce excellent results, sometimes they would produce awful results, and it differed from training run to training run. That aside, it's also plausible that a change in software library somewhere has quietly broken something.

As a first place to start, I'd recommend trying the Speech Commands example from this repository. This was a follow-up paper that happened later, with a new codebase. (And substantially tidier code.) Getting some more data might help diagnose the issue.

dungxibo123 commented 2 years ago

Many thanks for your response.

I will try the new code you recommended. To answer the questions above: I removed all the other models and ran only the NeuralCDE model. Since datasets/speech_commands.py does not contain any deprecated methods, I had not changed anything there, but I think some small change elsewhere has broken something.

Again, thanks for your advice.

dungxibo123 commented 2 years ago

Apologies for the noise. In case someone else runs into the same mistake, I will leave some notes for them.

Please make sure that torchaudio.load is called with normalize=False (the default is True); I had dropped this argument when processing the data. Also, if you want to train with fewer classes, please make sure the X and Y variables are generated with the same size as batch_index.
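For context, the `normalize` flag controls sample scaling: with `normalize=True` (the default), `torchaudio.load` decodes to floats in [-1.0, 1.0], while `normalize=False` keeps the raw integer PCM values, so preprocessing tuned for one scale silently breaks on the other. A minimal sketch of the scaling involved (the helper `normalize_pcm` is illustrative, not part of torchaudio; no torchaudio install needed):

```python
def normalize_pcm(samples, bit_depth=16):
    """Scale raw signed-integer PCM samples to floats in [-1.0, 1.0],
    mimicking what torchaudio.load(..., normalize=True) produces."""
    scale = float(2 ** (bit_depth - 1))  # 32768 for 16-bit audio
    return [s / scale for s in samples]

raw = [0, 16384, -32768, 32767]   # what normalize=False returns for int16 WAV
print(normalize_pcm(raw))         # [0.0, 0.5, -1.0, 0.999969482421875]
```

With `normalize=True` the data is roughly 32768x smaller in magnitude, which is consistent with a model training on the wrong scale and plateauing at low accuracy.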

Thanks.