Closed kunliu916 closed 3 months ago
Doesn't version 2.16 default to CPU?
Training is extremely slow for me.
As far as I know, the tensorflow_gpu package no longer exists on its own. You can use the latest tensorflow version directly with a GPU if your hardware (NVIDIA GPU) and OS permit it. For AMD GPUs there is tensorflow-rocm, but I had too many issues with it in the past, so I haven't even tested with AMD. That said, I only run my code in the latest TensorFlow Docker image and haven't had many issues: training is not as lightning fast as for an RNN, but not particularly slow either. You can compare with the step times in the example notebook, which was probably run on an RTX 3070.
The TKAN layers definitely support GPU. You can run the nvidia-smi command in a terminal while a model fit is running to see the GPU being used.
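Alongside nvidia-smi, a quick check from Python can show whether TensorFlow itself sees a GPU. This is only a minimal sketch: `visible_gpus` is a hypothetical helper name, and it assumes nothing beyond the standard `tf.config.list_physical_devices` call.

```python
def visible_gpus():
    """Return the names of GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None  # TensorFlow is not installed in this environment
    # An empty list means TensorFlow will fall back to the CPU.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed")
elif gpus:
    print("GPUs visible to TensorFlow:", gpus)
else:
    print("No GPU visible; training will run on the CPU")
```

If the list comes back empty on a Linux machine with an NVIDIA GPU, the CUDA libraries may simply be missing; recent TensorFlow releases document a `tensorflow[and-cuda]` pip extra for that case.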
To give a point of comparison:

    model = Sequential([
        Input(shape=X_train.shape[1:]),
        TKAN(100, return_sequences=True),
        TKAN(100, return_sequences=False),
        Dense(units=y_train.shape[1], activation='linear')
    ])

takes 7 seconds per epoch, while

    model = Sequential([
        Input(shape=X_train.shape[1:]),
        LSTM(100, return_sequences=True),
        LSTM(100, return_sequences=False),
        Dense(units=y_train.shape[1], activation='linear')
    ])

takes 3 seconds on the exact same data.
However, one TKAN layer parameter has an impact on training time (and is what makes the example version slower to run): tkan_activations. A TKAN layer accepts any number of sub GRKAN layers, and increasing that number has a significant impact. For example:

    model = Sequential([
        Input(shape=X_train.shape[1:]),
        TKAN(100, tkan_activations=[{'grid_size': 3} for i in range(5)], return_sequences=True),
        TKAN(100, tkan_activations=[{'grid_size': 3} for i in range(5)], return_sequences=False),
        Dense(units=y_train.shape[1], activation='linear')
    ])

will take nearly 19 seconds, as there are 5 sub GRKAN layers (the default uses only one GRKAN).
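To reproduce this kind of per-epoch comparison on your own machine, a small timing helper may be enough. The sketch below is an assumption, not part of the library: `seconds_per_epoch` is a hypothetical name, and it simply times any callable that trains for one epoch.

```python
import time


def seconds_per_epoch(train_one_epoch, epochs=3):
    """Call train_one_epoch() `epochs` times and return each call's wall-clock duration."""
    durations = []
    for _ in range(epochs):
        start = time.perf_counter()
        train_one_epoch()
        durations.append(time.perf_counter() - start)
    return durations


# Hypothetical usage with an already compiled Keras model
# (model, X_train and y_train are assumed to exist):
# times = seconds_per_epoch(
#     lambda: model.fit(X_train, y_train, epochs=1, verbose=0)
# )
# print([round(t, 2) for t in times])
```

The first measurement usually includes one-off graph tracing overhead, so it is probably fairer to compare the later epochs when setting TKAN against LSTM or trying different tkan_activations settings.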
Does this not support GPU?