The first kernel launch of fft_conv_fwd takes an abnormally long time (about 100 seconds).
After the first kernel launch it works fine, so it's not much of an issue for training, but it makes debugging very cumbersome.
Could it be a problem with my ld options?