Open huangzy-55 opened 1 year ago
I am trying to train DisCo, but it uses too much CUDA memory, especially when I increase the number of positive and negative samples (N and K). Reducing BATCHSIZE and distributing the training do not seem to help.