Open tomermerhav opened 4 years ago
I think that the fix should be:
float weightIJ = fWeights[threadIdx.y * BLOCKDIM_X + threadIdx.x];
because each block has its own weight. It makes no sense to increment a counter into the fWeights array while cycling through every NLM window.
When NLM_WINDOW_RADIUS is increased, the idx counter, which indexes the fWeights array, goes out of the array's range. See the comments marked with //**.
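To make the overrun concrete, here is a rough host-side sketch of the index arithmetic. It is not taken from the kernel: the 8x8 block size, the assumption that fWeights holds one entry per thread, and the assumption that the window loops run from -radius to +radius inclusive are all mine, and the helper names weightCount, lastIdx, idxOverruns and threadWeightIndex are hypothetical.

```cpp
// ASSUMPTION: illustrative block dimensions, not read from the sample's headers.
constexpr int BLOCKDIM_X = 8;
constexpr int BLOCKDIM_Y = 8;

// ASSUMPTION: fWeights holds one weight per thread of the block.
constexpr int weightCount() { return BLOCKDIM_X * BLOCKDIM_Y; }

// Largest value a counter reaches if it is incremented once per window pixel,
// ASSUMING the window loops run from -radius to +radius inclusive.
constexpr int lastIdx(int radius) {
    return (2 * radius + 1) * (2 * radius + 1) - 1;
}

// True when that counter would index past the end of fWeights.
constexpr bool idxOverruns(int radius) {
    return lastIdx(radius) >= weightCount();
}

// Index proposed in this issue: it depends only on the thread's position
// in the block, so it cannot grow with the window radius.
constexpr int threadWeightIndex(int tx, int ty) {
    return ty * BLOCKDIM_X + tx;
}

// Under these assumptions a radius of 3 still fits, a radius of 4 does not.
static_assert(!idxOverruns(3), "7x7 window: last idx is 48, within 64 entries");
static_assert(idxOverruns(4), "9x9 window: last idx is 80, past 64 entries");
static_assert(threadWeightIndex(BLOCKDIM_X - 1, BLOCKDIM_Y - 1) < weightCount(),
              "per-thread index stays in range for every thread");
```

If the real loop bounds or block size differ, the threshold radius changes, but the point stays the same: a counter tied to the window size can outgrow an array sized by the block, while the per-thread index cannot.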