HPCE / hpce-2016-cw5

0 stars 2 forks source link

Seed in ising_spin puzzle #20

Closed SantiagoRubioDev closed 7 years ago

SantiagoRubioDev commented 7 years ago

The seed in computed recursively throughout the code which makes it inconvenient for parallelism. I tried finding a function that could give me the value of the seed at any iteration. F(x+1)=aF(x)+b with F(0)=c ==> F(x)= (pow(a,x)((a-1)*c+b)-b)/(a-1) the problem is that I cannot use integers anymore because we have a exponential and a division. I used double instead for each variable a,b and c and then cast everything to uinit32_t. My problem is that when I cast a very big double (>10^19) to uint32_t I get 0. I guess that is due to how doubles are implemented in IEEE: image the & operand does not work on double, so I haven't found a way to select specific bits. How would I do to explicitly only take the bits that interests me to make a 32 bit number? Am I on the right track or am I trying to do something that cannot work? Should I abandon this idea and try to find another way to have parallelism?

m8pple commented 7 years ago

You're thinking in a very promising direction, and applying existing maths to the problem is a good way of getting "jump ahead" like we had with the FFT.

However, you're also thinking too much in terms of the reals (very much an EEE view :), and unfortunately doubles won't behave enough like reals, even if you can get out the bits.

Notice that the recursion is:

x_{i+1} = x_i * a + b  mod  2^32

so it is not over the reals, it is over integers modulo 32. The problem is then how to do 1/(a-1) modulo 2^32. It can be done, but might be more complex if you haven't done modulo rings (which I don't expect you to have).

Instead, you might want to think of it in matrix form. If you were to think in terms of a column vector:

| x_{i+1} |    =    A  | x_i |
|    1    |            |  1  |

What matrix would allow you to get from one vector to the next. Once you have that matrix, how can you jump n steps in less than n operations?

narobertson42 commented 7 years ago

@m8pple Would I be getting warmer and colder for matrix solution if I was looking at https://en.wikipedia.org/wiki/Exponentiation_by_squaring? #logn

m8pple commented 7 years ago

That does indeed look appropriate.

The difficulty is then balancing the cost of a slow jump over a long distance, versus the very fast but sequential single-step recurrence. Same problem as in FFT - you need chunks big enough to amortise the cost of the chunking, but not so big that there is no parallelism.

I should note that this isn't the only solution, just in case anyone sees this and thinks this is the answer. There are other interesting SW+GPU hybrid approaches that doesn't need the fancy skipping maths, and are technically less parallel, but might be better in practise (we live in a world of finite n).