google / TensorNetwork

A library for easy and efficient manipulation of tensor networks.
Apache License 2.0

Obtaining Different results for the same code and data for different runs #888

Open sr33dhar opened 3 years ago

sr33dhar commented 3 years ago

Hey Team TN,

I am using the tensornetwork package to create an MPS-based quantum circuit simulator and to analyse the effect of truncated bond dimensions on the circuits I am studying. Everything seemed to be working fine, except that in many instances I get different results for different runs.

I cannot see where this randomness is coming from and would really appreciate any insight. I am using the numpy backend, making extensive use of SVD operations to truncate the bond dimension and of the contractors.greedy option to contract the network.

Unfortunately, I am not in a position to share the code or method. Thank you for your time and understanding! Your help would be highly appreciated!
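For readers following along: the bond-dimension truncation described above amounts to keeping only the largest singular values across a bipartition. A minimal pure-NumPy sketch of that step (not the actual simulator code; `truncate_bond` and `chi_max` are illustrative names):

```python
import numpy as np

def truncate_bond(theta, chi_max):
    """SVD a two-site matrix and keep at most chi_max singular values.

    Returns the truncated factors and the discarded weight, which
    quantifies the truncation error.
    """
    u, s, vh = np.linalg.svd(theta, full_matrices=False)
    keep = min(chi_max, len(s))
    err = float(np.sqrt(np.sum(s[keep:] ** 2)))  # norm of the discarded part
    return u[:, :keep], s[:keep], vh[:keep, :], err
```

Nothing in this step is random by itself, which is why the question of where the run-to-run differences enter is a fair one.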

mganahl commented 3 years ago

Hi @sr33dhar, it is unfortunately almost impossible to diagnose such a problem without a specific code example. How big are the differences you are seeing? Is there any random initialization in your code? Did you try setting the seed for all random initializations?
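For reference, a minimal sketch of what "setting the seed" means for NumPy-based code (generic, not TensorNetwork-specific):

```python
import numpy as np

# Seeding the generator makes every subsequent draw reproducible.
rng1 = np.random.default_rng(42)
rng2 = np.random.default_rng(42)

a = rng1.standard_normal((3, 3))
b = rng2.standard_normal((3, 3))
assert np.array_equal(a, b)  # identical seeds give identical draws

# Code using the legacy global state is seeded the same way:
np.random.seed(42)
```

If every source of randomness is seeded and results still differ between runs, the nondeterminism is coming from somewhere else.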

sr33dhar commented 3 years ago

Hi Martin, Thank you for your prompt response!

Let me try to find a code snippet that I can post publicly. Also, no, there are no random initialisations.

Meanwhile, do you have any insight into how "random" SVD operations are with the numpy backend? There have been multiple instances in the past where, during one run, the SVD does not converge, but running the same code again without any changes results in the SVD converging. In those cases, the results were still acceptable.

Thanks again Martin Cheers, Rishi

alewis commented 3 years ago

As Martin says, it's not possible to diagnose a bug without an example that produces it.

The SVD with the NumPy backend just calls the NumPy SVD code. This should be deterministic.
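To illustrate the determinism claim: calling `np.linalg.svd` twice on the same matrix within one process returns bitwise-identical results (on a fixed machine and BLAS build; this is a sanity-check sketch, not a proof):

```python
import numpy as np

m = np.random.default_rng(0).standard_normal((40, 40))

u1, s1, vh1 = np.linalg.svd(m)
u2, s2, vh2 = np.linalg.svd(m)

# Same input, same process: the factorizations match exactly.
assert np.array_equal(s1, s2)
assert np.array_equal(u1, u2)
assert np.array_equal(vh1, vh2)
```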

sr33dhar commented 3 years ago

Hi @alewis,

I am quite new to this, so could you please elaborate on the NumPy SVD code being deterministic? Also, there have been instances where I have had "SVD did not converge" errors from NumPy, but rerunning the same code with the same data sometimes resolves the issue. I found a similar thread here: https://stackoverflow.com/questions/63761366/numpy-linalg-linalgerror-svd-did-not-converge-in-linear-least-squares-on-first
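One workaround that comes up in threads like the linked one is to catch the convergence failure and retry with LAPACK's slower but more robust `gesvd` driver via SciPy. A sketch (`robust_svd` is a hypothetical helper, and the fallback assumes SciPy is available):

```python
import numpy as np

def robust_svd(m):
    """Try NumPy's default (gesdd-based) SVD first; on a convergence
    failure, fall back to LAPACK's slower but more robust gesvd driver."""
    try:
        return np.linalg.svd(m, full_matrices=False)
    except np.linalg.LinAlgError:
        from scipy.linalg import svd as scipy_svd  # gesvd driver lives here
        return scipy_svd(m, full_matrices=False, lapack_driver="gesvd")
```

This masks the symptom rather than explaining it; if the failures are intermittent on identical inputs, the underlying cause is still worth finding.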

Thanks!

mganahl commented 3 years ago

Now that's a useful bit of information. The most likely reason for the numpy SVD to throw this error is that the matrix contains NaN values. You can sometimes also get it in single-precision arithmetic for regular matrices, but I have only encountered that case once. Are you using single or double precision? Are you explicitly taking matrix inverses somewhere, or dividing by singular values without taking a pseudo-inverse?
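A sketch of the two checks suggested here, in generic NumPy terms (the helper names are illustrative): guard against NaN/Inf entries before the SVD, and use a cutoff instead of dividing by near-zero singular values:

```python
import numpy as np

def checked_svd(m):
    # NaN or Inf entries are the most common cause of
    # "SVD did not converge" errors.
    if not np.all(np.isfinite(m)):
        raise ValueError("matrix contains NaN/Inf entries")
    return np.linalg.svd(m, full_matrices=False)

def pinv_singular_values(s, cutoff=1e-12):
    # Pseudo-inverse of the singular values: invert only values above
    # the cutoff and map the rest to zero, instead of dividing by ~0.
    inv = np.zeros_like(s)
    mask = s > cutoff
    inv[mask] = 1.0 / s[mask]
    return inv
```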

sr33dhar commented 3 years ago

Thanks, Martin!

About the precision: all the numbers are float64. I tried increasing it to float128, but then got an error that complex256 is not supported by numpy.linalg.

Also, I am not taking any explicit matrix inverses or dividing by singular values. However, in this method I am not normalising my states after truncation, and hence the numbers can become very small at times.
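If that underflow is the culprit, one common remedy is to renormalize after each truncation and accumulate the norm (or its log) separately, so the tensor entries stay of order one. A hedged NumPy sketch with illustrative names, not the code from this simulator:

```python
import numpy as np

def truncate_and_renormalize(theta, chi_max, log_norm=0.0):
    """SVD-truncate to chi_max singular values, then rescale the kept
    singular values to unit norm; the overall scale is carried in log
    form to avoid underflow over many truncation steps."""
    u, s, vh = np.linalg.svd(theta, full_matrices=False)
    s = s[:chi_max]
    norm = np.linalg.norm(s)
    log_norm += np.log(norm)  # track the state's scale separately
    return u[:, :chi_max], s / norm, vh[:chi_max, :], log_norm
```

The physical amplitudes can be recovered at the end by multiplying back `exp(log_norm)`.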

sr33dhar commented 3 years ago

Hi @mganahl and @alewis, sorry again for the delay in producing code that reproduces this issue, and thanks for your patience!

mganahl commented 3 years ago

Hey @sr33dhar, I just merged a patch for the infinite MPS. Can you check whether it fixes your issue? Thanks!