Closed pwborthwick closed 3 years ago
Thanks for finding this! Fixed in 97ff2f69
Just one other point. In the paper on using bit strings with FCI, because it's Fortran, they run into trouble if the number of spin orbitals exceeds the 64-bit word size. Hence all the `Nint` stuff. I'm pretty new to Python, but do you really need it in `mmd`?
h = '0b1111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111'
print('Decimal representation...', int(h,2))
Decimal representation... 1267650600228229401496703205375
That's a string of 100 '1's and Python seems to take it in its stride. I'm missing something, aren't I? Best wishes, Peter
@pwborthwick You are absolutely right, Python can handle arbitrary-precision integers. So in this respect, we don't need `Nint` in `mmd` at all. A really interesting read can be found here. One big takeaway is that it gets a little weird using NumPy, so you do have to be careful sometimes. To extend your example:
import numpy as np
h = np.array([1267650600228229401496703205375])
print("h: ",h,"; type: ",h.dtype) # object
i = np.array([2**63 -1])
print("i: ",i,"; type: ",i.dtype) # int64
print("h + i: ",h + i,"; type: ",(h + i).dtype) # object
print("h + 1: ",h + 1) # no overflow ... object
print("i + 1: ",i + 1) # overflow! ... int64
yields
h: [1267650600228229401496703205375] ; type: object
i: [9223372036854775807] ; type: int64
h + i: [1267650600237452773533557981182] ; type: object
h + 1: [1267650600228229401496703205376]
i + 1: [-9223372036854775808]
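If the arbitrary-precision behaviour is what's wanted, one way to sidestep the `int64` wrap-around shown in that last line is to request `object` dtype explicitly, so the array holds plain Python ints (a small sketch, not from the thread):

```python
import numpy as np

# Requesting object dtype up front keeps the elements as Python ints,
# so arithmetic never wraps the way the default int64 array does.
i = np.array([2**63 - 1], dtype=object)
print(i + 1)  # [9223372036854775808] -- no overflow
```

The trade-off is that `object` arrays lose NumPy's vectorized speed, since each element is a boxed Python int.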
So if we aren't careful we could get an overflow. NumPy tries to be intelligent with the data types, but it can't anticipate all our use cases. If it represents integers as `object`, there is no issue, but if it uses `int64` then we have a problem, because then it is relying on C-style ints. Now that you've brought this to my attention, it might be a good idea to ensure that we are using unsigned 64-bit integers here, which I didn't think about too carefully. (The saving grace is that we should never encounter a case where we have more than 64 orbitals, since the code is too slow -- famous last words.)
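A minimal sketch of what pinning the dtype might look like (the determinant layout here is hypothetical, not `mmd`'s actual representation): occupations held in an explicit `np.uint64`, with the electron count recovered by a popcount.

```python
import numpy as np

# Hypothetical determinant: occupy the four lowest spin orbitals,
# stored in an explicit unsigned 64-bit word so NumPy never silently
# wraps or promotes through the signed int64 default.
det = np.uint64(0)
for orb in (0, 1, 2, 3):
    det |= np.uint64(1) << np.uint64(orb)

print(bin(int(det)))             # 0b1111
print(bin(int(det)).count('1'))  # 4 occupied spin orbitals
```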
OK, so why do I include `Nint`? The reason is that I am using `mmd` to quickly prototype C++ code, and C++ can't handle arbitrary-precision integers. I know you already know this, but just to make the point:
int main() {
    auto h = 1267650600228229401496703205375;
    return 0;
}
won't compile, and dies with the compilation error
test.cpp:2:14: error: integer literal is too large to be represented in any
integer type
auto h = 1267650600228229401496703205375;
^
1 warning and 1 error generated.
So anticipating C++ needs within Python is useful functionality for me, since I develop in a different code base: iron out the C++(-ish) code first in Python. That, and I hate waiting for compiles, so I find writing prototype code in Python first and then translating to C++ more efficient. It helps me think a bit more abstractly, as well as be a bit more ruthless in tracking down bugs.
Hi @jjgoings, Thanks for the Python examples. It did cross my mind that you may be targeting a more general code base than just Python. I originally coded the FCI etc. using text bit strings because, as you say, these programs are for molecules where it wouldn't make much of a difference in speed.

I've only just come back to this following a diversion after reading your SCI paper. I refreshed my memory on Epstein-Nesbet, and while reading about PT I remembered a paper I saw on generating Hugenholtz diagrams in code... so I wrote a Python program that generates the diagrams and then automatically generates code (and executes it) to evaluate them. Anyway, bearing in mind your comments about text bit strings being too slow for generating SCI determinants on the fly, I was going to implement a bit-manipulation version, which is why I queried the 64-bit thing. I will still go ahead with a bit version of my code (but with care re int64 and objects), just to get some practice using bits.

I must get back to biorthonormalizing eigenvectors at some point. I think even elementary row/column operations destroy the eigensolution nature of the matrix, so you have to biorthonormalize using some iterative process, which means any method along the lines of your blog won't work. I've got transition dipoles, both electric and magnetic, in both length and velocity gauges, and oscillator/rotary strengths working in the Tamm-Dancoff approximation as a start.

Thanks again for your explanations. Best wishes, Peter
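For what it's worth, a tiny sketch of the kind of bit-manipulation version Peter describes (names and layout are illustrative only): determinants as plain Python ints, with the excitation degree between two determinants obtained from an XOR followed by a popcount.

```python
# Illustrative sketch: determinants as Python ints (arbitrary precision),
# so the 64-orbital limit never bites, while still using fast bit ops.
def from_bitstring(s):
    """Convert a text bit string like '001111' to an integer determinant."""
    return int(s, 2)

def excitation_degree(det_a, det_b):
    """Number of orbital substitutions relating two determinants:
    XOR leaves a set bit for each orbital occupied in one determinant
    but not the other, and each substitution accounts for two such bits."""
    return bin(det_a ^ det_b).count('1') // 2

d1 = from_bitstring('001111')    # reference: orbitals 0-3 occupied
d2 = from_bitstring('011101')    # single excitation: orbital 1 -> 4
print(excitation_degree(d1, d2)) # 1
```

The same XOR-plus-popcount trick transfers directly to a fixed-width C++ port, where `__builtin_popcountll` (or `std::popcount`) replaces the `bin(...).count('1')` idiom.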
Hi Josh, just a small typo...
This is the wrong way round (e.g. for water, 'taking 10 things 14 at a time'); it will always return 0, so the 'if' is always False. You've got it the right way round a couple of lines down! Hope all is well.
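The slip is easy to demonstrate with Python's `math.comb` (assuming the code in question is computing a binomial coefficient for water's 10 electrons in 14 spin orbitals):

```python
from math import comb

# 'Taking 10 things 14 at a time' is impossible, so comb returns 0
# and any subsequent 'if' test on the result is always False.
print(comb(10, 14))  # 0

# Swapping the arguments gives the intended count.
print(comb(14, 10))  # 1001
```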