WrathfulSpatula opened 2 years ago
Seeing as (according to the above) all we want this for is `QEngineCPU`, it could be furnished by a `QEngineCPU` generically typed on state vector floating-point precision. It would require cross-precision `Compose()`/`Decompose()`/`Dispose()` overloads, for `QUnit`.
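As a rough sketch of what a cross-precision `Compose()` could look like (the `EngineSketch` type and the free-function signature are assumptions for illustration, not Qrack's actual interface):

```cpp
#include <cassert>
#include <complex>
#include <vector>

// Hypothetical sketch: a state-vector engine templated on floating-point
// precision, with a cross-precision Compose() that takes the tensor product
// of two state vectors at the left operand's precision.
template <typename Real>
struct EngineSketch {
    std::vector<std::complex<Real>> amps; // state vector amplitudes
};

template <typename RealL, typename RealR>
EngineSketch<RealL> Compose(const EngineSketch<RealL>& a, const EngineSketch<RealR>& b)
{
    EngineSketch<RealL> out;
    out.amps.reserve(a.amps.size() * b.amps.size());
    for (const auto& bAmp : b.amps) {
        // Up-cast the right operand's amplitude to the left's precision.
        const std::complex<RealL> bUp((RealL)bAmp.real(), (RealL)bAmp.imag());
        for (const auto& aAmp : a.amps) {
            out.amps.push_back(aAmp * bUp);
        }
    }
    return out;
}
```

Because the result lands at the left operand's precision, swapping which engine sits on the left is exactly the "toggle order and position" trick described below.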
Since Qrack contains a portable IEEE fp16 definition in an OSS header, we can safely assume that fp16 is always available. Then, no floating-point types wider than the one selected by the `-DFPPOW=[n]` build option will be referenced in `QEngineCPU` or anywhere else in the library. (Practically, some systems have `float` available but not `double`.)
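To make the build-option mechanics concrete, here is a minimal sketch of how `-DFPPOW=[n]` can pin a single library-wide type, where 2^FPPOW is the bit width (the `real1` name follows Qrack's convention as I understand it; the fp16 branch is omitted here because it would pull in the portable fp16 header):

```cpp
#include <cassert>

// Minimal sketch: select the library-wide floating-point type from a
// -DFPPOW=[n] build definition, where the type is 2^FPPOW bits wide
// (4 -> fp16, 5 -> fp32, 6 -> fp64).
#ifndef FPPOW
#define FPPOW 5 // default to fp32
#endif

#if FPPOW == 5
typedef float real1;
#elif FPPOW == 6
typedef double real1;
#else
#error "FPPOW value not covered by this sketch"
#endif
```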
If `Compose()` defaults to the `this` pointer's precision when up-casting, we can toggle `Compose()` order and position between front and back to get the appropriate resultant precision. `Decompose()` doesn't need to down-cast, because, for `QUnit` precision hybridization purposes, we'll include on-demand up-cast/down-cast methods.
(EDIT: Actually, we'll just let `QUnit` precision hybridization generally use on-demand up/down-casting.)
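A minimal sketch of the on-demand cast (the helper name is hypothetical): converting a state vector between amplitude precisions is a single element-wise copy, so `QUnit` can invoke it lazily, right before a mixed-precision operation:

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <vector>

// Hypothetical helper: copy a state vector from one amplitude precision to
// another. Down-casting (e.g. double -> float) rounds each component;
// up-casting is exact.
template <typename To, typename From>
std::vector<std::complex<To>> CastStateVec(const std::vector<std::complex<From>>& in)
{
    std::vector<std::complex<To>> out;
    out.reserve(in.size());
    for (const auto& amp : in) {
        out.emplace_back((To)amp.real(), (To)amp.imag());
    }
    return out;
}
```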
Then again, fusing the up-cast into `Compose()` itself might be critical for practical payoff. This is tricky at the CPU/GPU boundary, but the general optimization would be practical if the CPU/GPU hybridization threshold were above 16 qubits. Given that this condition is not typically satisfied, this is on the backlog for the moment.
Basically, Qrack accepts as standard that fp16 is naturally accurate up to 16 qubits, fp32 is valid up to 32 qubits, fp64 is valid up to 64 qubits, etc.
If we allowed precision to be mixed in the same build, we could size floating-point precision according to `QUnit` separable subsystem size. This would require up-casting and down-casting, but it might make sense at the fp16/fp32 boundary for (hybrid) CPU simulation.
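Under the rule of thumb above (fp16 through 16 qubits, fp32 through 32, fp64 through 64), a hypothetical chooser for the separable-subsystem case could look like this:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical precision chooser for a mixed-precision build: pick the
// narrowest precision whose "valid up to" qubit count (per the rule of
// thumb above) covers the separable subsystem's size.
enum class FpPrec { Fp16, Fp32, Fp64 };

FpPrec ChoosePrecision(std::size_t subsystemQubits)
{
    if (subsystemQubits <= 16) {
        return FpPrec::Fp16;
    }
    if (subsystemQubits <= 32) {
        return FpPrec::Fp32;
    }
    return FpPrec::Fp64;
}
```

When a subsystem grows past a boundary (e.g. a 16-qubit fp16 subsystem composes with another), `QUnit` would up-cast on demand; when `Decompose()` splits a subsystem below a boundary, down-casting would be optional.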