Open nelimee opened 2 months ago
We can think about utilizing the existing sampling tool like sinter. But as far as I know, currently there is no API provided by sinter to store the intermediate sampled detectors/observables to files.
Yep, the goal of this issue is not the generation (which will very likely be handled by sinter as you note) but rather the storage of generated results.
Also, even if sinter had the possibility to store to files, we would need to have a clear organisation to allow easy retrieval, modification and deletion, so in any case we will need at least helper methods to do that.
Note that this looks a lot like the work done by a database, which might be a path to the solution.
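To make the "helper methods over a database" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The class name `ResultStore`, the table layout and the choice of a single string key are all assumptions made for illustration, not an existing tqec or sinter API.

```python
import sqlite3
from typing import Optional


class ResultStore:
    """Hypothetical helper: a tiny key/value store for simulation results."""

    def __init__(self, path: str = ":memory:") -> None:
        # ":memory:" keeps the example self-contained; a file path would persist data.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS results ("
            "key TEXT PRIMARY KEY, payload BLOB NOT NULL)"
        )

    def put(self, key: str, payload: bytes) -> None:
        # INSERT OR REPLACE covers both creation and modification.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO results (key, payload) VALUES (?, ?)",
                (key, payload),
            )

    def get(self, key: str) -> Optional[bytes]:
        row = self._db.execute(
            "SELECT payload FROM results WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else bytes(row[0])

    def delete(self, key: str) -> None:
        with self._db:
            self._db.execute("DELETE FROM results WHERE key = ?", (key,))


store = ResultStore()
store.put("example-key", b"\x01\x02")
print(store.get("example-key"))
```

A folder-based layout could expose the same three methods, so the choice of backend can stay open.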
Craig: can you comment on how Stim/sinter simulation results can be systematically stored so that one could later gather additional data for a plot to improve its statistics or explore a wider range of code distances and error rates?
Whenever I have a task like that, I really follow the database point of view:
In this specific case, I think that the primary key will be composed of:
an algorithmically generated (hash-like) key representing the experiment being benchmarked. For the moment, with the limited use-cases we explicitly target, I guess that we can compute such a hash (or a unique value if we really want to avoid any collision) by only considering:
These can be directly obtained from the SketchUp file representing the computation and should be:
k (determining the size of our logical qubits, and code distance),
e = powerOfTenMantissa * 10**(-negativePowerOfTen) as a tuple (powerOfTenMantissa, negativePowerOfTen) where 0 <= powerOfTenMantissa <= 1 can be represented as a fraction.
The data stored will have to include the outputs of stim simulations (depending on what we need, direct measurements or detection events), and I think some metadata could be added to such a value such as:
In terms of format, and because the main data we will store is binary anyway, I do not have any preferences and it can be anything (a real database, a file/folder-based storage, ...).
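As a rough illustration of the composite key described above, the sketch below combines a hash-like experiment key, k, and the (powerOfTenMantissa, negativePowerOfTen) pair with the mantissa held as an exact fraction. The names `ResultKey` and `experiment_hash`, the use of SHA-256, and the reading of e as the physical error rate are assumptions. The string being hashed is a placeholder for whatever inputs are eventually chosen.

```python
import hashlib
from dataclasses import dataclass
from fractions import Fraction


def experiment_hash(description: str) -> str:
    """Hypothetical: hash a textual description of the experiment."""
    return hashlib.sha256(description.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class ResultKey:
    """Hypothetical primary key for one stored simulation result."""

    experiment: str              # hash-like key of the experiment
    k: int                       # scale parameter k
    mantissa: Fraction           # powerOfTenMantissa, kept exact
    negative_power_of_ten: int   # negativePowerOfTen

    @property
    def e(self) -> Fraction:
        # e = powerOfTenMantissa * 10**(-negativePowerOfTen), computed exactly.
        return self.mantissa * Fraction(1, 10 ** self.negative_power_of_ten)


key = ResultKey(experiment_hash("placeholder experiment"), 2, Fraction(1, 2), 3)
results = {key: b"raw simulation output"}  # frozen dataclasses are hashable
print(key.e)
```

Storing the mantissa as a fraction avoids floating-point comparisons when looking up a key, since `Fraction(1, 2)` and `Fraction(2, 4)` compare and hash equal.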
Sinter always hashes the circuit it was asked to simulate and the decoder it was asked to use, producing a cryptographically strong id. This id is stored alongside any statistics. When you merge multiple files, you match up statistics by this id when deciding whether or not to combine two entries into one entry.
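The matching-by-id idea can be sketched in plain Python as follows. This is not sinter's actual hashing scheme or data model: the `Stats` record, the `make_id` function and the use of SHA-256 over the circuit text and decoder name are assumptions chosen to illustrate the principle.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Stats:
    """Hypothetical statistics record tagged with a strong id."""

    strong_id: str
    shots: int
    errors: int


def make_id(circuit_text: str, decoder: str) -> str:
    # Assumed scheme: hash the circuit text and the decoder name together.
    h = hashlib.sha256()
    h.update(circuit_text.encode("utf-8"))
    h.update(b"\0")  # separator so ("ab", "c") and ("a", "bc") differ
    h.update(decoder.encode("utf-8"))
    return h.hexdigest()


def merge(entries: Iterable[Stats]) -> Dict[str, Stats]:
    """Combine entries that share an id by summing their counts."""
    merged: Dict[str, Stats] = {}
    for entry in entries:
        prev = merged.get(entry.strong_id)
        if prev is None:
            merged[entry.strong_id] = entry
        else:
            merged[entry.strong_id] = Stats(
                entry.strong_id,
                prev.shots + entry.shots,
                prev.errors + entry.errors,
            )
    return merged


id_a = make_id("H 0\nM 0", "decoder-a")
id_b = make_id("H 0\nM 0", "decoder-b")
combined = merge([Stats(id_a, 1000, 3), Stats(id_b, 500, 1), Stats(id_a, 2000, 5)])
print(combined[id_a].shots, combined[id_a].errors)
```

Because the id depends only on the circuit and the decoder, data gathered in a later run for the same point can be added to the earlier counts instead of being recomputed.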
I don't think "how to store stats" is particularly important to the goal of input-skeleton-output-circuit. That's later.
Is your feature request related to a problem? Please describe.
The goal of our initiative is to generate graphs such as
In the above graph, each point is:
tqec for a given value of k, and that can be represented as a stim file,
stim.
One problem is that Stim simulations are not free, and computing one point from the above graph can take minutes to hours of computational time.
Currently, we have no clever way of storing such data, meaning that the stim simulations have to be re-done each time we want to generate a new graph.
Describe the solution you'd like
We should have a database-like way of storing simulation data. There are multiple requirements:
Note that simulation results might be quite heavy in terms of memory, so an optimised storage would be a plus.
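One simple way to reduce the footprint, assuming the stored results are boolean measurement or detection-event samples, is to pack eight bits per byte before writing them. The functions below are a hypothetical sketch in pure Python. The bit order (least significant bit first) is an arbitrary choice made for this example.

```python
from typing import List, Sequence


def pack_bits(bits: Sequence[bool]) -> bytes:
    """Pack booleans into bytes, 8 per byte, least significant bit first."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def unpack_bits(data: bytes, count: int) -> List[bool]:
    """Inverse of pack_bits; count is the original number of bits."""
    return [bool((data[i // 8] >> (i % 8)) & 1) for i in range(count)]


sample = [True, False, True, True, False, False, False, False, True]
packed = pack_bits(sample)
print(len(sample), "bits stored in", len(packed), "bytes")
```

The number of bits has to be stored alongside the packed bytes, since the last byte may be only partially used.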