qiboteam / qibocal

Quantum calibration, characterization and validation module for Qibo.
https://qibo.science
Apache License 2.0

Data handling and record arrays #1053

Open alecandido opened 21 hours ago

alecandido commented 21 hours ago

Most of the routines (especially sweeper-based ones) are handling acquired data by collecting them in arrays with structured data types, which are then dumped to disk.
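For readers less familiar with the pattern, here is a minimal sketch of what "arrays with structured data types" means in practice. The field names (`freq`, `i`, `q`) and the sweep values are hypothetical and not taken from any Qibocal protocol, and an in-memory buffer stands in for the file on disk.

```python
import io

import numpy as np

# Hypothetical structured dtype for a frequency sweep (illustrative field names).
sweep_dtype = np.dtype([("freq", np.float64), ("i", np.float64), ("q", np.float64)])

# Made-up sweep values, only to have something to store.
freqs = np.linspace(5.0e9, 5.1e9, 5)
points = np.arange(freqs.size)

# Allocate the whole array once, then fill each field column-wise.
data = np.zeros(freqs.size, dtype=sweep_dtype)
data["freq"] = freqs
data["i"] = np.cos(points)
data["q"] = np.sin(points)

# Dump and reload; an in-memory buffer replaces the file for this sketch.
buffer = io.BytesIO()
np.save(buffer, data)
buffer.seek(0)
restored = np.load(buffer)
```

The dtype, including field names, survives the round trip, so `restored["freq"]` can be read back without extra bookkeeping.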

This is often critical in two respects:

  1. data dumps are not always just raw data: they often contain the result of some mild post-processing (or are partially supplemented with some parameter values)
  2. there is quite some overhead connected to the management of record arrays, especially their creation

While 1. is also relevant, it may be explored in a different issue, as it is less technical and more related to the individual protocol's structure.

Instead, I suspect that the second point is also related to a suboptimal use of the NumPy API for record array creation (which is fully wrapped by the np.rec.array constructor). In particular:

In general, we may reduce the custom handling of data by Qibocal, replacing it with more idiomatic usage of the NumPy API. This may lead to a more vectorized treatment of data (fewer Python for loops) and consequently to less nesting, both of functions (such as the Data/AbstractData methods) and of blocks (i.e. the mentioned Python for loops).
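To make the contrast concrete, here is a hedged sketch comparing a loop-based construction with a vectorized one. It is not Qibocal's actual code: the dtype, the field names, and both helper functions are hypothetical, and the loop variant only imitates the general pattern of creating one small record array per acquired point.

```python
import numpy as np

# Hypothetical result dtype (illustrative field names).
result_dtype = np.dtype(
    [("freq", np.float64), ("signal", np.float64), ("phase", np.float64)]
)

# Made-up acquisition data.
rng = np.random.default_rng(0)
freqs = np.linspace(5.0e9, 5.1e9, 1000)
i_vals = rng.normal(size=freqs.size)
q_vals = rng.normal(size=freqs.size)


def build_with_loop(freqs, i_vals, q_vals):
    """Create one single-element record array per point, then concatenate."""
    chunks = []
    for f, i, q in zip(freqs, i_vals, q_vals):
        row = (f, np.hypot(i, q), np.arctan2(q, i))
        chunks.append(np.rec.array([row], dtype=result_dtype))
    return np.concatenate(chunks).view(np.recarray)


def build_vectorized(freqs, i_vals, q_vals):
    """Compute every column at once and build the record array in one call."""
    columns = [freqs, np.hypot(i_vals, q_vals), np.arctan2(q_vals, i_vals)]
    return np.rec.fromarrays(columns, dtype=result_dtype)


looped = build_with_loop(freqs, i_vals, q_vals)
vectorized = build_vectorized(freqs, i_vals, q_vals)
```

Both functions produce equivalent record arrays, but the vectorized one calls the constructor once instead of once per point and needs no explicit Python loop. Whether this accounts for the overhead observed in the real routines would have to be checked by profiling.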

alecandido commented 20 hours ago

@ElStabilini since you already hit the problem yourself, you may consider this (only after your current commitments) as a technical contribution. It is not physics-related, but it may help you become more familiar with the library (and the NumPy API itself), while helping to simplify Qibocal itself, which is invaluable (assuming it's possible...).