Making internal time representation integer in units of nanoseconds.

btasdelen commented 1 year ago

Motivation: My observation is one of the most bug prone part of the code base is where we deal with floating point operations on time. This is already discussed in several places: #114 and https://github.com/pulseq/pulseq/issues/25. In summary, due to the finite (and variable) floating point precision, every floating point operation is prone to errors. However, this is a no issue for many operations, as the precision is much more than needed. This is the case for gradient and RF waveforms.

However, unlike waveforms, time vectors are not bounded, they may need to represent very little fractions (less than a microsecond) and very large numbers (hours, days?) as a single variable. Floating point is not designed for this, as it loses precision as the number itself is getting large.

This is where the fixed-point representation comes in. Luck for us, we can just use nanoseconds as the time unit, this way we can store it as a 64-bit signed integer, which should both give us a physically meaningful unit for the representation, while providing a 1ns of fine step size and ability to represent very large numbers without lose of precision. 64-bit signed integer gives us more than 290 years in range, so it should be fine.

Aim of this PR: Before committing to the idea and spending too much time on it, I modified a small part of the waveforms_and_times() to show-case how a potential migration would look like. There are many moving parts that need to be taken care of, so it would make more sense to migrate it over time, while making sure everything works correctly. Note that, this commit still stores the time as a float, because making it integer requires me to change the data types. If this plan makes sense to people, I can start to apply the changes under this PR.

FrankZijlstra commented 1 year ago

I think one thing to be aware of is that you can still incur floating point errors with rounding operations while migrating. That is, with floats you have issues like a+b == a and a+b+c != a+(b+c). With rounding this can still occur: np.rint(a+b) == np.rint(a) because the floating point error happens before the rounding, and np.rint(a+b+c) != np.rint(a+(b+c)) if the error happens exactly on the rounding point. Of course this second case is orders of magnitudes less likely to happen, but that may mean you introduce very rare, obscure bugs. It is possible that because most of these numbers are rounded on their respective raster times, that some of this is already avoided...

Other than that, I think even with an incremental migration it would make sense to already remove float representations wherever possible, so you don't have to go back and fix things once more integer code is in place. Here, for example, rf_pieces and wave_data can not be represented as a singular 2xN numpy array, but instead should be a (named) tuple of two arrays (or even a Waveform class). And only at the end of the function convert back to the 2xN format while dependent code is not fixed.

btasdelen commented 1 year ago

@FrankZijlstra Yes, but aim is to have it almost always as integer, and only convert to float when it is absolutely necessary, or user asked for the time vector. That should alleviate that kind of issues as well.

Having a dictionary or a tuple for waveform (example usage waveform['gx']['t']) instead of a numpy array was discussed and found more Pythonic and user-friendly, but I never got out and did it. This might be a good opportunity to do it as well.

sravan953 commented 1 year ago

Hi @btasdelen , thank you for this. Moving to integer representation of nanoseconds is a great idea. Do you have a more complete plan mapped out i.e. how many of us will work on this translation and corresponding responsibilities?

zhengliuer commented 9 months ago

Any progress on this topic? Just out of curiosity.Using nanoseconds as time unit is a great idea.

btasdelen commented 9 months ago

@zhengliuer Currently, there is some work going on to refactor core components of PyPulseq. I was putting this off until the refactoring is done to avoid conflicts, but I can revive this if the refactor takes longer than anticipated.

zhengliuer commented 9 months ago

All right！ Looking forward to that.

btasdelen commented 9 months ago

@zhengliuer out of curiosity, do you have an issue that you think this PR will solve? That may inform the implementation.

zhengliuer commented 9 months ago

@zhengliuer out of curiosity, do you have an issue that you think this PR will solve? That may inform the implementation.

I am just worrying about some float precision error, especially doing time checking.

schuenke commented 1 month ago

I encountered another problem related to the time representation when extracting the waveforms from a Sequence object. In Sequence.waveforms() the waveforms of the gradients are calculated depending on the grad type. For arbitrary gradients (that always have time points of [0.5 grad_raster_time, 1.5 grad_raster_time, ...] a grad.first and a grad.last value should be added.

Arbitrary gradients should be detected here: https://github.com/imr-framework/pypulseq/blob/80f5d582b8b0507bdf04f64638fe686a0eec8f3e/pypulseq/Sequence/sequence.py#L1523

but due to rounding issues, my arbitrary spiral readout gradients are not detected as arbitrary gradients. This is one of several reasons why Sequence.calculate_kspace() leads to wrong results. Changing eps to 1e-12 fixes the issue in my case, but in a different function and/or for different/longer gradients, it might be an even bigger number.

FrankZijlstra commented 1 month ago

An overarching problem here is that the eps value in pypulseq is defined as np.finfo(np.float64).eps, which is only valid at 0.0. In various bits of code I committed, I have used different values (usually 1e-10 or 1e-12) to make it work properly. This is not very pretty, of course...

Perhaps we should redefine eps to be an arbitrary fudge-factor (e.g. 1e-10) that we then use everywhere?

I see more checks in waveforms related to timing and gradient amplitudes that use eps, which would fail if those values were calculated with even the tiniest rounding error...

schuenke commented 1 month ago

An overarching problem here is that the eps value in pypulseq is defined as np.finfo(np.float64).eps, which is only valid at 0.0. In various bits of code I committed, I have used different values (usually 1e-10 or 1e-12) to make it work properly. This is not very pretty, of course...

Perhaps we should redefine eps to be an arbitrary fudge-factor (e.g. 1e-10) that we then use everywhere?

I see more checks in waveforms related to timing and gradient amplitudes that use eps, which would fail if those values were calculated with even the tiniest rounding error...

Definitely a good idea to define a common constant for all functions for now. Presumably a value of 1e-10 is also perfectly adequate, right?

This does not mean that we reject the proposed idea of @btasdelen. We can still tackle this during a potential code refactoring (see my post in the PyPulseq Slack channel today)

btasdelen commented 1 month ago

Maybe two-three eps values (i.e. eps_time, eps_grad, eps_rf), rather than one? These require different precisions.

I have been putting this off, because with the current state of the codebase, there are many parts that need to meticulously modified and kept track of to make sure nothing is left off as float time representation. A more object-oriented refactor of the base would make it easier to implement as well.

FrankZijlstra commented 1 month ago

Maybe two-three eps values (i.e. eps_time, eps_grad, eps_rf), rather than one? These require different precisions.

Yes, I agree that this would make sense. In essence we would have to determine the minimum/maximum value we expect to encounter and look up the eps value there, e.g.:

In[7]: np.spacing(1e6)
Out[7]: 1.1641532182693481e-10

Note that this still needs a fudge-factor because rounding errors can compound. To be safe I'd suggest a factor 10. So with a max value of 1e6 I'd still choose 1e-9 as eps, for example. All this shouldn't be much issue because in saving the .seq file rounding happens anyway. In the unit tests for reading/writing sequence files, acceptable differences for the k-space trajectories were in the 1e-3 to 1e-4 range...

imr-framework / pypulseq

Making internal time representation integer in units of nanoseconds. #129