OpenPIV / openpiv-python

OpenPIV is an open source Particle Image Velocimetry analysis software written in Python and Cython
http://www.openpiv.net
GNU General Public License v3.0
247 stars 143 forks source link

Scaling adjustments for windef and output notation #266

Closed hifabian closed 2 years ago

hifabian commented 2 years ago

I adjusted the scaling such that the velocities are in meters per second and not meters per frame (done in old and original windef, I am not sure why not in the current). Additionally, to make full use of this change I changed the output to use scientific notations.

In particular, I had the problem that the velocity scales where completely off. When corrected, I run into the problem of experiments with low velocities having these cut off. These changes fix these issues.

alexlib commented 2 years ago

Thanks @hifabian - your contributions are reviewed and accepted. You're right - if we keep dt in the settings, it shall be applied. Could you please improve also the saving procedure into a file with headers - better to keep somewhat simple header but one that reflects the fact that the data can be saved in different units

hifabian commented 2 years ago

I am not entirely sure what you mean by units and a header, @alexlib.

OpenPIV does not know what units were used by the user, unless an option is added for a user to state this information. However, I do not think that adds much value as this is just storing user data in output files. Since the user should know this anyway, it just bloats the data files.

I was probably a bit misleading by calling it "seconds". Really, this change mirrors the 'scaling' settings that allows to convert from pixels to distance (e.g., meters). The 'dt' settings should convert from frames to time. Whether a user really associates with the scaling meters and and with the 'dt' seconds is not that important. (I use millimeters and seconds, for example.)

I should point out that the change in output format is inaccessible to users. This might be too aggressive a change. I can instead add a settings option for this and revert the change in 'tools.py' and the test. Should I do that instead?

timdewhirst commented 2 years ago

@alexlib the topic of file structure and associted metadata is one that I'd be interested in discussing - I've not looked at the current offerings in a long time, but am assuming there isn't as yet a standard output fileformat. Are you aware of any resources that document/list current PIV file formats and their capabilities?

Apologies for derailing the thread (slightly)!

alexlib commented 2 years ago

I am not entirely sure what you mean by units and a header, @alexlib.

OpenPIV does not know what units were used by the user, unless an option is added for a user to state this information. However, I do not think that adds much value as this is just storing user data in output files. Since the user should know this anyway, it just bloats the data files.

I was probably a bit misleading by calling it "seconds". Really, this change mirrors the 'scaling' settings that allow converting from pixels to distance (e.g., meters). The 'dt' settings should convert from frames to time. Whether a user really associates the scaling meters with the 'dt' seconds is not that important. (I use millimeters and seconds, for example.)

I should point out that the change in output format is inaccessible to users. This might be too aggressive a change. I can instead add a settings option for this and revert the change in 'tools.py' and the test. Should I do that instead?

Thanks for the reply. You're right about "drastic changes" - we can make those only after informing the community (on the openpiv-users forum and here) and after several reviews. However, if we added the option to scale or not to scale the output, it can now be all the possible combinations: pix/dt, pix/s, mm/s, mm/dt, m/s cm/s, and so on. I agree that the user might remember what scaling was used, but maybe a header would be useful.

commercial software use headers in its files. To save space, they typically use binary storage. We decided to stick with simple solutions - so we can keep it this way. Maybe instead of headers, one shall save some metadata next to the stored files so the settings used and dt, scaling is preserved.

alexlib commented 2 years ago

@alexlib the topic of file structure and associted metadata is one that I'd be interested in discussing - I've not looked at the current offerings in a long time, but am assuming there isn't as yet a standard output fileformat. Are you aware of any resources that document/list current PIV file formats and their capabilities?

Apologies for derailing the thread (slightly)!

There is no standard so far. I know files that come from Insight (TSI), Davis (Lavision), DynamicStudio (Dantec) - all have headers so every file contains a lot of duplicate information, but this way you're sure the data is redundant and saved with the vectors.

BTW, we already have discrepancies - in the past we started with 4 column ASCII .txt files (x,y,u,v), then we renamed them into .vec files without changing the format and then in windef some header was introduced and later we added 5th column for signal to noise ratio value (not a flag) and then we switched to 6 columns x,y,u,v, user_mask, flags

timdewhirst commented 2 years ago

In terms of supporting additional output formats, or evolving the output format, the solution is to make the output creation modular:

T = TypeVar('T')

@dataclass class 2D2C: location: Point2D displacement: Vector2D signal_to_noise: float

@dataclass class ProcessedData(Generic[T]): """struct to hold processed data from a trivially scaled 2D cartesian space""" metadata: str data: list[T] delta_t: Microseconds origin: Point2D x_scaling: MetersPerPixel y_scaling: MetersPerPixel ...

- have a function that will convert the ProcessedData to the output format e.g.
```python
def write_processed_data_to_simple_ascii(data: ProcessedData[T], buf: io.IOBase):
  for d in data.data:
    buf.write(f'{d.location.c} {d.location.y} {d.displacement.x} {d.displacement.y}\n')

Code is indicative, hopefully it gives an idea. I personally prefer over-specifying metadata, and in today's computing landscape storage is incredibly cheap and compression very good; you'll get a long way by compressing ASCII output files.

alexlib commented 2 years ago

I support organizing the code the way proposed by @timdewhirst , but I think it has to include metadata - otherwise we'll have many different output files and it won't be clear which files are of which type. So if we create similar pipelines, all of those should have the header and the users that read the data, know from the header (metadata) whatever is known about the vector field: names of columns, order of the columns, units, dT of the experiment if given, scaling value if given, etc.