DC-analysis / dclab

Python library for the post-measurement analysis of real-time deformability cytometry (RT-DC) data sets
https://dclab.readthedocs.io

Temporary features can be changed directly, although datasets are designed to be read only #202

Status: Closed (B-Hartmann closed 1 year ago)

B-Hartmann commented 1 year ago

Software versions

Windows 10
Python 3.10.4
dclab 0.47.5

Description

I worked with temporary features and noticed that I can set values of a temporary feature directly, without calling dclab.set_temporary_feature. This should not be possible, as RTDC datasets are supposed to be read-only.

This only works for temporary features in a non-hierarchy dataset. With a hierarchy child, the assignment raises TypeError: 'ChildScalar' object does not support item assignment. Trying to set values of a regular feature such as "area_um" raises TypeError: 'H5ScalarEvent' object does not support item assignment.

Minimal working example

import numpy as np
import dclab

ds = dclab.new_dataset(r"path\to\example\data.rtdc")
dclab.register_temporary_feature("my_feature")

# Set the temporary feature the intended way
data = np.array([0] * len(ds))
dclab.set_temporary_feature(ds, "my_feature", data)
ds["my_feature"]
# >> array([0, 0, 0, ..., 0, 0, 0])

# Unexpected: direct item assignment modifies the dataset
ds["my_feature"][0:2] = 2
ds["my_feature"]
# >> array([2, 2, 0, ..., 0, 0, 0])

B-Hartmann commented 1 year ago

Make clear in the docs that datasets are read-only. Suggested places are:
https://dclab.readthedocs.io/en/stable/sec_av_rtdc_dataset.html
https://dclab.readthedocs.io/en/stable/sec_getting_started.html#basic-usage
Maybe here as well:
https://dclab.readthedocs.io/en/stable/sec_av_notation.html

paulmueller commented 1 year ago

Thanks for raising the issue. We might also have to check whether the same applies to plugin/ancillary features...

paulmueller commented 1 year ago

The best solution would be to implement a general RTDCFeature class that inherits from np.lib.mixins.NDArrayOperatorsMixin. For the sake of simplicity, an RTDCScalarFeature class is probably a good start and would cover all concerns raised in this issue. The H5ScalarEvent class used by RTDC_HDF5 could serve as a template for the idea.

paulmueller commented 1 year ago

An easier solution is to make the numpy array (or a view of it) read-only by calling setflags(write=False) on it.
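As a rough sketch of the flag-based approach with plain numpy (independent of dclab; how dclab would store and hand out the array is not shown here), a read-only view rejects item assignment while the original array stays writeable:

```python
import numpy as np

# Stand-in for the data behind a temporary feature
data = np.zeros(5, dtype=int)

# Hand out a view that is marked read-only
view = data.view()
view.setflags(write=False)

try:
    view[0:2] = 2
    assignment_failed = False
except ValueError:
    # numpy refuses to write to a read-only array
    assignment_failed = True

# The owner of the original array can still modify it,
# and the read-only view reflects the change
data[0] = 1
```

Setting the flag on a view rather than on the original array leaves the user's own array untouched, at the cost that the data can still be changed through the original reference.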