Closed AxelHenningsson closed 5 months ago
@jadball has a panda's implementation. I don't want to merge something until we have a back end for polars and/or dask distributed too.
In the meantime, I guess it is easy to patch the setattr/setitem to get what you want?
Fantastic @jadball!
I suggest we push the pandas implementation as a new class while we a deprecating the old column file - this is also helpful for smoking out any bugs.
@jon Is there not already such a support in dask and polars?
It is the silent part that worries me. I would suggest that we at least raise an error for things that are not implemented.
Anyways - the two of you are way more core than I on this one - I fully trust your decision.
Cheers Axel
@jadball code is here https://github.com/jadball/ImageD11/tree/colfile_pandas
There are a bunch of other dataframe libraries listed here : https://github.com/data-apis/dataframe-api/blob/main/spec/purpose_and_scope.md . Seems to be unclear what is compatible or not.
There may also be arguments for looking at xarray https://docs.xarray.dev/en/stable/getting-started-guide/why-xarray.html, for example, to make g-vectors be a (nx3) vector type, etc. And add units.
For me, it is not clear whether we need a new object deprecating columnfile (means changing a LOT) of old code.
Subclassing columnfile to have PandasColumnfile, etc looks promising right now.
I will work on updating my fork to make pandas_columnfile a subclass of columnfile, then we'll try to open a PR for it
PR merged #290
Problem
While the
ImageD11.columnfile.columnfile
object promise to mutate my data in place 👍Mutation by reference will silently fail 😨
Reason
The
ImageD11.columnfile.columnfile
object lacks an implementation of__setattr__
and__setitem__
that will mutate the underlying hidden data structures. I.e something likeThe Good Solution
Implement columnfile as a pandas dataframe. The
pandas.DataFrame
is basically columnfile with a bunch of stuff already implemented and extended. It is a well maintained, standard and fast library. Mutations will then work out of the box.The Bad Solution
Overwrite and
__setattr__
and__setitem__
and callsetcolumn
internally before attribute assignment.The Worst Solution
Make
ImageD11.columnfile.columnfile
throw an error whenever class members are being mutated and refer toaddcolumn
orsetcolumn
for data mutation.