Open Tokazama opened 1 year ago
This seems fine to me? @bkamins any concerns?
I think it is OK to add. I assume @Tokazama has some concrete use case (i.e. I would avoid adding functionalities that will not be used in practice). I have went through the code and left some small comments.
I think it is OK to add. I assume @Tokazama has some concrete use case (i.e. I would avoid adding functionalities that will not be used in practice). I have went through the code and left some small comments.
Besides the use cases in the original post I have images that have metadata per dimension in the form of name, per dim machine acquisition information, and additional metadata along axes (similar to colmetadata).
All modified and coverable lines are covered by tests :white_check_mark:
Comparison is base (
b131356
) 96.36% compared to head (e1b08c5
) 96.77%. Report is 1 commits behind head on main.:exclamation: Current head e1b08c5 differs from pull request most recent head 6a0af43. Consider uploading reports for the commit 6a0af43 to get more accurate results
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@quinnj - all looks good. OK to merge? (I would make a release next)
Anything needed on my end?
Needs a second approval (which might require some changes, but most likely not) before being merged and released.
@nalimilan - do you have an opinion on this PR?
Why not write 1 instead of using a variable? That sounds clearer to me.
The column metadata tests use :col
as the argument for which column is being accessed. I thought using the variable dim
instead of just 1
would make it easier to quickly read and understand what exactly was being tested.
Question on the intended use: is this PR is about storing information about a dimension itself (e.g. its name or unit), or about its elements (e.g. row names for dimension 1), or both? If the second option, I wonder how this interacts with the existence of column metadata, as columns are usually considered as dimension 2.
Also, how does this relate to AxisArrays, NamedArrays, NamedDims, etc.?
I wonder how this interacts with the existence of column metadata, as columns are usually considered as dimension 2.
I assumed this is completely detached. I.e. colmetadata
is for objects that implement Tables.jl interface. And this PR is for array-like objects. Note that array-like objects can be re-interpreted as tables in many different ways.
The way to support metadata per element along a dimension with this would be a specific "indices" or "axis" key whose value can be used somewhat like a tables column metadata.
I assumed this is completely detached. I.e.
colmetadata
is for objects that implement Tables.jl interface. And this PR is for array-like objects. Note that array-like objects can be re-interpreted as tables in many different ways.
@bkamins Sure. But e.g. for a DataFrame
we could imagine implementing dimmetadata
at some point given that we define ndims
, axes
, getindex
... So I wanted to make sure we consider whether that would make sense.
The way to support metadata per element along a dimension with this would be a specific "indices" or "axis" key whose value can be used somewhat like a tables column metadata.
@Tokazama I guess that would work. Then you mean that the intended use case for this PR is both metadata about a dimension itself and metadata about its elements (rows, columns...), right?
I assumed this is completely detached. I.e.
colmetadata
is for objects that implement Tables.jl interface. And this PR is for array-like objects. Note that array-like objects can be re-interpreted as tables in many different ways.@bkamins Sure. But e.g. for a
DataFrame
we could imagine implementingdimmetadata
at some point given that we definendims
,axes
,getindex
... So I wanted to make sure we consider whether that would make sense.The way to support metadata per element along a dimension with this would be a specific "indices" or "axis" key whose value can be used somewhat like a tables column metadata.
@Tokazama I guess that would work. Then you mean that the intended use case for this PR is both metadata about a dimension itself and metadata about its elements (rows, columns...), right?
Exactly! I figured the alternative was having an additional metadata set of methods (e.g., axesmetadata
) which seemed a bit over the top since one can still accomplish similar functionality with the dim methods here. The overlap with column metadata could also be seen as redundant but tables have consistently proven to be very unique since each column can be conceptualized as its own entity.
@nalimilan, is this okay now?
The core metadata methods (
metadata
,metadata!
, etc.) provide an intuitive method of communicating that a key-value pair provides some information about the instance of the object it is attached to. The column metadata methods similarly provide an intuitive method of communicating the a key-value pair provides some information about the column of a particular object. However, data with multiple dimensions (array, mesh, coordinates, etc.) often have metadata stored per dimension.This PR provides support for metadata corresponding to specific dimensions. New method implemented here follow the name pattern of column metadata methods (
colmetadata
todimmetadata
).