Closed kdheepak closed 11 months ago
struct Data{T}
    variable::T
end
Thanks for the answer! If all the types of my struct were unique, this seems very impractical for me to write:
struct Data{T1, T2, T3, T4, T5, ..., T198, T199, T200}
    variable1::T1
    variable2::T2
    variable3::T3
    ...
    variable200::T200
end
They are not all unique, so there won't be that many types, but it is still going to be a couple dozen or so.
struct Data{T1, T2, ..., T20}
    variable1::T1
    variable2::T9
    variable3::T20
    ...
    variable200::T5
end
This seems like it is going to be very error-prone for the users I'm working with.
Is there another way you think I can go about this?
If you have that many arrays, why not use a macro to define your struct?
But essentially, if you want type-stable structs, you have to use type parameters like that. That's just Julia, not DimensionalData.jl.
You would not manually specify the type of most other AbstractArray types from packages; the exact type is often not part of the interface.
Also, what you have looks like a DimStack, which is just as fast (as a fully type-stable version) but more organised than your struct.
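To make the type-stability point concrete, here is a minimal sketch using only Base Julia. The struct names `LooseData` and `TypedData` are made up for illustration and are not part of DimensionalData.jl. The sketch only checks whether the field type is concrete, which is what the compiler needs to generate fast code for field access.

```julia
# Field declared with an abstract type: the concrete array type is unknown
# to the compiler when it accesses `variable`.
struct LooseData
    variable::AbstractArray
end

# Field declared through a type parameter: every instance records the
# concrete type of its field in its own type.
struct TypedData{T}
    variable::T
end

loose = LooseData(zeros(3, 4))
typed = TypedData(zeros(3, 4))   # T is inferred, nobody writes it out

isconcretetype(fieldtype(typeof(loose), :variable))  # false
isconcretetype(fieldtype(typeof(typed), :variable))  # true
```

Note that the caller never spells out `T`; the default constructor infers it from the argument.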
I guess I was thinking I only need to use type parameters if I want the struct to be generic. What I would really like to do is this:
const T1 = create_type_on_the_fly(:Enduse, :Tech, :EC, :Area, :Year)
const T2 = create_type_on_the_fly(:Enduse, :Tech, :EC, :Year)
...
const T20 = create_type_on_the_fly(:A, :B, :C, :D, :Year)
struct Data
    variable1::T1
    variable2::T9
    variable3::T20
    ...
    variable200::T5
end
My current workaround was to define something like this:
function create_type_on_the_fly(dims...)
    values = (get_categorical_data_for_dimension(d) for d in dims)
    arr = zeros(Float64, length.(values)...)
    nt = NamedTuple{dims}(values)
    typeof(DimArray(arr, nt))
end
I was hoping for a better way.
In our case, we probably don't even need a single struct. Everything is in an HDF5 file, and we can probably read and write the data directly from the HDF5 file instead of reading it into a struct and writing it back from the struct. I think we'll run into type stability issues there too, though.
What do you imagine a macro might look like? I know how to write macros, but I'm not exactly sure how a macro would help here. It is not just a syntax transformation, right? I don't know the type that needs to be used.
I am also not sure whether there would be performance issues from defining a generic struct with so many type parameters (~20-30 at the moment, may increase), and I wanted to explicitly type everything for that reason.
I'm new to DimensionalData.jl. I did see DimStack in the documentation but haven't had the chance to play around with it yet! I'll check it out.
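For what it's worth, one possible shape for such a macro is sketched below. It is only an illustration: the macro name `@parametric_struct` is hypothetical and comes from neither the thread nor DimensionalData.jl. The point is that the macro does not need to know any field types. It only generates one type parameter per field, and the concrete types are inferred when the struct is constructed.

```julia
# Hypothetical macro: expands to `struct Name{T1, ..., Tn}` with one type
# parameter per listed field, so the parameter list is never written by hand.
macro parametric_struct(name, fields...)
    params = [Symbol(:T, i) for i in 1:length(fields)]
    body = [Expr(:(::), f, p) for (f, p) in zip(fields, params)]
    # Expr(:struct, is_mutable, header, body)
    def = Expr(:struct, false, Expr(:curly, name, params...), Expr(:block, body...))
    return esc(def)
end

@parametric_struct Data variable1 variable2 variable3

# The default constructor infers every parameter from its argument.
d = Data(zeros(2, 3), zeros(Int, 4), "label")
typeof(d) == Data{Matrix{Float64}, Vector{Int}, String}  # true
```

A real version would probably also need to handle default values, which this sketch leaves out.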
Don't do that on_the_fly thing... just use type parameters {T}. That's literally what they are, but cleaner.
And really DimStack is what you want. It's a hybrid of a DimArray and a NamedTuple. The dimensions of all array layers must match, but they don't have to use all dimensions. That seems like what you are doing.
The dimensions of all array layers are not the same unfortunately. There's all sorts of combinations, e.g.:
(:Enduse, :Tech, :EC, :Area, :Year)
(:Fuel, :Tech, ...)
(:Fuel, :Pollution, ...)
Thanks for the suggestion on type parameters! I wanted something that I could quickly prototype with to see how DimensionalData.jl fares with what our application throws at it; but maybe it is best to just use type parameters from the get go.
The other thing is I am already using @kwdef so I didn't want to write another macro unless I could get it to compose well.
@kwdef struct Data
    variable1::T1 = ReadFromHDF5("/group/variable1")
    variable2::T2 = ReadFromHDF5("/group/variable2")
    variable3::T3 = ReadFromHDF5("/group/variable3")
end
I can probably figure out how to do this though
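As a rough sketch of how the two can compose: `Base.@kwdef` accepts parametric structs, so the type parameters can be inferred from the default values. Here `read_variable` is a hypothetical stand-in for `ReadFromHDF5` that returns plain arrays, and `KwData` is an illustrative name.

```julia
# Hypothetical stand-in for `ReadFromHDF5`: returns some array for a given path.
read_variable(path) = endswith(path, "variable1") ? zeros(2, 3) : zeros(Int, 4)

Base.@kwdef struct KwData{T1, T2}
    variable1::T1 = read_variable("/group/variable1")
    variable2::T2 = read_variable("/group/variable2")
end

d = KwData()                         # parameters inferred from the defaults
d2 = KwData(variable2 = [1.0, 2.0])  # a keyword argument overrides one default

typeof(d) == KwData{Matrix{Float64}, Vector{Int}}       # true
typeof(d2) == KwData{Matrix{Float64}, Vector{Float64}}  # true
```

The field types stay concrete in both cases, without naming them in the struct definition.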
I'll close this issue. I can reopen or create a new issue if I have more questions. Thanks for your swift responses on here!
> The dimensions of all array layers are not the same unfortunately.
DimStack is made to handle that. If layers share a dimension, it has to match, but they don't need to share all or even any dimensions.
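A small sketch of what that could look like, assuming DimensionalData.jl is installed. The dimension names follow the thread, but the category values and the layer names (`demand`, `supply`, `total`) are invented for illustration.

```julia
using DimensionalData

# Categorical dimensions (values are made up), with Year as the last one.
enduse = Dim{:Enduse}(["Heat", "Light"])
fuel   = Dim{:Fuel}(["Gas", "Coal", "Wind"])
year   = Dim{:Year}(2020:2022)

# Three layers that each use a different subset of the dimensions.
demand = DimArray(zeros(2, 3, 3), (enduse, fuel, year); name=:demand)
supply = DimArray(zeros(3, 3), (fuel, year); name=:supply)
total  = DimArray(zeros(2, 3), (enduse, year); name=:total)

# Shared dimensions (here Fuel, Enduse and Year) must match across layers.
st = DimStack(demand, supply, total)

st[:supply]  # a single layer, returned as a DimArray
```

Each layer keeps its own dimensions, so the layers of the stack can have different shapes.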
Ah that makes sense! Thanks for the clarification.
I have a DimArray with the following content:
This is what I get with typeof(variable):
In the data that I'm working with, the "dimensions" are always categorical data (except for Year, which is the last dimension).
Before using DimensionalData.jl, I was storing this in a struct like this:
Now, with DimensionalData.jl, I want to store it in the struct:
Is there an easy way to do this? If I do the following (i.e. not defining an explicit type), it ends up being a lot slower.
This is a simplified example. In the actual code I'm working with, there are more than 200 such variables in a single struct.