rafaqz / Flatten.jl

Flatten nested Julia objects to tuples, and reconstruct them later

DOI? #14

Closed ConnectedSystems closed 4 years ago

ConnectedSystems commented 4 years ago

Hello,

Wondering if you are inclined towards getting a DOI (e.g. through https://zenodo.org/)?

I'm involved in writing an overview of issues related to the treatment of scale and uncertainty in environmental modelling. I'd like to cite this as an example of recent advances towards resolving some of the socio-technical issues.

Alternatively, is there already a publication in which this package (or a related one) is introduced?

rafaqz commented 4 years ago

That is very interesting. This package has been pretty low on my list of packages to write up or make citable, but that is possible.

What are you using as an example of its use? We use it for everything internally in my research work, but not many of the applications are widely used or even registered yet. I have a bunch of unregistered modelling packages that leverage this for interfaces and optimisation like DynamicGridsInteract.jl, GrowthMaps.jl and Dispersal.jl, but they are a way off being good examples for a paper.

You should also keep in mind that it's not at 1.0 and the techniques used are objectively weird. Really, ridiculously weird, if you read the code.

rafaqz commented 4 years ago

Ok it's registered. It will have a DOI when the release is made.

ConnectedSystems commented 4 years ago

Apologies in advance for the long response.

I'm hoping to use it as an example of how differences in the design of "models as software" may be reconciled with the "models as research/learning tools" paradigm.

Here you wrote:

"The problem with using composition is that numerical tools need parameters to be provided in flat vectors, not nested in hierarchies."

This definitely struck a chord with me.

My area of research is integrated (multi-system) modelling. What this means in a practical sense is that I couple together models representing different systems (broadly categorised along hydrological, ecological, agricultural, climatic, policy and social aspects) for the purpose of investigating policy and management options. As you're probably aware this gets very messy, very quickly, in terms of dealing with input values as a cohesive set.

So on one hand the nested hierarchies usually employed in software development are not very compatible with the processes that modellers have in mind and the kind of analysis I/they intend to do. Yet the approaches and model designs that are in typical use (based on my experience) aren't very conducive to large-scale sensitivity, uncertainty, and exploratory analysis (i.e. beyond applying these analyses to a single model). There is also the issue of semantics being confused across disciplines - different meanings for the same words, or different units (ML/Day vs L/Day) in typical use, which hampers integration attempts.

There are also concerns about model reusability and replicability as I'm sure you're aware. I see this effort as one potential approach to bridging the two paradigms.

There are efforts other than Flatten.jl (and FieldMetadata.jl) of course which attempt to resolve these issues (e.g. Basic Model Interface with the CSDMS Standard Names project) but these don't aim to facilitate automated composition of models/values (at least as far as I am aware).

Anyway, that's enough from me - thanks for registering for a DOI!

rafaqz commented 4 years ago

Sounds like a good paper! Although I mostly work on building specific models I also spend time thinking about how modelling is done in ecology and related fields. I have a paper in process about the structural impediments to collaboration and model composition in ecological modelling practice.

Also I hear you on the software/learning tool divide! One of my main aims in my work is to make our models responsive learning tools we can use interactively and change very easily whenever we have a new idea. Flatten.jl is the main component that facilitates that - by removing the overheads to trying new things.

Anyway, maybe we should collaborate on something one day.

Edit: I think the difference between the Flatten.jl/FieldMetadata.jl approach and other approaches you mention is the total commitment to doing things generically, instead of via agreements and standards - which I think will always be too centralised and limiting in one way or another.

ConnectedSystems commented 3 years ago

Hi @rafaqz

Just a heads up that the paper I spoke of was accepted recently and should be out soon. If you'd like I can send you a copy when it is out.

Thanks again

rafaqz commented 3 years ago

Congrats! That would be great :).

BTW I've come up with a better solution for using parameters with Flatten.jl that drops the use of FieldMetadata.jl. It doesn't have global state and is easy to import from CSV. You define a struct for parameters and metadata you need:

struct Param{V,B}
    value::V
    bounds::B
end

Then use Param(val, bounds) for your model parameters however you construct it.

Then with Flatten.jl flatten just the Param objects:

bounds = map(p -> p.bounds, Flatten.flatten(parammodel, Param))
valuesonlymodel = Flatten.modify(p -> p.value, parammodel, Param) 

And extend/modify for other metadata like priors etc.
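
For readers following along, here is a minimal sketch of how the snippet above might be used end to end. The `Growth` and `Model` structs, their field names and all parameter values are hypothetical, made up for illustration; only `Param`, `Flatten.flatten` and `Flatten.modify` come from the comment above, and it assumes Flatten.jl is installed.

```julia
using Flatten

# Parameter wrapper from the comment above.
struct Param{V,B}
    value::V
    bounds::B
end

# Hypothetical nested model components (not from any real package).
struct Growth{R,K}
    rate::R
    capacity::K
end

struct Model{G,M}
    growth::G
    mortality::M
end

# Build the model with Param wrappers wherever a tunable parameter lives.
parammodel = Model(
    Growth(Param(0.5, (0.0, 1.0)), Param(100.0, (10.0, 1000.0))),
    Param(0.1, (0.0, 0.5)),
)

# Flatten only the Param objects into a tuple, in field order.
params = Flatten.flatten(parammodel, Param)
vals = map(p -> p.value, params)
bounds = map(p -> p.bounds, params)

# Strip the wrappers so the model holds plain numbers for simulation.
valuesonlymodel = Flatten.modify(p -> p.value, parammodel, Param)

# Rebuild the nested model with changed values, e.g. after an optimiser step.
doubled = Flatten.modify(p -> Param(2 * p.value, p.bounds), parammodel, Param)
```

The structs are parametric on purpose: swapping a `Param` for a plain `Float64` changes the field type, so the model can only be rebuilt if the field types are free to change.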

rafaqz commented 3 years ago

BTW @jamesmaino the above code is what I mean to swap our models over to at some point. It's so much simpler.

ConnectedSystems commented 3 years ago

Here's the link to the article - an allowable copy will be put up on ResearchGate too.

https://www.sciencedirect.com/science/article/pii/S1364815220309427

Incidentally, the approach you described above is similar to what I ended up doing, but I see that you've abstracted a lot of stuff into ModelParameters.jl

I'll have to find the time to see if I can adopt your package instead. Trouble is my work with Julia is very much an on again off again thing at the moment.

One issue I ran into is that in my case I can have arrays/dicts of objects that themselves hold arrays/dicts of Param-like objects, and Flatten didn't want to handle that nested structure. Not sure if this still happens though, just something I ran into a while ago.

rafaqz commented 3 years ago

Thanks! I look forward to reading it.

I'm just abstracting ModelParameters.jl out right now actually. We should be able to get a free Tables.jl interface (which includes CSV interop for free) and a nearly free Interact.jl interface out of it. I was going to ping you once it's ready for some feedback.

Flatten can't handle Dicts or arrays as it's all compile-time code, and those have run-time determined fields and lengths. It's definitely a limiting factor in some contexts. After starting with Dict originally, my sense is that Tuples, NamedTuples and especially structs lead to a better design, as you can use multiple dispatch methods on them so any of your components can be replaced by users ad-hoc with their own code with no performance overhead. They're also much faster to index.
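
To illustrate the compile-time point, here is a simplified stand-in written in plain Julia. It is not Flatten.jl's implementation (which is considerably more involved); `flat`, `Soil` and `Catchment` are hypothetical names. The idea it shows is that a struct's field count can be read from its type, while the length of an array or Dict cannot.

```julia
# Simplified, hypothetical stand-in for compile-time flattening.
# This is NOT how Flatten.jl is implemented.

# Leaves: a number becomes a one-element tuple.
flat(x::Number) = (x,)

# Tuples: flatten each element and splice the results together.
flat(x::Tuple) = foldl((acc, el) -> (acc..., flat(el)...), x; init=())

# NamedTuples: flatten the underlying values.
flat(x::NamedTuple) = flat(values(x))

# Containers whose length is only known at run time are rejected in this sketch.
flat(x::Union{AbstractArray,AbstractDict}) =
    throw(ArgumentError("length of $(typeof(x)) is not known from its type"))

# Any other struct: the number of fields comes from the type alone.
flat(x::T) where {T} = flat(ntuple(i -> getfield(x, i), fieldcount(T)))

# Hypothetical model components for illustration.
struct Soil
    porosity::Float64
    depth::Float64
end

struct Catchment{S,R}
    soil::S
    routing::R
end

model = Catchment(Soil(0.4, 2.0), (rate = 0.1, lag = 3))
flat(model)  # (0.4, 2.0, 0.1, 3)
```

Calling `flat(Dict(:a => 1.0))` throws the `ArgumentError` defined above, mirroring the limitation described here, although the choice to raise an error is this sketch's own.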

DynamicGrids.jl/Dispersal.jl/GrowthMaps.jl and DynamicEnergyBudgets.jl/Photosynthesis.jl and now all of the new Clima land model packages are good examples of this kind of structure. They're also all required to have very high performance, so that may actually be the niche I'm aiming for using Flatten and in ModelParameters.jl - maybe it should be HighPerformanceModelParameters.jl!! But that's too long.

Flatten being compile-time adds no overheads in these modelling contexts - while iterating over arbitrary Dicts is comparatively slow as everything has to be done at run time. Dicts are also slow to index into - which is the reason I ended up using this strategy in the first place! - indexing the dict was most of the model run time in DynamicEnergyBudgets.jl.
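
A small sketch of the indexing difference, using a hypothetical logistic-growth step (the function names and the parameters `r` and `K` are made up for illustration). With a `Dict{Symbol,Any}` every lookup is hashed at run time and returns a value of unknown type; with a NamedTuple or struct the field access is resolved from the type.

```julia
# Hypothetical logistic-growth step written against two parameter containers.

# Dict version: each p[:r] is a run-time hash lookup returning `Any`.
growth_dict(p::AbstractDict, n) = p[:r] * n * (1 - n / p[:K])

# Field version: p.r and p.K are resolved from the type of p.
growth_fields(p, n) = p.r * n * (1 - n / p.K)

params_dict = Dict{Symbol,Any}(:r => 0.5, :K => 100.0)
params_nt = (r = 0.5, K = 100.0)

# Both calls compute the same value; only the cost of getting at r and K differs.
a = growth_dict(params_dict, 10.0)
b = growth_fields(params_nt, 10.0)
```

Running `@code_warntype growth_dict(params_dict, 10.0)` (from InteractiveUtils) should show the lookups inferred as `Any`, whereas the NamedTuple version infers concrete `Float64` fields.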