Closed martijnende closed 7 months ago
`Sequence` from list instead of dict?
The main difference would be the loss of descriptive keywords, and with them the ability to perform operations by selecting keywords. Since these keywords do not necessarily depend on the position of a given atom in the sequence, you could define modifications of a sequence in a more reusable way. Using indices rather than keywords makes the bookkeeping a lot simpler (no need for unique naming and duplicate checking). So we could reconsider this trade-off between "selectability" and code complexity.
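To make the trade-off concrete, here is a minimal sketch of the two bookkeeping styles. The containers, `add_atom`, `run` and the toy atoms are hypothetical stand-ins for illustration only, not the actual `xdas.Sequence` API.

```python
# Hypothetical toy atoms operating on plain lists (not xdas objects).
def demean(x):
    m = sum(x) / len(x)
    return [v - m for v in x]

def taper(x):
    return [0.5 * v for v in x]

def square(x):
    return [v * v for v in x]

def run(atoms, data):
    """Apply each atom in order."""
    for atom in atoms:
        data = atom(data)
    return data

# Keyword-based: atoms are selected by name, independent of position,
# but every insertion needs a uniqueness check.
def add_atom(keyed, name, func):
    if name in keyed:
        raise KeyError(f"duplicate atom name: {name!r}")
    keyed[name] = func

keyed = {}
add_atom(keyed, "demean", demean)
add_atom(keyed, "taper", taper)
keyed["taper"] = square          # modify by keyword; position is preserved

# Index-based: no naming or duplicate checks, but edits depend on position.
indexed = [demean, taper]
indexed[1] = square              # modify by index

keyed_result = run(keyed.values(), [1.0, 2.0, 3.0])
indexed_result = run(indexed, [1.0, 2.0, 3.0])
```

Both variants produce the same result here; the difference is only in how a modification is addressed (by name or by position) and how much checking the container has to do.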
The Keras sequences are not meant to be stored as recipes, which was one of the initial motivations for the xdas composability (`from xdas.recipes import fk` giving you a predefined sequence). Users might want to modify a pre-defined recipe to suit their needs, which is where the sequence manipulations come in. Defining an order only at declaration time prevents user modifications.
`Atom` and `StateAtom` subclass `partial`?
What would we gain from subclassing `partial`?
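As one possible answer, here is a minimal sketch of what an atom built on `functools.partial` could look like. The `Atom` class, its `name` attribute and the `scale` function are hypothetical; this is not how xdas implements atoms. What the subclass would inherit for free is the call behaviour and the stored `func`, `args` and `keywords` attributes, which are available for introspection.

```python
from functools import partial

class Atom(partial):
    """Hypothetical atom: a partial that also carries a display name."""

    def __new__(cls, func, *args, name=None, **kwargs):
        # partial does its setup in __new__, so extra attributes go here.
        # Caveat of this sketch: the wrapped function cannot itself take
        # a keyword argument called `name`.
        self = super().__new__(cls, func, *args, **kwargs)
        self.name = name or getattr(func, "__name__", "atom")
        return self

def scale(data, factor=1.0):
    return [factor * v for v in data]

atom = Atom(scale, factor=2.0, name="gain")
result = atom([1.0, 2.0])        # calls scale([1.0, 2.0], factor=2.0)
stored = (atom.func, atom.args, atom.keywords)
```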
Maybe this would make sense for output handling: one Sequence generates one output, so if you want intermediate outputs you'd need to define multiple sequences, each of which are placed in a higher-level sequence. If not for the output, it would make no difference if sequences are nested or concatenated.
I was thinking more of a way of organizing the sequence into sub-parts. For example, if you apply an FK and then some other function, rather than having one very long list of atoms, they would be grouped. In other words, an item of the sequence can itself be a sequence.
> The Keras sequences are not meant to be stored as recipes, which was one of the initial motivations for the xdas composability (`from xdas.recipes import fk` giving you a predefined sequence). Users might want to modify a pre-defined recipe to suit their needs, which is where the sequence manipulations come in. Defining an order only at declaration time prevents user modifications
Yeah well I suspect that modifying a sequence will require as many lines of code as redefining it. As a user I would inspect the recipe, then either use it as is or re-declare everything.
Well I still don't have a use case in mind for very long recipes. A user will probably need to change many of their parameters.
I mean I do not see what we need beyond the Keras case. People copy-paste others' models and tweak a few parameters.
It's a matter of editing some code vs programmatically modifying some objects. The first approach requires less work for us and looks more straightforward. Keras also provides a way to store weights (as we would save state). So it looks like a good source of inspiration to me. But I'm ready to hear your point of view.
In terms of execution, nested Sequences and Atoms should have a similar behaviour (after implementing the `__call__` method): `db <- obj.__call__(db)`. The recursive execution is handled automatically. The only thing that would need to change is managing the Sequence and sub-Sequence ordering, which pertains to the discussion of the Keras-style sequence.
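A minimal sketch of that behaviour, assuming both kinds of objects are plain callables. The `Atom` and `Sequence` classes below are simplified, hypothetical stand-ins, not the classes added by this PR.

```python
class Atom:
    """Hypothetical atom: a function with fixed keyword arguments."""

    def __init__(self, func, **kwargs):
        self.func = func
        self.kwargs = kwargs

    def __call__(self, db):
        return self.func(db, **self.kwargs)

class Sequence:
    """Hypothetical sequence: an ordered list of callables."""

    def __init__(self, items):
        self.items = list(items)

    def __call__(self, db):
        # Items may be Atoms or Sequences; both expose __call__,
        # so nesting needs no special handling here.
        for item in self.items:
            db = item(db)
        return db

def add(db, value):
    return db + value

def mul(db, factor):
    return db * factor

nested = Sequence([
    Atom(add, value=1),
    Sequence([Atom(mul, factor=2), Atom(add, value=3)]),
])
flat = Sequence([Atom(add, value=1), Atom(mul, factor=2), Atom(add, value=3)])

nested_out = nested(5)
flat_out = flat(5)
```

In this sketch the nested and the concatenated forms give the same result, so nesting only matters for organisation (or for output handling, if each sequence were to produce its own output).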
Very long sequences would almost certainly be exclusively user-defined (we could probably quantify this probability based on the increasing entropy of longer sequences). So manipulating the sequence would only be useful for the xdas recipes. Maybe the choice of sequence mutability would depend on the kind of recipes we plan to provide?
Yeah, let's discuss next week which use cases we want to cover and which kind of syntax we want. The implementation will naturally follow.
Outline
This PR adds a framework for composing complex data processing pipelines by chaining elementary operations. The motivation for introducing composability is that experienced DAS analysts have already developed their preferred data analysis workflows, and are not likely to adopt new end-to-end workflows over which they have no control. So, instead of providing complex operations with no room for customisation, `xdas.Sequence` offers a framework for chaining together basic operations (`xdas.Atom`s) in a user-specified order and with dedicated function arguments. This allows for enhanced optimisation at the level of individual atoms, as well as at the level of the entire pipeline, while the users retain the same flexibility as when creating the pipeline themselves.
The new `Sequence` object aims to replace the old `ProcessingChain` one.
Usage
In:
Out:
TODO
- (`compose.StateAtom`)
- `compose.Sequence.execute` with `processing.ProcessingChain.process`, including chunked processing and stateful operations
- `numpy` and user-defined operations
- `xdas.signal` built-in operations
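For the chunked, stateful case mentioned in the list above, here is a minimal sketch of the idea: a stateful atom carries a value from one chunk to the next, so that chunked processing matches processing the whole array at once. The `StateAtom` class and the `cumsum` function below are hypothetical and only illustrate the concept; they do not reflect the `compose.StateAtom` interface.

```python
class StateAtom:
    """Hypothetical stateful atom: func(chunk, state) -> (output, new_state)."""

    def __init__(self, func, state):
        self.func = func
        self.state = state

    def __call__(self, chunk):
        out, self.state = self.func(chunk, self.state)
        return out

def cumsum(chunk, carry):
    # Running sum that continues from the value carried over
    # from the previous chunk.
    out = []
    for v in chunk:
        carry += v
        out.append(carry)
    return out, carry

chunks = [[1, 2], [3, 4]]

atom = StateAtom(cumsum, state=0)
chunked = [v for chunk in chunks for v in atom(chunk)]

whole = StateAtom(cumsum, state=0)([1, 2, 3, 4])
```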