Issue #99 proposal - Githubissues

thirtytwobits commented 5 months ago

This is a draft PR to review a proposed change to the public APIs for pydsdl.

After playing with a Nunavut integration I realized the previous design was deficient.

~This is non-breaking change that adds an optional named tuple to the read_namespace method. This named tuple allows specifying a set of filtering rules to apply when discovering DSDL types from files under a given namespace filetree. The intent is to connect the proposed TypeFilterEngine to the _construct_dsdl_definitions_from_namespace method after it creates a DSDLDefinition to apply the initial filtering of the target type. From there I need to figure out how to mark that selected type and its transitive dependencies as either weakly-excluded or strongly-included and detect any conflict where user provided rules conflict as processing proceeds. Any advice as to where I could best work with a tree-representation ahead of any expensive processing would be appreciated.~

The new proposal uses a visitor pattern that is more powerful and will allow Nunavut to fully describe all file dependencies closing https://github.com/OpenCyphal/nunavut/issues/58 and allowing the #99 filter to be implemented in nunavut.

pavel-kirienko commented 5 months ago

I want to postpone reviewing this until this discussion is concluded: https://forum.opencyphal.org/t/proposal-for-a-type-manifest-specification/2107/6

One point that is probably invariant to the conclusion is that I would like to bring PyDSDL into the context of processing DSDL manifests, since manifests are closely related to the problem domain of PyDSDL itself. PyDSDL shouldn't be involved in parsing the manifests but it should probably accept some abstract representation of the manifest at the input (optionally, of course, as we don't want it to be a breaking change).

Also, I would welcome a more functional-style design, if there are such options on the table.

thirtytwobits commented 5 months ago

This is separable from the manifest format. pydsdl needs to support the concepts of target types instead of target namespaces and transitive closures for those target types across namespaces. For example: if

# TARGET                       |   DEPENDENCY               
root_ns_b/d/FType.1.0 -> root_ns_a/b/GType.1.0
root_ns_b/d/HType.1.0 -> root_ns_b/b/FType.1.0
root_ns_a/b/CType.1.0 -> root_ns_b/d/EType.1.0

The current API requires two calls, one with a target NS of root_ns_b that produces one tree and then another root_ns_a target NS that produces a second tree one has to merge with the previous tree. I guess I could design a new API but the old one cannot be retrofitted to support this in a single call.

As for "functional" can you explain? The dependencies are found by parser events so it seems natural to support dependency discovery events.

pavel-kirienko commented 5 months ago

This is separable from the manifest format. pydsdl needs to support the concepts of target types instead of target namespaces and transitive closures for those target types across namespaces.

I agree.

What I was trying to communicate is that exposing a generic visitor-based API works but the utility of this solution is not very high because it requires the user to implement much of the manifest processing manually. It will be implemented in Nunavut and then again in other software involved in the processing of DSDL definitions; I would like to avoid such duplication. Hence, I am trying to say that the concerns related to the selection of data types based on user-provided requirements should be addressed within PyDSDL instead of being pushed up the stack via the abstract visitors.

I guess I could design a new API but the old one cannot be retrofitted to support this in a single call.

Can we explore the idea of adding a new top-level entry point beside read_namespace that accepts an explicit list of types of interest -- closely following the manifest model (note that I am not talking about the manifest format here) -- and the set of root namespace directories to look for them in?

The existing read_namespace may or may not be marked as deprecated.

thirtytwobits commented 5 months ago

I've updated the proposal. Python docs aren't fixed up nor is there testing. This is just the first thwack to see if we're aligned. Please review _namespace.py::read_files (again, I didn't fix the docs sorry). Example usage would be:

target_types = [
    Path("/path/to/root/taxonomy/animalia/chordata/mammalia/artiodactyla/suidae/sus/sus_domesticus.1.0.dsdl"),
    Path("/another_path/to/roots/taxonomy/animalia/chordata/mammalia/artiodactyla/suidae/sus/sus_scrofa.1.0.dsdl"),
    Path("/path/to/root/plantae/tracheophytes/gymnospermae/pinophyta/pinopsida/cupressales/cupressaceae/sequoia/sequoia_sempervirens.1.0.dsdl")
]
target_dsdl, dependency_dsdl = pydsdl.read_files(target_types, ["animalia", "plantae"], path_to_public_types)

emrainey commented 5 months ago

I'm a little confused by the example, am I getting the same set of files (or types or ??) in target_dsdl that I put into target_types (which are just file paths, not types)? It essentially discovers all the dependency_dsdl that I have amongst the provided namespaces and roots paths for that particular set of target_dsdl?

thirtytwobits commented 5 months ago

I'm a little confused by the example, am I getting the same set of files (or types or ??) in target_dsdl that I put into target_types (which are just file paths, not types)? It essentially discovers all the dependency_dsdl that I have amongst the provided namespaces and roots paths for that particular set of target_dsdl?

Right. you get a tuple where the target_dsdl "files" are abstractions that provide the CompositeType as a getter associated with the metadata about the file used to create that type. This is 1:1 for the target types passed into the method. The second member of the tuple is the same but contains all dependent types discovered using the lookup paths and needed by the transitive closure of the target set.

thirtytwobits commented 5 months ago

I would like to avoid exposing anything but CompositeType to the client; we can add new properties to that class for the missing context information.

Please look at the latest proposal. It only exposes CompositeType but adds properties to it.

OpenCyphal / pydsdl

Issue #99 proposal #101