ga4gh / va-spec

An information model for representing variant annotations.
14 stars 2 forks source link

Remove abstract intermediate Statement subclasses from core-im? #138

Open mbrush opened 1 month ago

mbrush commented 1 month ago

The core im source file includes a few Statement specializations, which IMO are not necessary and may introduce confusion:

I think the rationale for including these may have been that they abstract out properties that are common to a set of descendant Statement classes. But IMO, unless a project wants to actually implement data as instances of these classes, I would prefer not to create them. There are so few properties in each, that it would IMO be better/easier to just define these directly on each Statement subclass.

ClnGen/VICC folks may have a different opinion, so we should discuss utility of these, and if they are really needed.

If we do end up deciding to keep these abstract classes, consider where they belong (IMO not in Core IM . . . perhaps as separate Profiles? Or in some abstract middle layer between core im and profiles?)

larrybabb commented 3 weeks ago

@mbrush I think there is more value here than you appreciate. @ahwagner did the work in getting the current profiles ready for implementation so that we could move forward in a more efficient manner. I will defer to @ahwagner to articulate the need for these and we can discuss how we can make "extended abstractions" between the core-im layers and the "standard profile" layers when/if they are needed for easing implementations.

of course if @ahwagner is fine with replicating the attributes and schema constraints in this middle layers in every sub-profile then we can go that way too.

larrybabb commented 2 weeks ago

This one definitely requires input from @ahwagner .

ahwagner commented 2 weeks ago

These data classes are useful for avoiding errors from repeatedly defining common properties (the DRY principle). Even though the number of properties are few, they do exist, and are reused across multiple subclasses. Doing things this way will help reduce errors as the spec evolves.

larrybabb commented 2 days ago

@mbrush and @ahwagner in the spirit of failing-fast. I removed these 3 intermediate classes and applied those changes directly to each standard profile that used them. There were only 5 and they did deviate slightly in any case (which is worthy of discussion). In the end I think we should make reasonable decisions about when to apply the DRY principle to at this level. I believe these intermediate classes are a potential warning flag that something may not be modeled properly at a higher level. In the end if we agree that these (or some form of these) are useful and should be re-applied, I will do it. I also think we should decide whether they should live in the profiles-source.yaml file or the core-im-source.yaml file.

mbrush commented 1 day ago

I am fine with this decision Larry.

As you say, if we do bring them back, we would need to consider what file they should live in. My comment in #139 is relevant here, and considers this question in a world where we do vs do not merge all standard profiles into a single 'profiles-source.yaml' file. (I prefer not merging for reasons laid out in #139)