Closed jakebeal closed 2 years ago
👍
🤮
I dislike this suggestion with some passion. I dislike the wrapping we already do, and I'd rather not do any wrapping at all. I am absolutely and without question opposed to doing more wrapping.
Why? Because it is counterintuitive to set Component.types
to a string and then get back a list:
>>> c = sbol3.Component('http://example.com/issue301/c1', types=sbol3.SBO_DNA)
>>> c.types
['https://identifiers.org/SBO:0000251']
I think it is appropriate to treat these properties as the lists that they are:
>>> c.roles = [sbol3.SO_PROMOTER]
>>> c.roles
['https://identifiers.org/SO:0000167']
Hooray for TypeError:
>>> c.roles = sbol3.SO_PROMOTER
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/tmitchel/Projects/sd2/pySBOL3/sbol3/object.py", line 22, in __setattr__
self.__dict__[name].set(value)
File "/Users/tmitchel/Projects/sd2/pySBOL3/sbol3/property_base.py", line 177, in set
raise TypeError(msg)
TypeError: roles requires one or more values packed in an iterable
If the authors of the SBOL 3 specification wanted these to be singletons (i.e. strings), they would have defined these properties to be of type string. They did not do that though. In their infinite wisdom, they defined these to unbounded properties such that they can have more than one value.
One way or another, I think that it should be consistent, whether that means universal wrapping or universal not-wrapping.
Look at #324 for potential analog in solution.
We talked about pushing the auto-wrapping down into the handling of initial_value
in the various property constructors.
Care must be taken with strings because they are iterable
and not really a collection for our purposes. There are a limited number of instances of ListProperty
so this may be manageable. Typing may be a pain, we'll have to explore how we might make that work well.
@bbartley @jakebeal should the typing of the constructor arguments indicate that they can be passed a singleton, or is it ok for them to declare that they want a list, and accept a singleton if a singleton is passed (in violation of the typing declaration).
For example, the generated_by
property is a List of ReferencedObjects. It's new declaration will be:
generated_by: list[Union[Identified, str]] = None,
That indicates that the user can pass a list of items, each of which are either an Identified
or a str
. A user can still pass a single Identified
or a single str
and it will be marshaled into a list. But the declaration says it only wants a list.
Does that make sense, and is it ok with you?
I think this depends on whether we consider passing a singleton a "MAY" action or a "SHOULD NOT" action.
If it's a "MAY" action, then we'd want to have the typing indicate they can be passed as a singleton, so that one isn't being pushed away from it by warnings in the IDE.
If it's a "SHOULD" action, then we'd want to have the typing declare lists only, so that the user is chivvied away from the disliked behavior.
My personal preference is for "MAY", because there are a number of properties that allow a list, but for which the typical use case is a singleton.
If I'm reading between the lines, then I think the reason that @tcmitchell is asking is because the typing looks really hideous otherwise:
generated_by: Union[Union[Identified, str], list[Union[Identified, str]] = None
But here's an idea that might make it more palatable. Can we use Python's typing module to define a custom type? Such as:
generated_by: SingletonOrList[Union[Identified, str]] = None
It's not really about the typing, although I agree that it can get ugly. I also agree that there are ways to make it more palatable.
My concern is really the think I objected to [originally(https://github.com/SynBioDex/pySBOL3/issues/301#issuecomment-924141962). If we spread this far and wide I think it will be confusing to users. An example is useful:
>>> import sbol3
>>> sbol3.set_namespace('https://github.com/SynBioDex/pySBOL3/issues/301')
>>> seq = sbol3.Sequence('seq1', elements='GCAT')
>>> comp = sbol3.Component('c1', types=sbol3.SBO_DNA, sequences=seq)
>>> comp.types == sbol3.SBO_DNA
False
>>> comp.types
['https://identifiers.org/SBO:0000251']
>>> comp.sequences == seq
False
>>> comp.sequences
['https://github.com/SynBioDex/pySBOL3/issues/301/seq1']
In the above example, types
is set to a value, and sequences is set to a value. Yet when I compare the values afterwards, they do not match. Instead they are of a different type, a list
instead of a singleton. I find that confusing behavior.
I understand that for some frequently used constructions, like Component.types
it is desirable to advertise that a singleton is acceptable. I do not think we should spread this far and wide though. I'm fine with the code accepting these values, but advertising them is a bridge too far for me at the moment. I view this as SHOULD NOT. The programmer MAY set singleton values, and SHOULD NOT set singleton values.
If you can enumerate a set of values that should be advertised as allowing a singleton or a list, I would implement that. I am reluctant to advertise it everywhere. I think it will be too easy to abuse and to confuse.
>>> import sbol3
>>> sbol3.set_namespace('https://github.com/SynBioDex/pySBOL3/issues/301')
>>> seq = sbol3.Sequence('seq1', elements='GCAT')
>>> comp = sbol3.Component('c1', types=[sbol3.SBO_DNA], sequences=[seq])
>>> sbol3.SBO_DNA in comp.types
True
>>> comp.types == [sbol3.SBO_DNA]
True
>>> seq in comp.sequences
True
>>>
>>> # Surprising!
>>> comp.sequences == [seq]
False
>>>
>>> comp.sequences == [seq.identity]
True
I can see your point, @tcmitchell .
What do you think should be the distinction between arguments where we want to advertise flexibility, like Component.types
and ones where we definitely do not want to advertise flexibility, like Component.features
?
One reasonable possibility would be to just have me read through all of the properties and give a list of the ones I think we should advertise singletons for. I think that wouldn't actually be too big.
But maybe you have a more principled thought in mind?
I don't have a more principled thought in mind.
And upon further reflection it also occurred to me that if it is allowed, it should be typed as allowed. In other words, make the typing reflect the code. Why hide a feature we have implemented?
Sigh.
Another approach which is perhaps more principled is to introduce some specialized classes at the library level (as opposed to the spec level). So for example, we might have a Component
(singleton) vs MultisequenceComponent
(list); and Interaction
(singleton) vs MultitypeInteraction
(list). I actually rather like this, except for the fact that it is a slippery slope that ends up at MultisequenceMultitypeComponent
.
@bbartley I dislike your proposal.
@tcmitchell : I looked through and found the following multi-value properties are ones that I expect will have a singleton common case during construction:
Component.hasSequence
Component.role
Component.type
Component.hasModel
ExternallyDefined.type
Feature.role
Identified.wasDerivedFrom
Identified.wasGeneratedBy
Identified.hasMeasure
Interaction.type
LocalSubComponent.hasLocation
LocalSubComponent.type
Participation.role
SequenceFeature.hasLocation
SubComponent.hasLocation
SubComponent.sourceLocation
TopLevel.hasAttachment
VariableFeature.variantCollection
prov:Activity.type
prov:Association.hadRole
prov:Usage.hadRole
om:Measure.type
These multi-value properties, on the other hand, will generally not have singletons for their constructor calls:
Collection.member
CombinatorialDerivation.hasVariableFeature
Component.hasConstraint
Component.hasFeature
Component.hasInteraction
Component.hasConstraint
Experiment.member
Interaction.hasParticipation
Interface.input
Interface.nondirectional
Interface.output
VariableFeature.variantDerivation
VariableFeature.variantMeasure
VariableFeature.variant
prov:Activity.qualifiedUsage
prov:Activity.wasInformedBy
prov:Activity.qualifiedAssociation
om:Prefix.alternativeLabels
om:Prefix.alternativeSymbol
om:Unit.alternativeLabels
om:Unit.alternativeLabels
I would like to see singletons in the typing for the first list, as otherwise I will see many warnings in my code. I don't have a strong opinion about the second list.
Note to self: the following remain to be done on branch 301-list-wrapping:
Participation.role
SequenceFeature.hasLocation
SubComponent.hasLocation
SubComponent.sourceLocation
VariableFeature.variantCollection
prov:Activity.type
prov:Association.hadRole
prov:Usage.hadRole
om:Measure.type
Some of our constructors allow list-valued attributes to be specified with either a list or a singleton value that is then implicitly wrapped into a list. Others do not.
For example, with the constructor for
Component
, thetypes
argument has typeUnion[list[str], str]
, but withLocalSubComponent
, thetypes
argument has typelist[str]
.Can we make the implicit wrapping universal without it being too much of a pain?