Closed jgladwig closed 3 years ago
This is becoming a blocker for @jgladwig now
Note that this might split into multiple issues; @jgladwig it would be good if you can note which bits are the blockers and which can be split off.
I think I can break this into two parts. They are both blockers related to the opil IP generates and are both required for the end goal of using opil from start to finish. But I think the first part can at least get us opil output (prior to supporting opil input for xplan).
This is a blocker for getting output from an opil to strateos json converter. Note that doing this part without also doing part 2 would still leave us dependent on structured requests for input, but it will at least allow me to flesh out the output side. In particular, it will clear the path for me to write the opil to strateos json converter and to build unit tests to ensure that the new output matches our previous output.
The heart of the issue here is that I would like to shift the content of the opil returned by Intent Parser to be as close as possible to its completed form.
Right now that appears to mean:
Ensuring the ProtocolInterface within the opil returned from generateOpilRequest is the complete interface. Right now this is what I get from the strateos to opil generator:
ns2:GrowthCurve a ns1:ProtocolInterface,
sbol:TopLevel ;
ns1:hasParameter <http://strateos.com/GrowthCurve/BooleanParameter1>,
<http://strateos.com/GrowthCurve/EnumeratedParameter1>,
<http://strateos.com/GrowthCurve/EnumeratedParameter2>,
<http://strateos.com/GrowthCurve/EnumeratedParameter3>,
<http://strateos.com/GrowthCurve/EnumeratedParameter4>,
<http://strateos.com/GrowthCurve/MeasureParameter1>,
<http://strateos.com/GrowthCurve/MeasureParameter10>,
<http://strateos.com/GrowthCurve/MeasureParameter11>,
<http://strateos.com/GrowthCurve/MeasureParameter12>,
<http://strateos.com/GrowthCurve/MeasureParameter2>,
<http://strateos.com/GrowthCurve/MeasureParameter3>,
<http://strateos.com/GrowthCurve/MeasureParameter4>,
<http://strateos.com/GrowthCurve/MeasureParameter5>,
<http://strateos.com/GrowthCurve/MeasureParameter6>,
<http://strateos.com/GrowthCurve/MeasureParameter7>,
<http://strateos.com/GrowthCurve/MeasureParameter8>,
<http://strateos.com/GrowthCurve/MeasureParameter9>,
<http://strateos.com/GrowthCurve/StringParameter1>,
<http://strateos.com/GrowthCurve/StringParameter2>,
<http://strateos.com/GrowthCurve/StringParameter3>,
<http://strateos.com/GrowthCurve/StringParameter4>,
<http://strateos.com/GrowthCurve/StringParameter5> ;
sbol:displayId "GrowthCurve" ;
sbol:name "GrowthCurve" ;
ns2:strateos_id "pr1e955a2zbtw65" .
But this is what I get from the Intent Parser generateOpilRequest:
ns2:GrowthCurve a ns1:ProtocolInterface,
sbol:TopLevel ;
ns1:hasParameter <http://strateos.com/GrowthCurve/BooleanParameter1>,
<http://strateos.com/GrowthCurve/BooleanParameter2>,
<http://strateos.com/GrowthCurve/IntegerParameter1>,
<http://strateos.com/GrowthCurve/IntegerParameter2>,
<http://strateos.com/GrowthCurve/MeasureParameter13>,
<http://strateos.com/GrowthCurve/MeasureParameter14>,
<http://strateos.com/GrowthCurve/MeasureParameter15>,
<http://strateos.com/GrowthCurve/MeasureParameter16>,
<http://strateos.com/GrowthCurve/MeasureParameter17>,
<http://strateos.com/GrowthCurve/MeasureParameter18>,
<http://strateos.com/GrowthCurve/MeasureParameter19>,
<http://strateos.com/GrowthCurve/MeasureParameter20>,
<http://strateos.com/GrowthCurve/MeasureParameter21>,
<http://strateos.com/GrowthCurve/StringParameter1>,
<http://strateos.com/GrowthCurve/StringParameter2>,
<http://strateos.com/GrowthCurve/StringParameter3>,
<http://strateos.com/GrowthCurve/StringParameter4>,
<http://strateos.com/GrowthCurve/StringParameter5>,
<http://strateos.com/GrowthCurve/StringParameter6>,
<http://strateos.com/GrowthCurve/StringParameter7>,
<http://strateos.com/GrowthCurve/StringParameter8>,
<http://strateos.com/GrowthCurve/StringParameter9> ;
ns1:protocolMeasurementType <http://strateos.com/GrowthCurve/MeasurementType1>,
<http://strateos.com/GrowthCurve/MeasurementType2>,
<http://strateos.com/GrowthCurve/MeasurementType3>,
<http://strateos.com/GrowthCurve/MeasurementType4>,
<http://strateos.com/GrowthCurve/MeasurementType5> ;
sbol:displayId "GrowthCurve" ;
sbol:name "GrowthCurve" ;
ns2:strateos_id "pr1e955a2zbtw65" .
As you can see there are some missing parameters in the definition from IP (and what looks like some extra parameters).
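To make the mismatch concrete, here is a minimal sketch that diffs the two hasParameter lists above as plain sets. It uses only the local names (the part after http://strateos.com/GrowthCurve/), and the helper name `local_names` is made up for illustration. The numeric suffixes look auto-generated, so a URI-level diff like this is probably only a first pass; comparing by parameter name or dotname may be more reliable.

```python
def local_names(kind_ranges):
    """Expand {"MeasureParameter": range(1, 13)} into a set of local names."""
    return {f"{kind}{n}" for kind, numbers in kind_ranges.items() for n in numbers}


# Parameters listed on the ProtocolInterface from the strateos to opil generator.
from_converter = local_names({
    "BooleanParameter": range(1, 2),
    "EnumeratedParameter": range(1, 5),
    "MeasureParameter": range(1, 13),
    "StringParameter": range(1, 6),
})

# Parameters listed on the ProtocolInterface returned by generateOpilRequest.
from_intent_parser = local_names({
    "BooleanParameter": range(1, 3),
    "IntegerParameter": range(1, 3),
    "MeasureParameter": range(13, 22),
    "StringParameter": range(1, 10),
})

only_in_converter = sorted(from_converter - from_intent_parser)
only_in_intent_parser = sorted(from_intent_parser - from_converter)
shared = sorted(from_converter & from_intent_parser)

print("only in converter output:", only_in_converter)
print("only in Intent Parser output:", only_in_intent_parser)
print("in both:", shared)
```

Both listings contain 22 parameters, but by URI only BooleanParameter1 and StringParameter1 through StringParameter5 appear in both.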
So ultimately I am hoping we can determine where the mismatch is coming from and resolve it. If my understanding is correct then resolving this will result in a clearer form of 'blanks' in the ExperimentalRequest (which are parameters described in the ProtocolInterface but not yet filled in with values on the ExperimentalRequest object). If we get that far then I can use the data xplan currently has (relying on the structured request as a stop gap until part 2 is done) to fill in the blanks.
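As a rough illustration of what I mean by 'blanks', here is a small sketch using plain Python data rather than the opil library. The function name `find_blanks`, the dict layout, and the example value are all hypothetical; only the dotnames come from this thread.

```python
def find_blanks(interface_parameters, request_values):
    """Return parameters declared on the interface that have no value on the request.

    interface_parameters: iterable of parameter identifiers (from the ProtocolInterface)
    request_values: dict of parameter identifier -> value (from the ExperimentalRequest)
    """
    return [p for p in interface_parameters if request_values.get(p) is None]


# Hypothetical example: identifiers are dotnames, the value is a placeholder.
interface_parameters = [
    "plate_reader_info.fluor_em",
    "plate_reader_info.fluor_ex",
    "inoc_info.inc_time_1",
    "inoc_info.inoculation_media",
]
request_values = {
    "inoc_info.inoculation_media": "placeholder-media",
}

blanks = find_blanks(interface_parameters, request_values)
print(blanks)
```

The idea is that xplan would only ever write to the entries this returns, leaving everything already filled in by Intent Parser untouched.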
Ensuring the opil ExperimentalRequest returned by generateOpilRequest is correctly formed. I suspect that there are some minor bugs in the building of the request. The request I get currently looks like this:
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74> a ns1:ExperimentalRequest,
sbol:TopLevel ;
ns1:hasMeasurement <https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/Measurement1>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/Measurement2>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/Measurement3>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/Measurement4>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/Measurement5> ;
ns1:hasParameterValue <http://strateos.com/GrowthCurve/BooleanParameter1/BooleanValue2>,
<http://strateos.com/GrowthCurve/BooleanParameter2/BooleanValue3>,
<http://strateos.com/GrowthCurve/IntegerParameter1/IntegerValue2>,
<http://strateos.com/GrowthCurve/IntegerParameter2/IntegerValue3>,
<http://strateos.com/GrowthCurve/StringParameter1/StringValue2>,
<http://strateos.com/GrowthCurve/StringParameter2/StringValue3>,
<http://strateos.com/GrowthCurve/StringParameter3/StringValue4>,
<http://strateos.com/GrowthCurve/StringParameter4/StringValue5>,
<http://strateos.com/GrowthCurve/StringParameter5/StringValue6>,
<http://strateos.com/GrowthCurve/StringParameter6/StringValue7>,
<http://strateos.com/GrowthCurve/StringParameter7/StringValue8>,
<http://strateos.com/GrowthCurve/StringParameter8/StringValue9>,
<http://strateos.com/GrowthCurve/StringParameter9/StringValue10>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue10>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue2>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue3>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue4>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue5>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue6>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue7>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue8>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue9> ;
sbol:displayId "ip3a4b5a12e48847e6ad834888245f4f74" ;
sbol:name "Experimental Result"
The URIs seem wrong to me in the above but I am unclear on exactly what is wrong. Fixing this is important because this is ultimately the object that I will be writing to in order to 'fill in the blanks'.
The second is a blocker for having xplan accept opil as input.
This is an extension of part 1. In particular the goal will be to reduce the number of blanks xplan fills in to the minimal required set (so only the values xplan makes any decisions on). Anything that is known in advance would be already filled in in the opil returned by generateOpilRequest. See the original issue contents for the list of missing data (at least for GrowthCurve).
If we resolve this as well then I expect I can begin to adjust the input of xplan to process the opil from Intent Parser directly (remove the need for structured requests).
Note that there are some additions @danbryce may have in regards to the condition space. I think those will primarily deal with Part 2 and the shift to accepting opil as input to xplan.
Here are the full opil files for the above: opil_from_protocol_interface.txt opil_from_intent_parser.txt
Here are some examples of what I expect for the condition space inputs. Look at the key condition_space. It has several factors. Each factor is a column in the ER. A factor has a name, domain, domain type (dtype), object type (otype), factor type (ftype), and optional ‘lab_name’, ‘lab_prefix’, and ‘lab_suffix’. The optional items are for mapping the factor to a Strateos parameter.
Riboswitches:
Growth Curve:
Time Series:
Obstacle Course:
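A rough sketch of the factor shape described above, written as a Python dataclass. The field names follow the description (name, domain, dtype, otype, ftype, and the optional lab_* keys); the class itself, the `factors` key layout, and every example value are assumptions for illustration, not taken from the linked files.

```python
from dataclasses import asdict, dataclass
from typing import Any, List, Optional


@dataclass
class Factor:
    name: str
    domain: List[Any]   # allowed values for this column of the ER
    dtype: str          # domain type
    otype: str          # object type
    ftype: str          # factor type
    # Optional keys used to map the factor to a Strateos parameter.
    lab_name: Optional[str] = None
    lab_prefix: Optional[str] = None
    lab_suffix: Optional[str] = None

    def to_dict(self):
        """Serialize, leaving out optional lab mapping keys that are unset."""
        return {k: v for k, v in asdict(self).items() if v is not None}


# Hypothetical factor; none of these values come from a real request.
factor = Factor(
    name="temperature",
    domain=[30, 37],
    dtype="int",
    otype="example_otype",
    ftype="example_ftype",
)
condition_space = {"factors": [factor.to_dict()]}
print(condition_space)
```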
@jgladwig , @jakebeal I have broken this issue down to the following sub issues:
plate_reader_info.fluor_em, plate_reader_info.fluor_ex, inoc_info.inc_time_1, and inoc_info.inoculation_media are already supported in Intent Parser. They appear as optional parameters, so users will need to manually specify this information in Intent Parser's parameter table.
I will need to determine what the extra parameters look like on @jgladwig's end. Intent Parser generates 12 additional parameter fields that are used for running an experiment. If these fields match the fields that @jgladwig is identifying, then they are valid parameters that xplan should reason about. Otherwise, I will need the names of these parameters and will follow up with a bug report.
I will need clarification on which URIs seem wrong in the above example.
To point to a specific instance of the potential error... I see these two values in the ExperimentalRequest I pasted in my previous comment (under hasParameterValue):
...
<http://strateos.com/GrowthCurve/StringParameter9/StringValue10>,
<https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue10>,
...
To me the value https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue10 reads as correct because it is a value that is owned by the https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74 ExperimentalRequest.
In short, the base URI matches so I expect the value to be contained within that specific ExperimentalRequest and have no unexpected links out to other objects.
Conversely, the value http://strateos.com/GrowthCurve/StringParameter9/StringValue10 reads as incorrect (to me) because it reads as though the value is owned by the http://strateos.com/GrowthCurve ProtocolInterface when I am expecting it to be owned by the https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74 ExperimentalRequest.
My worry here is if we ever are dealing with documents with multiple ExperimentalRequests defined within (like we are talking about in this issue https://github.com/SD2E/opil/issues/142). The parameter value URIs that reference the http://strateos.com/GrowthCurve ProtocolInterface are not unique to the ExperimentalRequest that contains them. So if someone were to change the value behind one of these URIs (that point back to the ProtocolInterface) in only one experimental request, that change would propagate to all other experimental requests in the document that also reference that object.
In short, the URIs in the example I am calling 'potentially wrong' suggest that there is some extra object linkage and that not all parameter values in the ExperimentalRequest are owned by that ExperimentalRequest.
Perhaps this is intentional behavior. It was just something I noticed while reading through the files and it brought questions to mind as to what those URIs were implying.
Hopefully the above helps explain my question/concern here. And note that I ultimately defer to @jakebeal for this as I am only listing my expectations of what the sbol doc will look like for an experimental request. My expectations may be wrong.
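The ownership expectation above can be written as a simple prefix check. This is only a sketch on plain URI strings, assuming that "owned by" means "the value URI sits directly under the ExperimentalRequest URI"; the function name `foreign_values` is made up.

```python
def foreign_values(request_uri, value_uris):
    """Return the value URIs that do not live under the request's own URI."""
    prefix = request_uri.rstrip("/") + "/"
    return [uri for uri in value_uris if not uri.startswith(prefix)]


request = "https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74"
values = [
    "http://strateos.com/GrowthCurve/StringParameter9/StringValue10",
    "https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue10",
]

suspect = foreign_values(request, values)
print(suspect)
```

Run against the two URIs quoted above, this flags only the strateos.com value.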
Good eyes, @jgladwig: http://strateos.com/GrowthCurve/StringParameter9/StringValue10 is indeed a potential problem.
What it looks like to me is that the default parameters are all getting referenced from the ExperimentalRequest rather than copied into the ExperimentalRequest. They should be copied over, making them child objects of the ExperimentalRequest.
Because these are write-only values, this would likely not actually cause any errors. It's not the intended usage though, per the specification, and would be good to adjust to match the specification.
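As a sketch of the "copy rather than reference" idea, the snippet below only plans the new child URIs for values that currently live outside the request. It works on plain strings, the function name `adopt_defaults` and the `_2` collision suffix are assumptions, and actually copying each value object's properties would still have to be done with whatever library builds the document.

```python
def adopt_defaults(request_uri, value_uris):
    """Map each value URI outside the request to a new child URI under it."""
    prefix = request_uri.rstrip("/") + "/"
    taken = {uri for uri in value_uris if uri.startswith(prefix)}
    renamed = {}
    for uri in value_uris:
        if uri.startswith(prefix):
            continue  # already a child of this request
        base = prefix + uri.rstrip("/").rsplit("/", 1)[-1]
        candidate, n = base, 1
        while candidate in taken:  # avoid clashing with an existing child
            n += 1
            candidate = f"{base}_{n}"
        taken.add(candidate)
        renamed[uri] = candidate
    return renamed


request = "https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74"
values = [
    "http://strateos.com/GrowthCurve/StringParameter9/StringValue10",
    "https://sd2e.org/ip3a4b5a12e48847e6ad834888245f4f74/MeasureValue10",
]

plan = adopt_defaults(request, values)
print(plan)
```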
Closing this in favor of the broken out sub-issues.
This issue is a collection of related adjustments I am looking for in how Intent Parser produces its OPIL response to a generateOpilRequest. It is a bit long-winded, but the end goal here is to allow XPlan to directly accept the OPIL document as input (replacing the structured requests). It's possible this will break out into multiple issues, but I have not yet untangled it that far.
Note that all of the examples below reference the same instance of a Strateos GrowthCurve protocol, and it is very possible we need to consider how this data flow will work on other protocols and labs (also note that I will be abusing the github syntax highlighting for diffs to reference some lines in the examples below).
Structured Requests
To begin, immediately below is a version of the structured request we are getting (though I have hidden the condition_space and conditions fields to avoid reasoning about 50k lines). My focus is going to be on the properties that I know need to pass through XPlan and appear in the final OPIL output. My hope is that @danbryce can include additional information to describe what further information we need to successfully execute XPlan.
What XPlan currently receives
Below is the dotname data we receive from Intent Parser via generateOpilRequest. So this is the information in the OPIL document that arrives with a dotname annotation present. I read through the document object by object, and anything that is marked with a dotname is converted into the JSON equivalent form.
In short, this is the beginnings of an OPIL to Strateos JSON converter.
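For anyone following along, the dotname to JSON step can be sketched roughly as below. This is not the actual converter: the function name `nest_dotnames` and the placeholder values are made up, and it assumes no dotname is both a leaf and a parent of another dotname.

```python
import json


def nest_dotnames(flat):
    """Turn {"a.b": 1} style dotname keys into nested dicts ({"a": {"b": 1}})."""
    nested = {}
    for dotname, value in flat.items():
        *parents, leaf = dotname.split(".")
        node = nested
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested


# Dotnames taken from this thread; the values are placeholders only.
flat = {
    "plate_reader_info.fluor_em": "placeholder-em",
    "plate_reader_info.fluor_ex": "placeholder-ex",
    "inoc_info.inc_time_1": "placeholder-time",
}

nested = nest_dotnames(flat)
print(json.dumps(nested, indent=2))
```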
Here is what we currently receive. Note that the gains appear to be unset.
What XPlan has been outputting
Below is what XPlan has been outputting (prior to any use of OPIL). The lines marked with a + are data that XPlan is adding. (Note that I have also hidden the src_samples field to keep the line count down.)
@danbryce may want to double check, but from what I am seeing all of the data below that is not marked with a + is data that has passed through XPlan and is actually sourced from the original structured request (as seen above).
Missing data
The fields that are missing entirely (in dotname form):
The fields that are present with wrong data:
How do we resolve this?
More context: After some mild digging it appears that all of the above missing parameters (not the data itself) are described within the Strateos GrowthCurve protocol. This means that the ProtocolInterface that is generated from the Strateos to OPIL converter contains a Parameter for each of the above missing data fields. Note that these Parameters are not currently available in the OPIL pulled from Intent Parser. In other words, not only are the ParameterValues not present but the Parameters are also not present.
There are probably multiple ways forward from here, but I am hoping to get more eyes on this to reason about how we tie this all together.
So some of the potential paths forward are:
This will remove any guesswork around what portions of the ProtocolInterface are present.
ExperimentalRequest in the IP OPIL (as ParameterValues).
Note: I am not certain this is as simple as just adding in the above fields. There may be some different needs based on different labs and protocols.
Beyond that I think there will be some additional needs that @danbryce can identify for requirements to fully convert to using OPIL as direct input to XPlan instead of structured requests.