Opentrons / opentrons

Software for writing protocols and running them on the Opentrons Flex and Opentrons OT-2
https://opentrons.com
Apache License 2.0

RFC: Labware v3 Schema #4601

IanLondon closed 2 years ago

IanLondon commented 4 years ago

overview

Let's use this RFC as the main place to gather & keep track of the changes we want to see for the next labware definition schema, version 3!

aside: labware instance versioning

NOTE: This RFC is for a new labware def schema, and addresses how labware instance versions relate to schema bumps. For a broader proposal outlining user experiences around labware instance versioning (that is outside the narrow scope of this RFC), see this doc instead:

https://docs.google.com/document/d/1egl3LBoLNDyNLn0D-rykEFeF7zjtiozYI5C_S9gMHS4/edit?usp=sharing

rough list

  1. add schema identifier, following convention that was set in the modules schema. Eg "$otSharedSchema": "#/labware/schemas/3"
  2. magnetEngageHeight values updated to mm from labware bottom, instead of mm from home position. (Also: allow negative numbers here, in v2 this is positive only)
  3. place to put arbitrary keys to support "future" data? Whether to do this or not is still an open question. To over-summarize the discussion we had: it's nice to allow escape hatches so we don't get blocked when adding new features, but there's also a risk of working too far outside the schema and then losing the advantage of having a schema at all.
  4. Change summary, explaining to users why a new version of a definition was published and what was wrong with the previous version. Eg "changes": "Fixed incorrect well spacing for all wells."
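
To make items 1 and 4 concrete, here is a minimal Python sketch of how a consumer might read the proposed keys. The "$otSharedSchema" value and the "changes" text come from the list above; the helper name schema_version and the fallback to a numeric schemaVersion key are assumptions for illustration, not an agreed design.

```python
import re

# Hypothetical fragment of a schema v3 definition. Only the two keys proposed
# in the list above are shown; everything else is omitted.
v3_fragment = {
    "$otSharedSchema": "#/labware/schemas/3",
    "changes": "Fixed incorrect well spacing for all wells.",
}

# Assumed shape of an older definition that has no identifier key.
v2_fragment = {"schemaVersion": 2}


def schema_version(definition: dict) -> int:
    """Return the schema version, preferring the proposed identifier key."""
    identifier = definition.get("$otSharedSchema")
    if identifier is not None:
        match = re.fullmatch(r"#/labware/schemas/(\d+)", identifier)
        if match is None:
            raise ValueError(f"unrecognized schema identifier: {identifier!r}")
        return int(match.group(1))
    # Assumed fallback for definitions that predate the identifier.
    return int(definition["schemaVersion"])
```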

??? More possibilities ???

Side Note: Multi-schema support

For the labware v2 project, we migrated all the v1 definitions that we cared about to v2, and now consider v1 definitions as legacy.

In contrast, when we decide to make labware v3, we may keep actively supporting many of the v2 labware definitions we have, with no inherent need to migrate them over to v3. Instead, we would publish a v3 definition as a labware-version bump. Eg, we have a labware definition called "Opentrons Plate" at version: 1 in schema v2, and we want to make corrections or take advantage of schema v3 features, so we publish a new "Opentrons Plate" at version: 2 that is in schema v3. PD, Python, and Labware Library could all work with labware defs in either v2 or v3. To the user, the schema difference is transparent: they would only care about the labware version, eg labware.load('some-labware', version=2, ...)
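
A rough Python sketch of that lookup, assuming definitions are stored by load name and labware version. The DEFINITIONS store, the find_definition helper, and the use of a numeric schemaVersion key for v3 are all hypothetical, chosen only to show that the caller never mentions a schema.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory store keyed by (load name, labware version).
# Each entry may be written in a different schema; callers never ask for one.
DEFINITIONS: Dict[Tuple[str, int], dict] = {
    ("opentrons_plate", 1): {"schemaVersion": 2, "version": 1},
    ("opentrons_plate", 2): {"schemaVersion": 3, "version": 2},
}


def find_definition(load_name: str, version: Optional[int] = None) -> dict:
    """Look up a definition by labware version, defaulting to the newest."""
    available = [v for (name, v) in DEFINITIONS if name == load_name]
    if not available:
        raise KeyError(f"no definitions for {load_name!r}")
    chosen = max(available) if version is None else version
    if chosen not in available:
        raise KeyError(f"{load_name!r} has no version {chosen}")
    return DEFINITIONS[(load_name, chosen)]
```

Here version 1 resolves to a schema v2 definition and version 2 to a schema v3 definition, and the calling code is the same in both cases.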

To implement this in JS world, we could either:

  1. On acquiring labware definitions from shared-data/labware, JS apps could auto-migrate any required labware that is still in schema v2 to schema v3. This would handle missing/changed keys. Then apps don't have to worry about schema v2. But, any direct access to labware attributes needs to be updated to accommodate any changes to labware schema, and that problem persists every time we bump the schema.
  2. Make sure we're always using labware getters and never directly accessing labware attributes. These getters would accept all supported labware schemas, eg getLabwareDisplayName(def: LabwareDefV2 | LabwareDefV3). So to the consumer application, the labware schema wouldn't matter.
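
The two options above can be sketched side by side. This is written in Python for illustration rather than JS, and it assumes that a v3 definition keeps the display name under metadata.displayName as v2 does. The function names and the migration steps are hypothetical.

```python
import copy


def migrate_v2_to_v3(definition: dict) -> dict:
    """Option 1: upgrade a v2 definition on load (hypothetical migration)."""
    migrated = copy.deepcopy(definition)  # never mutate the caller's copy
    migrated["schemaVersion"] = 3
    migrated["$otSharedSchema"] = "#/labware/schemas/3"
    return migrated


def get_labware_display_name(definition: dict) -> str:
    """Option 2: a getter that accepts every supported schema."""
    version = definition["schemaVersion"]
    if version in (2, 3):
        # Assumed: the display name lives in the same place in both schemas.
        return definition["metadata"]["displayName"]
    raise ValueError(f"unsupported labware schema version: {version}")
```

With option 1, the schema-specific branches live in one migration function per bump; with option 2, they are spread across the getters, but no consumer reads attributes directly.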

??? More possibilities ???

sfoster1 commented 4 years ago

I'm in favor of multi-schema support. It won't be that hard to add to the Python side, since we already do a bunch of work deserializing the defs and putting them into Python objects, so the deserialization code is the only place we need to add it. I think we should plan to do this.

For the schema identifier, schemaIdentifier sounds great.

IanLondon commented 4 years ago

From a Slack convo w Seth & Laura:

#5085 reveals some problems with format and with 8-channel access in general. format is used mainly (exclusively?) to determine 8-channel access. It's a very coarse-grained view of a labware, and we would like to deprecate it.

Also, we should probably bake 8-channel access into the definition itself, because if we interpret the def geometry in different versions of different software (eg API, PD, maybe RA), we can wind up with drifting results.

Maybe to solve #5085, LC should use PD's 8-channel geometry utils to figure out what valid wells there are for 8-channel pipettes? IDK if groups is the right thing for that or not, I can't picture how to do it with groups bc groups could also mean there are different grids, none of which are compatible with multi-channel

Just brainstorming: if in the future all defs had 8-channel interpretations baked into them: eg for a 384, multiChannelAccess: {A1 : [A1, C1, E1, ...], B1: [B1, D1, F1, ...], ...} -- where the key is the well that channel 1 goes into and the array is the 8 wells that make up the column -- then PD wouldn't even need to do its own geometry checks for liquid tracking... which would be very nice, if the definition is indeed correct.

Perhaps also troughs could then do multiChannelAccess: {"A1": "A1"} signifying "all 8 channels fit into this well" and implying the center multichannel quirk on a per-well basis.

That's pretty powerful. Taking that further, that would support stuff like {A1 : [A1, B1, C1, ...], A2: A2} where you have a valid 8-channel position at A1 which goes into an 8-well column, and then another valid 8-channel position in the same labware that is trough-like:

         __
(  )    |   |
(  )    |   |
(  )    |   |
(  )    |   |
(  )    |   |
(  )    |   |
(  )    |   |
(  )    |   |

and if you have wells that don't allow 8-channel access, we omit them from the multiChannelAccess dictionary.
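
A small Python sketch of how a consumer could read such a dictionary. The multiChannelAccess shape follows the brainstorm above (a list for an 8-well column, a single well name for a trough, and no entry when access is not allowed); the helper name wells_for_8_channel is made up for illustration.

```python
from typing import Dict, List, Optional, Union

ROWS = "ABCDEFGH"

# Hypothetical multiChannelAccess for the labware drawn above: an 8-well
# column at column 1 and a single trough-like well at A2.
multi_channel_access: Dict[str, Union[str, List[str]]] = {
    "A1": [f"{row}1" for row in ROWS],
    "A2": "A2",
}


def wells_for_8_channel(
    access: Dict[str, Union[str, List[str]]], primary_well: str
) -> Optional[List[str]]:
    """Return the well under each of the 8 tips, or None if not accessible."""
    entry = access.get(primary_well)
    if entry is None:
        # Wells omitted from the dictionary do not allow 8-channel access.
        return None
    if isinstance(entry, str):
        # Trough-like well: all 8 channels fit into the same well.
        return [entry] * 8
    return list(entry)
```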

SyntaxColoring commented 3 years ago

@IanLondon says: Just brainstorming: if in the future all defs had 8-channel interpretations baked into them...

Complication: what happens if we add new multi-channel pipettes later, like a 96-channel?

Existing labware definitions would have 8-channel interpretations baked-in, but not 96-channel interpretations baked-in. So they would work with 8-channel pipettes, but not 96-channel pipettes.

@mcous's opinion:

(I think, based on a video call earlier today)

My uninformed opinion:

  • Because of this complication, we shouldn't bake multi-channel interpretations into labware definitions.
    • We should duplicate the interpretation logic across Protocol Designer and the Python Protocol API. The logic won't be that bad.

mcous commented 3 years ago

This is a good summary of my thoughts. IMO, the more we jam into labware definitions that isn't "information about this labware in isolation", the more problems we will have (see magneticModuleEngageHeight).

"Which wells will the pipette's tips be in when accessing this labware?" can be answered with pure (organic, free-range, artisanal) functions, based solely on location and geometry data that we already have, that we can fully cover with unit tests. For me, "the robot should know exactly what it's doing from a liquid handling perspective" far outweighs any concerns about logic duplication that we can ensure remains aligned through test automation.

I would like to get to a point where, at an engine level, you specify all the wells you're trying to interact with when using a given multi-channel pipette.
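
As a rough illustration of the kind of pure function described above, here is a Python sketch that answers "which wells are the tips in?" from well-center coordinates alone. It assumes tips 9 mm apart in a line running toward lower y from channel 1, and it only matches tips to well centers, so troughs and other wide wells would need shape data that is not modeled here. The names grid and wells_under_tips are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Position = Tuple[float, float]  # (x, y) of a well center, in mm


def grid(rows: int, columns: int, pitch: float) -> Dict[str, Position]:
    """Build well centers for a regular plate, with row A at the highest y."""
    return {
        f"{chr(ord('A') + r)}{c + 1}": (c * pitch, (rows - 1 - r) * pitch)
        for r in range(rows)
        for c in range(columns)
    }


def wells_under_tips(
    wells: Dict[str, Position],
    primary_well: str,
    channels: int = 8,
    tip_spacing: float = 9.0,
    tolerance: float = 0.01,
) -> Optional[List[str]]:
    """Return the well under each tip, or None if any tip misses a well center."""
    x0, y0 = wells[primary_well]
    result: List[str] = []
    for channel in range(channels):
        target_y = y0 - channel * tip_spacing
        hit = next(
            (
                name
                for name, (x, y) in wells.items()
                if abs(x - x0) <= tolerance and abs(y - target_y) <= tolerance
            ),
            None,
        )
        if hit is None:
            return None
        result.append(hit)
    return result
```

Because the function depends only on its arguments, the same table of expected results could be used as unit tests in both Protocol Designer and the Python API to keep the two implementations aligned.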

sfoster1 commented 3 years ago

But then if you have a problem with where your pipettes are ending up, it's temporally removed from the place you can actually change it. And it doesn't let you carve out semantic exceptions on a def level to some of these behaviors (e.g. "oh god, these wells are different depths, I shouldn't try to aspirate from them at the same time").

I get wanting to only base decisions on geometric data, but I think we can never fully get away from semantic data, because we bake in concepts like "what is a row" and "what is a column" and "what are semantically-separate groups" (e.g. for those compound tube racks that handle 50mL and 20mL tubes).

I do think it's not good to have to bake in 8-channel access and 96-channel access and so on and so forth, though.