SD2E / experimental-intent-parser

A tool that combines a word-processing interface with structured tables and assisted linking to definitions to provide a simple interface for incremental codification of experiment designs.
BSD 3-Clause "New" or "Revised" License

Breaking change in IP generation - cannot run experiments #319

Closed mwes closed 3 years ago

mwes commented 3 years ago

Context: We were seeing issues with IP generation for a set of new experiments today, with XPlan rejecting what was passed down to it. This was unusual, because the ER we were running was a copy of a previous ER that worked.

By re-running the IP on the previously working experiment, I discovered that the generated output has changed, breaking downstream tooling.

Example ER: https://docs.google.com/document/d/1CM6I_Agguz8cWJVmRiCMlnn47g-o4OdE0zJjIa30FRA

The output from IP for this ER is here: https://gitlab.sd2e.org/sd2program/cp-request/blob/master/input/structured_requests/CP_NovelChassis_Endogenous_Promoter_Blue_1_21.json

Summary of change:

Previously, each column of an ER had its values isolated in its own array. The current version appears to be collapsing them all together, losing the isolation of values between columns.

The snippets of JSON below summarize the changes. This is best seen in the CONDITION_SPACE row, which encodes multiple values per column, but it affects all of the other rows as well.

Previously:

    [
        {
            "name": {
                "label": "IPTG",
                "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/IPTG/1"
            },
            "unit": "mM",
            "value": "0.0"
        },
        {
            "name": {
                "label": "IPTG",
                "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/IPTG/1"
            },
            "unit": "mM",
            "value": "1.0"
        }
    ],
    [
        {
            "name": {
                "label": "Cuminic Acid",
                "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/Cuminic0x20Acid/1"
            },
            "unit": "mM",
            "value": "0.0"
        },
        {
            "name": {
                "label": "Cuminic Acid",
                "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/Cuminic0x20Acid/1"
            },
            "unit": "mM",
            "value": "1.0"
        }
    ],
    [
        {
            "name": {
                "label": "iptg_and_cuminic_acid",
                "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/Cuminic0x20Acid0x200x2B0x20IPTG/1"
            },
            "unit": "mM",
            "value": "0.0"
        },
        {
            "name": {
                "label": "iptg_and_cuminic_acid",
                "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/Cuminic0x20Acid0x200x2B0x20IPTG/1"
            },
            "unit": "mM",
            "value": "1.0"
        }
    ],
    [
        {
            "name": {
                "label": "media",
                "sbh_uri": "NO PROGRAM DICTIONARY ENTRY"
            },
            "value": "modified_m9_media_with_50mg_per_l_trp"
        }
    ],
    [
        {
            "name": {
                "label": "lab_id",
                "sbh_uri": "NO PROGRAM DICTIONARY ENTRY"
            },
            "value": "r1f833ub7xn6vb"
        },
        {
            "name": {
                "label": "lab_id",
                "sbh_uri": "NO PROGRAM DICTIONARY ENTRY"
            },
            "value": "r1f833szm3e3a6"
        }
    ],
    [
        {
            "name": {
                "label": "column_id",
                "sbh_uri": "NO PROGRAM DICTIONARY ENTRY"
            },
            "value": 1
        },
        {
            "name": {
                "label": "column_id",
                "sbh_uri": "NO PROGRAM DICTIONARY ENTRY"
            },
            "value": 12
        }
    ]
],

Note that each unique column (IPTG, Cuminic Acid, iptg_and_cuminic_acid, media, lab_id, column_id) is separated out into its own array, with any value permutations expressed within that array.
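The property described above (one label per inner array) can be checked mechanically. The following is a minimal sketch, not code from the intent parser: the function names are hypothetical, and the sample data is an abbreviated copy of the snippet above with `sbh_uri` omitted.

```python
def labels_per_array(contents):
    """Return the set of distinct labels found in each inner array."""
    return [{entry["name"]["label"] for entry in column} for column in contents]


def columns_are_isolated(contents):
    """True when every inner array holds values for exactly one label."""
    return all(len(labels) == 1 for labels in labels_per_array(contents))


# Abbreviated version of the previous output (sbh_uri omitted for brevity).
previous = [
    [
        {"name": {"label": "IPTG"}, "unit": "mM", "value": "0.0"},
        {"name": {"label": "IPTG"}, "unit": "mM", "value": "1.0"},
    ],
    [
        {"name": {"label": "media"}, "value": "modified_m9_media_with_50mg_per_l_trp"},
    ],
]

print(columns_are_isolated(previous))  # True
```

A check like this assumes that a label identifies exactly one ER column, which holds for the example here but may not hold in general.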

Now:

"contents": [
    [{
        "name": {
            "label": "column_id",
            "sbh_uri": "NO PROGRAM DICTIONARY ENTRY"
        },
        "value": 1
    }, {
        "name": {
            "label": "column_id",
            "sbh_uri": "NO PROGRAM DICTIONARY ENTRY"
        },
        "value": 12
    }],
    [{
        "name": {
            "label": "lab_id",
            "sbh_uri": "NO PROGRAM DICTIONARY ENTRY"
        },
        "value": "r1f833ub7xn6vb"
    }, {
        "name": {
            "label": "lab_id",
            "sbh_uri": "NO PROGRAM DICTIONARY ENTRY"
        },
        "value": "r1f833szm3e3a6"
    }],
    [{
        "name": {
            "label": "IPTG",
            "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/IPTG/1"
        },
        "value": "0.0",
        "unit": "mM"
    }, {
        "name": {
            "label": "IPTG",
            "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/IPTG/1"
        },
        "value": "1.0",
        "unit": "mM"
    }, {
        "name": {
            "label": "Cuminic Acid",
            "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/Cuminic0x20Acid/1"
        },
        "value": "0.0",
        "unit": "mM"
    }, {
        "name": {
            "label": "Cuminic Acid",
            "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/Cuminic0x20Acid/1"
        },
        "value": "1.0",
        "unit": "mM"
    }, {
        "name": {
            "label": "iptg_and_cuminic_acid",
            "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/Cuminic0x20Acid0x200x2B0x20IPTG/1"
        },
        "value": "0.0",
        "unit": "mM"
    }, {
        "name": {
            "label": "iptg_and_cuminic_acid",
            "sbh_uri": "https://hub.sd2e.org/user/sd2e/design/Cuminic0x20Acid0x200x2B0x20IPTG/1"
        },
        "value": "1.0",
        "unit": "mM"
    }],
    [{
        "name": {
            "label": "media",
            "sbh_uri": "NO PROGRAM DICTIONARY ENTRY"
        },
        "value": "modified_m9_media_with_50mg_per_l_trp"
    }]
]
}]
}]

In this version, media, lab_id, and column_id are kept separate. However, the inducers (IPTG, Cuminic Acid, iptg_and_cuminic_acid) are all combined into a single array, with their values listed one after another without isolation. Our tooling can't make sense of this; it's as if the columns have all bled together.
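As a possible stopgap on the consuming side, a merged array could be split back into one array per label. This is only a sketch under the assumption that a label uniquely identifies a column; `split_by_label` is a hypothetical helper, not part of the intent parser or XPlan, and the sample data is abbreviated (no `sbh_uri`).

```python
def split_by_label(contents):
    """Split each inner array into one array per label, keeping first-seen order."""
    result = []
    for column in contents:
        groups = {}  # dicts preserve insertion order in Python 3.7+
        for entry in column:
            groups.setdefault(entry["name"]["label"], []).append(entry)
        result.extend(groups.values())
    return result


# Abbreviated version of the merged inducer array from the current output.
merged = [
    [
        {"name": {"label": "IPTG"}, "value": "0.0", "unit": "mM"},
        {"name": {"label": "IPTG"}, "value": "1.0", "unit": "mM"},
        {"name": {"label": "Cuminic Acid"}, "value": "0.0", "unit": "mM"},
        {"name": {"label": "Cuminic Acid"}, "value": "1.0", "unit": "mM"},
    ],
]

for column in split_by_label(merged):
    print([(e["name"]["label"], e["value"]) for e in column])
```

This does not restore the original column order across arrays (column_id and lab_id now come first), and it would wrongly merge two ER columns that share a label, so it is not a substitute for fixing the generator.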

How can we resolve this? This looks like a breaking change. We need IP generation to be stable and consistent between releases, and if breaking changes are expected, they should be communicated and accounted for before release. For example, did the golden file tests fail here? If they did not, we are missing coverage, and this ER would be a good one to add to the list. I am happy to help debug and fix this. Thanks!
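For reference, a golden-file check for this case could be as simple as the sketch below. It is an illustration only: the repository's actual golden file tests may be organized differently, and `matches_golden` and the file name are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def matches_golden(generated, golden_path):
    """Compare generated output with a stored golden file.

    Dict key order is ignored, but list order and nesting are significant,
    so columns merged into one array show up as a mismatch.
    """
    golden = json.loads(Path(golden_path).read_text())
    return generated == golden


# Golden data: two columns, each isolated in its own array (abbreviated).
golden = {"contents": [
    [{"name": {"label": "IPTG"}, "unit": "mM", "value": "0.0"}],
    [{"name": {"label": "Cuminic Acid"}, "unit": "mM", "value": "0.0"}],
]}

# Regressed output: the same entries collapsed into a single array.
regressed = {"contents": [golden["contents"][0] + golden["contents"][1]]}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "golden.json"  # hypothetical file name
    path.write_text(json.dumps(golden))
    print(matches_golden(golden, path))     # True
    print(matches_golden(regressed, path))  # False
```

Because list order is significant in this comparison, a change in the order of the column arrays would also be reported as a difference.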