VEuPathDB / service-eda-compute

Apache License 2.0

Add continuous variables to differential abundance #62

Closed (asizemore closed this 1 year ago)

asizemore commented 1 year ago

Requires microbiomeComputations feature-41-diff-abund-cont-vars
Requires veupathUtils add-bin-helpers

Updates the differential abundance API to accept the comparator as an object with a variable and two bin lists (groupA and groupB).

Notes:

  1. I duplicated BinRange from the data service. There was LabeledRange in EdaCommon, but that felt more like a data range. We should consolidate these in a future PR.
  2. I duplicated getRBinListAsString from the data service in OverlaySpecification.java. My duplicated code is itself duplicated again, so my addition is utterly atrocious. I considered extending the BinRange class so I could add this as a method, but the best fix would be to add something to EdaCommon and then update both the data service and this diff abund file. Anyway, given the timing, and given that this PR is at least functional and has the API we want, I think it's worth a review now. After that we can discuss whether implementing an extended BinRange class is worth the effort, since we know we need to do some EdaCommon work within the next build or so.

Example POST request:

{
    "config": {
        "collectionVariable": {
            "variableId": "EUPATH_0009254",
            "entityId": "OBI_0002623"
        },
        "differentialAbundanceMethod": "DESeq",
        "comparator": {
            "variable": {
                "entityId": "EUPATH_0000096",
                "variableId": "EUPATH_0000639"
            },
            "groupA": [
                {"binLabel": "Antibiotics cohort"}
            ],
            "groupB": [
                {"binLabel": "Three country cohort (Karelia)"},
                {"binLabel": "Type I Diabetes (T1D) cohort"}
            ]
        }
    },
    "derivedVariables": [],
    "filters": [],
    "studyId": "DiabImmune-1"
}
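
For reviewers who want to see the shape of the comparator in the request above as types, here is a minimal Java sketch. Everything in it is an assumption made for illustration: the type names (VariableSpec, Bin, ComparatorSpec) are hypothetical and are not the service's generated classes, the binStart/binEnd fields are a guess at what a continuous-variable bin might carry (the example only shows binLabel), and the rule that a bin label may not appear in both groups is not stated anywhere in this PR.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model types; names mirror the JSON keys, not the real service classes.
record VariableSpec(String entityId, String variableId) {}

// binStart/binEnd are ASSUMED fields for continuous bins; the example request only shows binLabel.
record Bin(String binLabel, String binStart, String binEnd) {
    // Convenience constructor for label-only (categorical) bins.
    Bin(String binLabel) {
        this(binLabel, null, null);
    }
}

record ComparatorSpec(VariableSpec variable, List<Bin> groupA, List<Bin> groupB) {
    ComparatorSpec {
        if (variable == null) {
            throw new IllegalArgumentException("comparator.variable is required");
        }
        if (groupA == null || groupA.isEmpty() || groupB == null || groupB.isEmpty()) {
            throw new IllegalArgumentException("groupA and groupB must each contain at least one bin");
        }
        // ASSUMED rule: a bin may belong to only one of the two groups.
        Set<String> labelsInA = new HashSet<>();
        for (Bin bin : groupA) {
            labelsInA.add(bin.binLabel());
        }
        for (Bin bin : groupB) {
            if (labelsInA.contains(bin.binLabel())) {
                throw new IllegalArgumentException("bin '" + bin.binLabel() + "' appears in both groups");
            }
        }
        // Defensive, immutable copies of both lists.
        groupA = List.copyOf(groupA);
        groupB = List.copyOf(groupB);
    }
}
```

With these types, the example request corresponds to a ComparatorSpec whose groupA holds one bin ("Antibiotics cohort") and whose groupB holds two.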

d-callan commented 1 year ago

I duplicated BinRange from the data service. There was LabeledRange in EdaCommon, but that felt more like a data range. We should consolidate these in a future PR.

LabeledRange is the future. You should use that.

We should also add a getRBinListAsString util to EdaCommon that takes a List<LabeledRange> and returns the string R needs. We can also overload it to take List<LabeledRangeWithValue>, List<Range>, List<String>, etc.
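
A rough sketch of what such a shared util could look like, under stated assumptions. The class name RBinListUtil and its method names are hypothetical, and the R expression it emits (`BinList(Bin(binLabel='...'), ...)`) is only a placeholder, since this thread does not show the exact string R needs. One Java detail shapes the design: methods that differ only in the type argument of a List parameter (List<LabeledRange> versus List<String>) have the same erasure and cannot coexist under one name, so the sketch uses a single generic method that takes a label extractor, plus a differently named wrapper.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper; not the actual EdaCommon API.
final class RBinListUtil {
    private RBinListUtil() {}

    // Escape backslashes and single quotes so the label is safe inside an R single-quoted string.
    static String rQuote(String s) {
        return "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'";
    }

    // Generic form: the caller says how to read a label from each element.
    // The "BinList(Bin(binLabel=...))" output format is a PLACEHOLDER, not the real R call.
    static <T> String toRBinList(List<T> bins, Function<T, String> labelOf) {
        return bins.stream()
            .map(bin -> "Bin(binLabel=" + rQuote(labelOf.apply(bin)) + ")")
            .collect(Collectors.joining(", ", "BinList(", ")"));
    }

    // Wrapper for plain label lists; needs its own name because of type erasure.
    static String fromLabels(List<String> labels) {
        return toRBinList(labels, Function.identity());
    }
}
```

For example, `RBinListUtil.fromLabels(List.of("A", "B"))` returns `BinList(Bin(binLabel='A'), Bin(binLabel='B'))`. A list of LabeledRange could go through the same generic method by passing whatever label accessor that class exposes; the accessor's name is not given in this thread.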

asizemore commented 1 year ago

From Slack: the LabeledRange in EdaCommon most likely won't be out until the release, so we need to add a local version. I'll do that first thing tomorrow!