ococrook / RexMS

Residue-level analysis of HDX-MS
https://ococrook.github.io/RexMS/
6 stars 0 forks source link

MaxUptakes function does not work for new data #1

Closed nlgittens closed 2 months ago

nlgittens commented 2 months ago

Is the feature request related to a problem?

maxUptakes function does not appear to work for new data; it simply returns existing data in BRD4_apo$MaxUptake. I don't think it contains the correct calculation. It will also fail when data includes multiple charge states.

Describe the solution

Might be easier to calculate new column "MaxUptake" directly (and so no need to iterate over replicates either). The following function I wrote worked for me (still needed to be in an R data frame and not Bioconductor DataFrame object):

maxUptakes <- function(res){
  res %>%
    mutate(MaxUptake = str_count(str_sub(Sequence, 3, -1)) - str_count(str_sub(Sequence, 3, -1), "P"))
}
hdx_clean <- maxUptakes(hdx_clean)

Describe any alternatives considered

ococrook commented 2 months ago

ahh yes! The function just extact the column currently which is confusing:

maxUptakes <- function(res) {
  stopifnot("res must be a DataFrame" = is(res, "DFrame"))

    .out <- vapply(seq_along(unique(res$Sequence)),
        function(z) res$MaxUptake[res$Sequence == unique(res$Sequence)[z]][1],
        FUN.VALUE = numeric(1)
    )
    return(.out)
}

There is are two possible implementations but I like yours better as it doesnt explicitly rely on get the replication correct

rep(sapply(prepareIndexes(res = res), length), each = length(unique(res$Exposure)) * numRep)  ==  data.frame(res) %>%  mutate(MaxUptake = str_count(str_sub(Sequence, 3, -1)) - str_count(str_sub(Sequence, 3, -1), "P")) %>% pull(MaxUptake)
maxUptakes <- function(res) {
  stopifnot("res must be a DataFrame" = is(res, "DFrame"))

.out <-  data.frame(res) %>%  
                     mutate(MaxUptake = str_count(str_sub(Sequence, 3, -1)) - str_count(str_sub(Sequence, 3, -1), "P")) %>% 
                     pull(MaxUptake)
    return(.out)
}

I made some other improvements too so will change this if you're happy and push the bug fix

nlgittens commented 2 months ago

Yes, happy with this! You may also need to change the R markdown files to call the stringr library if you use this one

ococrook commented 2 months ago

Fixes and changes to other functions now pushed. The automatic build checker takes about 20 minutes so close the issue if you'r e happy