Closed wkumler closed 1 year ago
The fact that it's off by such a wild amount implies that this is an encoding issue of some kind. I've seen enormous exponents like that before when doing all the fancy compression stuff incorrectly so that's where I suspect the bug is. The problem is that I don't know whether it's on the encoding end (in tmzMLmaker) or the decoding end (in msdata_connection nonsense). It doesn't seem like either location has an obvious bug - both getEncoded and giveEncoding use compression_type = "gzip", bin_precision = 8, and endi_enc = "little". Which was intentional when designing this so that I wouldn't have exactly this issue. I wonder if it has anything to do with "voltage" being the only one that's encoded as an integer rather than a float, and thus requiring a different bin_precision (4?).
It doesn't seem to be on the decoding end, at least initially.
Breaking apart grabMSdata("~/../Desktop/test_tmz.tmzML")$MS2[premz%between%pmppm(90.0555)]
:
msdata_obj <- list(MS1=NULL, MS2=NULL, connection=list(
files="~/../Desktop/test_tmz.tmzML", grab_what="MS2", verbosity=0
))
class(msdata_obj) <- "msdata_connection"
isub <- substitute(mz%between%pmppm(90.0555, 10))
function_name <- as.character(isub[[1]])
col_name <- as.character(isub[[2]])
mz_lims <- eval.parent(isub[[3]])
filename <- msdata_obj$connection$files[1]
tmzml <- xml2::read_xml(filename)
mz_selectors <- paste0("@minmz<", max(mz_lims), " and @maxmz>", min(mz_lims))
mz_xpath <- paste0("data/", msdata_obj[["connection"]][["grab_what"]],
"/dubset[", mz_selectors, "]")
dubset_node <- xml2::xml_find_all(tmzml, mz_xpath)
raw_encoded <- xml2::xml_text(xml2::xml_children(dubset_node))
mzint_nodes <- raw_encoded[4]
compression_type="gzip"
bin_precision = 8
endi_enc="little"
decoded_mzs <- base64enc::base64decode(mzint_nodes)
decomp_mzs <- memDecompress(decoded_mzs, type = compression_type)
readBin(decomp_mzs, what = "double", n=length(decomp_mzs)/bin_precision,
size = bin_precision, endian = endi_enc)
yEP seems to be an encoding problem because "voltage" is an integer:
This one works just fine:
voltage_vals <- rep(35, 34)
voltage_enc <- giveEncoding(mzint_vals = voltage_vals, compression_type = "gzip", bin_precision = 8, endi_enc = "little")
getEncoded(mzint_nodes = voltage_enc, compression_type = "gzip", bin_precision = 8, endi_enc = "little")
This doesn't seem fixable by just changing the precision to 4 - we just get different, equally wrong numbers that way. Time to coerce the integer to numeric? while THIS one returns the problem we saw before:
voltage_vals <- rep(35L, 34)
voltage_enc <- giveEncoding(mzint_vals = voltage_vals, compression_type = "gzip", bin_precision = 8, endi_enc = "little")
getEncoded(mzint_nodes = voltage_enc, compression_type = "gzip", bin_precision = 8, endi_enc = "little")
Closed with #14
produces
but
produces
Pretty clearly an encoding error but I'm not sure where exactly it's coming from, will look into soon.