VEuPathDB / microbiomeComputations


Correlation error incompatible number of rows #72

Closed asizemore closed 10 months ago

asizemore commented 11 months ago

Found in a few studies, most recently in correlation vs. metadata with:

[Screenshot: Screen Shot 2023-12-18 at 10 27 54 AM]

Error

Error in .local(data1, data2, method, verbose, ...) :
  data1 and data2 must have the same number of rows.

asizemore commented 11 months ago

Note: this is probably because the metadata variable is missing data for a sample.

d-callan commented 10 months ago

Wouldn't that produce an empty row rather than a missing one? I was going to guess that a sample failed the DADA2 workflow and so has no associated assay record.

asizemore commented 10 months ago

I thought we restricted to complete cases somewhere, so if one sample had NA for age then that sample gets kicked out and we'd get differing numbers of rows. I also thought that we already removed all samples without assay records because why have them for mbio, but maybe that didn't happen. Not sure

d-callan commented 10 months ago

The check for completeness is baked into the correlation calculation, I'm pretty sure. So a sample that doesn't have age but does have height will still be included in the height calculation, for example. The row remains and is contextually ignored.
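As a rough illustration of that per-pair handling, here is a minimal Python sketch (not the package's R code). The function name `pearson_complete` and the sample values are hypothetical, and it assumes missing values are skipped pair by pair while the rows themselves stay in the table:

```python
import math


def pearson_complete(xs, ys):
    """Pearson correlation over the rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: four samples, the third has no age recorded.
abundance = [0.10, 0.20, 0.30, 0.40]
age = [30, 41, None, 62]
height = [150, 160, 170, 180]

# Both metadata columns keep all four rows; the gap is skipped per pair,
# so the sample without age still contributes to the height correlation.
r_age = pearson_complete(abundance, age)
r_height = pearson_complete(abundance, height)
```

Under this assumption a missing value never changes the row count, so it would not by itself explain a "same number of rows" error.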

I'm pretty sure they removed the assay record and left the upstream records. But I'll check

asizemore commented 10 months ago

> I'm pretty sure they removed the assay record and left the upstream records. But I'll check

Ahhhh, that could be it. But that also confuses me, because AbundanceData has a check to ensure the sample ids from the metadata and the sample ids from the assay data are the same. Wouldn't we see an error from making the AbundanceData object if we were missing assay records?

d-callan commented 10 months ago

Hmmmm, yeah. Plus, we're asking for both the taxa and metadata as a single stream, which means we shouldn't get back samples without assays in the first place. This is becoming interesting.

d-callan commented 10 months ago

It's because there isn't actually any metadata. The AbundanceData object checks the ids, which exist, but the getSampleMetadata helper has an option to strip the ids. When it does this, it returns a completely empty data.table with 0 rows and 0 columns.

I think it'd make sense to add a warning to the getSampleMetadata helper about this case, but otherwise rely on graying out the app to handle it.
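The failure mode and the proposed warning can be sketched as follows. This is a hedged Python analogue, not the actual R helper: `get_sample_metadata`, `n_rows`, the `include_ids` flag, and the column-oriented dict standing in for a data.table are all assumptions made for illustration.

```python
import warnings


def n_rows(table):
    """Row count of a column-oriented table (dict of column name -> list)."""
    return len(next(iter(table.values()))) if table else 0


def get_sample_metadata(table, id_columns, include_ids=True):
    """Hypothetical stand-in for the getSampleMetadata helper."""
    if include_ids:
        return dict(table)
    stripped = {name: col for name, col in table.items() if name not in id_columns}
    if not stripped:
        # The warning suggested in this thread: ids exist, but nothing else does.
        warnings.warn("sample metadata has no columns besides the ids; "
                      "returning an empty table")
    return stripped


# Hypothetical study: the ids are present (so an id check passes),
# but there are no metadata variables at all.
metadata = {"sample_id": ["s1", "s2", "s3"]}
assay = {"sample_id": ["s1", "s2", "s3"], "taxon_a": [0.2, 0.5, 0.3]}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    stripped = get_sample_metadata(metadata, id_columns={"sample_id"},
                                   include_ids=False)

# 0 rows of metadata against 3 rows of assay data: the mismatch behind
# "data1 and data2 must have the same number of rows".
mismatch = n_rows(stripped) != n_rows(assay)
```

In this sketch the id check and the row-count check look at different tables, which is why the first can pass while the second fails.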

d-callan commented 10 months ago

see https://github.com/VEuPathDB/microbiomeComputations/pull/74