VEuPathDB / EdaNewIssues


Map: Discuss how to handle Pathogen presence/ absence in megastudy #623

Closed d-callan closed 1 year ago

d-callan commented 1 year ago

I've had a look through most of the assay data in the megastudy at this point, to make sure that my plan for weighted averages, etc., to accommodate differing sample sizes will work as intended. For the most part I think it will, with the exception of Pathogen presence/ absence. For a pooled sample, the most this variable can tell us, if positive, is that at least one specimen in the pool was positive. The values are consequently not comparable between individual specimens and pooled samples.
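To make the comparability problem concrete, here is a minimal sketch of how pool size alone changes the chance of a positive result. It assumes independent specimens, a single per-specimen prevalence, and a perfect assay; none of these assumptions come from the megastudy data, and the function name is hypothetical.

```python
def pool_positive_probability(prevalence: float, pool_size: int) -> float:
    """Probability that a pool tests positive.

    Assumptions (illustrative only): each specimen is independently
    positive with probability `prevalence`, and the assay is perfect,
    so a pool is positive exactly when at least one specimen is.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    # P(at least one positive) = 1 - P(all specimens negative)
    return 1.0 - (1.0 - prevalence) ** pool_size


if __name__ == "__main__":
    # Same underlying prevalence, very different positive rates by pool size.
    for size in (1, 5, 20, 50):
        print(size, round(pool_positive_probability(0.05, size), 3))
```

Under these assumptions, a 5% per-specimen prevalence gives a pool of 20 roughly a 64% chance of testing positive, so a weighted average of positive/negative values across pools of different sizes would not estimate prevalence.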

Do we want to exclude this term from visualizations, but allow other pathogen assay variables? If the majority of pathogen data is of this type, that could be less than ideal. But we could also build some dedicated tools to handle this data in the future, using this R package.
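As a rough illustration of what a dedicated tool might compute, the sketch below estimates per-specimen prevalence from pooled presence/absence results by maximum likelihood. It is not the method of the R package mentioned above. It assumes independent specimens, one shared prevalence, and a perfect assay, and all names are hypothetical.

```python
import math


def pooled_log_likelihood(p, pools):
    """Log-likelihood of prevalence `p` given (pool_size, is_positive) pairs."""
    q = 1.0 - p  # probability a single specimen is negative
    total = 0.0
    for size, positive in pools:
        if positive:
            total += math.log(1.0 - q ** size)  # at least one positive
        else:
            total += size * math.log(q)  # every specimen negative
    return total


def estimate_prevalence(pools, tol=1e-9):
    """Maximum-likelihood prevalence via ternary search.

    The log-likelihood is unimodal in p under the stated assumptions,
    so a ternary search over (0, 1) converges to the maximum.
    """
    if not pools:
        raise ValueError("need at least one pool")
    if all(positive for _, positive in pools):
        return 1.0  # likelihood is maximized at the boundary
    if not any(positive for _, positive in pools):
        return 0.0
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if pooled_log_likelihood(m1, pools) < pooled_log_likelihood(m2, pools):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical data: mixed pool sizes, including individual specimens.
    pools = [(1, False)] * 8 + [(1, True)] + [(10, True)] * 2 + [(10, False)] * 3
    print(estimate_prevalence(pools))
```

Because pool size enters the likelihood directly, individual specimens (pool size 1) and pooled samples can contribute to the same estimate, which a plain weighted average of the positive/negative values cannot do.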

d-callan commented 1 year ago

Decisions here https://github.com/VEuPathDB/EdaLoadingIssues/issues/67