waldronlab / curatedMetagenomicDataCuration

Sample Metadata Curation for curatedMetagenomicData
https://waldronlab.io/curatedMetagenomicDataCuration/

Assay Sometimes Has Zero Columns When counts = TRUE #77

Open schifferl opened 1 year ago

schifferl commented 1 year ago

Describe the bug

When data is requested with either the curatedMetagenomicData or the returnSamples function and the argument counts = TRUE, the assay of the returned object sometimes has zero columns.

To Reproduce

This bug can be reproduced with the MetaCardis_2020_a study as follows.

curatedMetagenomicData::curatedMetagenomicData(
    pattern = "MetaCardis_2020_a.relative_abundance",
    dryrun = FALSE,
    counts = TRUE,
    rownames = "short"
)

Expected behavior

The code above should return a TreeSummarizedExperiment with 697 rows and 1831 columns.

Additional context

This bug might apply to other relative_abundance studies, and the number of studies impacted should be verified (by downloading them with the counts = TRUE argument). This may be related to changes in TreeSummarizedExperiment or SummarizedExperiment. This issue closes waldronlab/curatedMetagenomicData#292.
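The verification described above could be sketched roughly as follows. The helper name find_zero_column is hypothetical and not part of either package. It assumes that the objects are held in a named list, which is what curatedMetagenomicData is understood to return when dryrun = FALSE. The download step is left commented out because it requires network access and may take a long time.

```r
# Hypothetical helper (not part of curatedMetagenomicData): return the names
# of list elements whose object has zero columns.
find_zero_column <- function(experiments) {
    # ncol() works on matrices and on SummarizedExperiment-like objects
    n_cols <- vapply(experiments, ncol, integer(1))
    names(n_cols)[n_cols == 0L]
}

# Assumed usage (requires network access; downloads every matching study):
# experiments <- curatedMetagenomicData::curatedMetagenomicData(
#     pattern = "relative_abundance",
#     dryrun = FALSE,
#     counts = TRUE,
#     rownames = "short"
# )
# find_zero_column(experiments)
```

Any study names returned by the helper would be candidates for the same curation problem.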

schifferl commented 10 months ago

Upon further investigation, the issue arises when number_reads is curated incorrectly. In the case of the MetaCardis_2020_a study, the values in the number_reads column all end in .0 and are therefore read into R as NA rather than as integers. Fixing this in curatedMetagenomicDataCuration will resolve the issue in curatedMetagenomicData.
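A minimal sketch of how such values might be detected and normalized during curation, using base R only. The example values are made up for illustration and are not taken from the MetaCardis_2020_a metadata. It assumes the column is available as character strings before any integer parsing happens.

```r
# Made-up number_reads values, written the way a curated file might store them
number_reads <- c("45603882.0", "38211455.0", "50119276")

# Flag entries that carry a trailing ".0" (or ".00", and so on)
has_decimal_suffix <- grepl("\\.0+$", number_reads)

# Strip the suffix so that each value is a plain integer string, then convert
fixed <- as.integer(sub("\\.0+$", "", number_reads))
```

Here has_decimal_suffix marks the rows that need curation, and fixed holds the corrected integer values.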