yge15 / TCGA_Microbial_Content

Comprehensive analysis of microbial content in whole-genome sequencing samples from The Cancer Genome Atlas project

TCGA Microbial Content

Comprehensive analysis of microbial content in 5,734 whole-genome sequencing samples from The Cancer Genome Atlas project

Supplementary Tables Captions

  1. Table S1. Raw counts of 5,734 TCGA WGS samples classified at the species level against the Kraken Microbial2023 database, excluding eukaryotes. The contents of Microbial2023 are described in the main text. This resulted in 11,349 species with non-zero counts.
  2. Table S2. Normalized counts of the values in Table S1, converted to counts per million reads sequenced.
  3. Table S3. Normalized counts of the values in Table S2, converted to reads per kilobase of genome per million reads sequenced.
  4. Table S4. Raw counts of 5,734 TCGA WGS samples classified at the species level against the Fungi_RefSeq database. The contents of Fungi_RefSeq include all 557 fungal species in RefSeq as of late 2023.
  5. Table S5. Normalized counts of the values in Table S4, converted to counts per million reads sequenced.
  6. Table S6. Normalized counts of the values in Table S4, converted to reads per kilobase of genome per million reads sequenced.
  7. Table S7. TCGA study abbreviations.
  8. Table S8. Raw counts of 4,550 TCGA WGS samples classified at the genus level against the Microbial2023 database, for samples used in both this study and in the Poore et al. study. This dataset contains counts for all 3,561 genera that had non-zero counts in either study.
  9. Table S9. Raw counts from the Poore et al. study at the genus level, for samples used in both this study and Poore et al. This set contains 4,550 samples and 3,561 genera. Raw counts included here were taken directly from Poore et al. (2020).
  10. Table S10. Raw counts of TCGA WGS samples classified at the species level against the Fungi_RefSeq database, for samples matched with the Narunsky-Haziza et al. study. This resulted in 4,271 samples and 557 species.
  11. Table S11. Raw counts from the Narunsky-Haziza et al. study at the species level, for samples matched with this study. This resulted in 4,271 samples and 557 species. Raw counts included here were taken directly from Narunsky-Haziza et al. (2022).
  12. Table S12. TCGA metadata for 5,734 samples, including unique IDs for this study, IDs used by Poore et al., IDs used by Narunsky-Haziza et al., and the original TCGA identifiers.
  13. Table S13. List of species in the Microbial2023 Kraken database, with RefSeq accessions.
  14. Table S14. List of species in the Fungi_RefSeq Kraken database, with RefSeq accessions.
  15. Table S15. RefSeq v200 to v220 name conversion list, for 14 of the 557 fungal species whose names were changed between the two releases of RefSeq.
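
The two normalizations named in the captions for Tables S2, S3, S5, and S6 can be illustrated with a minimal sketch. It assumes that "reads sequenced" means the total number of reads in a sample and that a genome length in base pairs is available for each species. The function names, species labels, and numbers below are hypothetical and are not taken from the study or its tables.

```python
def counts_per_million(raw_counts, total_reads):
    """Scale raw read counts to counts per million reads sequenced."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {
        species: count * 1_000_000 / total_reads
        for species, count in raw_counts.items()
    }


def reads_per_kb_per_million(raw_counts, total_reads, genome_lengths_bp):
    """Divide counts per million by the genome length in kilobases."""
    cpm = counts_per_million(raw_counts, total_reads)
    return {
        species: value / (genome_lengths_bp[species] / 1_000)
        for species, value in cpm.items()
    }


# Hypothetical sample: 10 million reads sequenced, two made-up species.
raw = {"species_a": 50, "species_b": 200}
lengths_bp = {"species_a": 2_000_000, "species_b": 5_000_000}

cpm = counts_per_million(raw, total_reads=10_000_000)
rpkm = reads_per_kb_per_million(raw, 10_000_000, lengths_bp)
print(cpm)
print(rpkm)
```

Dividing by genome length in addition to sequencing depth keeps a species with a large genome from looking more abundant merely because it offers more sequence for reads to match.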