grunwaldlab / metacoder

Parsing, Manipulation, and Visualization of Metabarcoding/Taxonomic data
http://grunwaldlab.github.io/metacoder_documentation

Compare_groups with phyloseq derived file #247

Closed StefanPfeiffer80 closed 6 years ago

StefanPfeiffer80 commented 6 years ago

Hi Zach, I have started using metacoder and I think it is a very interesting tool. I am using it with a phyloseq object, and I have a question about compare_groups when the data come from phyloseq. First I parsed the phyloseq object, then printed the resulting taxmap object:

obj <- parse_phyloseq(physeq1, class_regex = "(.*)", class_key = "taxon_name")
Warning message:
There is no "taxon_id" column in the data set "3", so there are no taxon IDs. 
print(obj)
<Taxmap>
  840 taxa: aab. d_Bacteria ... bgi. g_Parasutterella
  840 edges: NA->aab, aab->aac, aab->aad ... aii->bgh, anc->bgi
  3 data sets:
    otu_table:
      # A tibble: 4,227 x 165
        taxon_id otu_id StefBSNS1 StefBSNS10 StefBSNS11 StefBSNS12
        <chr>    <chr>      <dbl>      <dbl>      <dbl>      <dbl>
      1 and      Otu9      0.0440     0.0994      0.374      0.551
      2 ane      Otu3      0.154      0.286      29.0        0.657
      3 anf      Otu7      0.115      0.0795      0.476      0    
      # ... with 4,224 more rows, and 159 more variables:
      #   StefBSNS13 <dbl>, StefBSNS14 <dbl>, StefBSNS15 <dbl>,
      #   StefBSNS16 <dbl>, StefBSNS17 <dbl>, StefBSNS18 <dbl>,
      #   StefBSNS19 <dbl>, StefBSNS2 <dbl>, StefBSNS20 <dbl>,
      #   StefBSNS21 <dbl>, ...
    tax_data:
      # A tibble: 4,227 x 7
        taxon_id Domain   Phylum    Class     Order    Family    Genus   
        <chr>    <chr>    <chr>     <chr>     <chr>    <chr>     <chr>   
      1 and      d_Bacte~ p_Firmic~ c_Negati~ o_Selen~ f_Veillo~ g_Veill~
      2 ane      d_Bacte~ p_Firmic~ c_Bacilli o_Lacto~ f_Strept~ g_Strep~
      3 anf      d_Bacte~ p_Bacter~ c_Bacter~ o_Bacte~ f_Prevot~ g_Prevo~
      # ... with 4,224 more rows
    sample_data:
      # A tibble: 163 x 46
        sample_id X.SampleID Cohort_Sample Type  ID_Proband ID_Intern
        <chr>     <chr>      <chr>         <chr> <chr>      <chr>    
      1 StefBSNS1 StefBSNS1  NSGK          Nasa~ GeKoFZBBO~ GK1      
      2 StefBSNS~ StefBSNS10 NSGK          Nasa~ GeKoFZBBO~ GK12     
      3 StefBSNS~ StefBSNS11 NSGK          Nasa~ GeKoFZBBO~ GK13     
      # ... with 160 more rows, and 40 more variables: Sex <chr>,
      #   SG <chr>, Smoking_habits <chr>, No_alcohol <chr>,
      #   allergy <chr>, no.asthma <chr>, child.asthma <chr>,
      #   neurodermatitis <chr>, diabetes <chr>,
      #   high.blood.pressure <chr>, ...
  0 functions:

I realize that otu_table here corresponds to tax_data in the tutorial, but that does not seem to be a problem. Making a heat_tree works.

But when I try to compare groups, I get the following error message:

physeq_mc$data$diff_table <- compare_treatments(physeq_mc,
                                          dataset = "tax_table",
                                          sample_ids = obj$sample_data$sample_id,
                                          treatments = obj$sample_data$Sex)
Error in compare_groups(obj, dataset = "otu_table", sample_ids = obj$sample_data$sample_id,  : 
  unused arguments (sample_ids = obj$sample_data$sample_id, treatments = obj$sample_data$Sex)

Can the comparison be done using the sample_data of the Taxmap object, or is a separate file needed?

Thanks for your help!

Cheers, Stefan

zachary-foster commented 6 years ago

Hello Stefan,

It looks like you might be using an older version of metacoder. Make sure the package is up to date and try:

physeq_mc$data$diff_table <- compare_groups(physeq_mc,
                                          data = "tax_table",
                                          cols = obj$sample_data$sample_id,
                                          groups = obj$sample_data$Sex)

The sample data table can be separate or included in the taxmap object. The only reason it is included in the taxmap object here is that it was included in the phyloseq object.

StefanPfeiffer80 commented 6 years ago

Thank you Zach for the quick reply. I have reinstalled and now have package v.0.2.1.9011; compare_groups is in it, so I guess it is right. However, I got another error:

obj$data$diff_table <- compare_groups(obj,
                                      data = "tax_table",
                                      cols = obj$sample_data$sample_id,
                                      groups = obj$sample_data$Sex)
No `cols` specified, so using all numeric columns:
   StefBSNS1, StefBSNS10, StefBSNS11 ... StefBS48, StefBS51, StefBS52
Error in utils::combn(unique(groups), 2) : n < m

I also went again through your tutorials and I wonder if the problem is in my phyloseq derived Taxmap object (see below). What do you think?

> obj
<Taxmap>
  840 taxa: aab. d_Bacteria ... bgi. g_Parasutterella
  840 edges: NA->aab, aab->aac, aab->aad ... aii->bgh, anc->bgi
  5 data sets:
    otu_table:
      # A tibble: 4,227 x 164
        taxon_id StefBSNS1 StefBSNS10 StefBSNS11 StefBSNS12 StefBSNS13
        <chr>        <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
      1 and       0.000254   0.000507    0.00558    0.00583    0.00178
      2 ane       0.00152    0.00228     0.292      0.00685    0.0101 
      3 anf       0.00178    0.000761    0.00381    0          0.00533
      # ... with 4,224 more rows, and 158 more variables:
      #   StefBSNS14 <dbl>, StefBSNS15 <dbl>, StefBSNS16 <dbl>,
      #   StefBSNS17 <dbl>, StefBSNS18 <dbl>, StefBSNS19 <dbl>,
      #   StefBSNS2 <dbl>, StefBSNS20 <dbl>, StefBSNS21 <dbl>,
      #   StefBSNS22 <dbl>, ...
    tax_data:
      # A tibble: 4,227 x 7
        taxon_id Domain   Phylum    Class     Order    Family    Genus   
        <chr>    <chr>    <chr>     <chr>     <chr>    <chr>     <chr>   
      1 and      d_Bacte~ p_Firmic~ c_Negati~ o_Selen~ f_Veillo~ g_Veill~
      2 ane      d_Bacte~ p_Firmic~ c_Bacilli o_Lacto~ f_Strept~ g_Strep~
      3 anf      d_Bacte~ p_Bacter~ c_Bacter~ o_Bacte~ f_Prevot~ g_Prevo~
      # ... with 4,224 more rows
    sample_data:
      # A tibble: 163 x 46
        sample_id X.SampleID Cohort_Sample Type  ID_Proband ID_Intern
        <chr>     <chr>      <chr>         <chr> <chr>      <chr>    
      1 StefBSNS1 StefBSNS1  NSGK          Nasa~ GeKoFZBBO~ GK1      
      2 StefBSNS~ StefBSNS10 NSGK          Nasa~ GeKoFZBBO~ GK12     
      3 StefBSNS~ StefBSNS11 NSGK          Nasa~ GeKoFZBBO~ GK13     
      # ... with 160 more rows, and 40 more variables: Sex <chr>,
      #   SG <chr>, Smoking_habits <chr>, No_alcohol <chr>,
      #   allergy <chr>, no.asthma <chr>, child.asthma <chr>,
      #   neurodermatitis <chr>, diabetes <chr>,
      #   high.blood.pressure <chr>, ...
    tax_table:
      # A tibble: 840 x 164
        taxon_id StefBSNS1 StefBSNS10 StefBSNS11 StefBSNS12 StefBSNS13
      * <chr>        <dbl>      <dbl>      <dbl>      <dbl>      <dbl>
      1 aab        1          1           1         1           1     
      2 aac        0.509      0.777       0.646     0.337       0.136 
      3 aad        0.00431    0.00863     0.0315    0.00228     0.0261
      # ... with 837 more rows, and 158 more variables: StefBSNS14 <dbl>,
      #   StefBSNS15 <dbl>, StefBSNS16 <dbl>, StefBSNS17 <dbl>,
      #   StefBSNS18 <dbl>, StefBSNS19 <dbl>, StefBSNS2 <dbl>,
      #   StefBSNS20 <dbl>, StefBSNS21 <dbl>, StefBSNS22 <dbl>, ...

zachary-foster commented 6 years ago

Hi Stefan,

I think I have seen this recently. Do obj$sample_data$sample_id and obj$sample_data$Sex exist in your case, or are they NULL?

StefanPfeiffer80 commented 6 years ago

Thanks Zach, they were in fact NULL. Typing obj$data$sample_data$sample_id instead made it work.

zachary-foster commented 6 years ago

Great! Let me know if you have other questions.
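
For anyone hitting the same `n < m` error, the cause found in this thread can be reproduced in base R without metacoder. The sketch below is only an illustration, not metacoder's own code: `toy` is a hypothetical plain list standing in for the taxmap object, with its sample table nested under `$data` as in the thread. Reaching for the table one level too high returns NULL without any complaint, and `utils::combn()` on the unique values of a NULL vector then stops with the message seen in the error above.

```r
# Hypothetical stand-in for the taxmap object: a plain list whose tables
# live under $data, mirroring obj$data$sample_data in the thread.
toy <- list(
  data = list(
    sample_data = data.frame(
      sample_id = c("S1", "S2", "S3", "S4"),
      Sex = c("f", "m", "f", "m"),
      stringsAsFactors = FALSE
    )
  )
)

# Wrong path: there is no top-level element called sample_data, so `$`
# quietly returns NULL instead of raising an error.
wrong_groups <- toy$sample_data$Sex
print(is.null(wrong_groups))

# Building all pairs of group labels from a NULL vector fails in combn().
err <- tryCatch(
  utils::combn(unique(wrong_groups), 2),
  error = function(e) conditionMessage(e)
)
print(err)

# Right path: go through $data to reach the sample table.
right_groups <- toy$data$sample_data$Sex
pairs <- utils::combn(unique(right_groups), 2)
print(pairs)
```

The message "No `cols` specified, so using all numeric columns" in the output above is an early hint of the same problem, since `cols` was NULL as well. A check such as `stopifnot(!is.null(groups))` before calling compare_groups would surface the mistake sooner.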