ChiLiubio / file2meco

Transform files to the microtable object in microeco package
GNU General Public License v3.0

qiime2meco with data.frame doesn't work #24

Open guidohooiveld opened 2 weeks ago

guidohooiveld commented 2 weeks ago

Hi, I am exploring your powerful packages and found an error in the function qiime2meco. When a data.frame is passed to the sample_table argument, an error is reported. See below. I would like to use a data.frame rather than pointing to a *.tsv file directly because I am manipulating (cleaning up) the sample_table a bit before use.

Could you please have a look at this? Thanks! G

Reproducible example:

> library(microeco)
> library(file2meco)
> 
> abund_file_path <- system.file("extdata", "dada2_table.qza", package="file2meco")
> sample_file_path <- system.file("extdata", "sample-metadata.tsv", package="file2meco")
> taxonomy_file_path <- system.file("extdata", "taxonomy.qza", package="file2meco")
> 
> 
> ## works!
> results1 <- qiime2meco(abund_file_path,
+                        sample_table = sample_file_path, 
+                        taxonomy_table = taxonomy_file_path)
> 
> 
> ## using a data.frame for sample_table
> samples.df <- read.table(file = sample_file_path, sep = '\t', header = TRUE)
> 
> ## check
> class(samples.df)
[1] "data.frame"
> 
> str(samples.df)
'data.frame':   48 obs. of  9 variables:
 $ sample_name              : chr  "recip.220.WT.OB1.D7" "recip.290.ASO.OB2.D1" "recip.389.WT.HC2.D21" "recip.391.ASO.PD2.D14" ...
 $ barcode                  : chr  "CCTCCGTCATGG" "AACAGTAAACAA" "ATGTATCAATTA" "GTCAGTATGGCT" ...
 $ mouse_id                 : int  457 456 435 435 437 435 437 437 456 456 ...
 $ genotype                 : chr  "wild type" "susceptible" "susceptible" "susceptible" ...
 $ cage_id                  : chr  "C35" "C35" "C31" "C31" ...
 $ donor                    : chr  "hc_1" "hc_1" "hc_1" "hc_1" ...
 $ donor_status             : chr  "Healthy" "Healthy" "Healthy" "Healthy" ...
 $ days_post_transplant     : int  49 49 21 14 21 7 14 7 21 14 ...
 $ genotype_and_donor_status: chr  "wild type and Healthy" "susceptible and Healthy" "susceptible and Healthy" "susceptible and Healthy" ...
> 
> 
> ## doesn't work!
> results2 <- qiime2meco(abund_file_path,
+                        sample_table = samples.df, 
+                        taxonomy_table = taxonomy_file_path)
Error in readLines(file) : 'con' is not a connection
> 
> traceback()
7: readLines(file)
6: is.factor(x)
5: grepl("^#q2:types", readLines(file)[2])
4: withCallingHandlers(expr, warning = function(w) if (inherits(w, 
       classes)) tryInvokeRestart("muffleWarning"))
3: suppressWarnings(if (grepl("^#q2:types", readLines(file)[2])) {
       return(TRUE)
   } else {
       return(FALSE)
   })
2: is_q2metadata(sample_table)
1: qiime2meco(abund_file_path, sample_table = samples.df, taxonomy_table = taxonomy_file_path)
> 
> 
> 
> packageVersion("file2meco")
[1] ‘0.9.1’
> packageVersion("microeco")
[1] ‘1.10.1’
> 
> sessionInfo()
R version 4.4.2 (2024-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19042)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: Europe/Amsterdam
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] file2meco_0.9.1 microeco_1.10.1

loaded via a namespace (and not attached):
 [1] Matrix_1.7-1        gtable_0.3.6        vegan_2.6-8        
 [4] dplyr_1.1.4         compiler_4.4.2      tidyselect_1.2.1   
 [7] Rcpp_1.0.13-1       stringr_1.5.1       rhdf5filters_1.18.0
[10] parallel_4.4.2      tidyr_1.3.1         cluster_2.1.6      
[13] splines_4.4.2       scales_1.3.0        yaml_2.3.10        
[16] lattice_0.22-6      plyr_1.8.9          ggplot2_3.5.1      
[19] R6_2.5.1            generics_0.1.3      igraph_2.1.1       
[22] MASS_7.3-61         tibble_3.2.1        munsell_0.5.1      
[25] pillar_1.9.0        RColorBrewer_1.1-3  rlang_1.1.4        
[28] utf8_1.2.4          stringi_1.8.4       cli_3.6.3          
[31] withr_3.0.2         Rhdf5lib_1.28.0     magrittr_2.0.3     
[34] mgcv_1.9-1          digest_0.6.37       grid_4.4.2         
[37] permute_0.9-7       rhdf5_2.50.0        lifecycle_1.0.4    
[40] nlme_3.1-166        vctrs_0.6.5         glue_1.8.0         
[43] data.table_1.16.2   ape_5.8             fansi_1.0.6        
[46] colorspace_2.1-1    purrr_1.0.2         reshape2_1.4.4     
[49] tools_4.4.2         pkgconfig_2.0.3    
> 
> 
> 
ChiLiubio commented 2 weeks ago

Hi @guidohooiveld, I checked this carefully and it is a bug. The data.frame reading method was taken from another function, humann2meco, but qiime2meco is structured a little differently, and I forgot to cover this case in my test code. I have fixed it on GitHub. Please reinstall file2meco v0.9.1 from GitHub with the following code:

devtools::install_github("ChiLiubio/file2meco")

Thank you very much for finding this!

Best, Chi

ChiLiubio commented 1 week ago

Hi. file2meco v0.9.1 is now also available on CRAN.
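
For readers hitting a similar error: the traceback in the report shows a data.frame reaching a readLines() call that expects a file path. The sketch below illustrates the general guard against that, using base R only. It is not the actual fix committed to file2meco; the helper name read_sample_input is hypothetical and exists only for this illustration.

```r
# Hypothetical helper (NOT part of file2meco): accept either an in-memory
# data.frame or a path to a tab-separated file, and always return a data.frame.
read_sample_input <- function(sample_table) {
  if (is.data.frame(sample_table)) {
    # Already parsed: return it unchanged and never pass it to readLines()
    # or any other file-reading function.
    return(sample_table)
  }
  # Otherwise the input must be a single path to an existing file.
  stopifnot(
    is.character(sample_table),
    length(sample_table) == 1,
    file.exists(sample_table)
  )
  read.delim(sample_table, header = TRUE, stringsAsFactors = FALSE)
}

# Usage: the same helper handles both kinds of input.
df <- data.frame(
  sample_name = c("s1", "s2"),
  genotype = c("wild type", "susceptible")
)
tmp <- tempfile(fileext = ".tsv")
write.table(df, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

from_df <- read_sample_input(df)    # data.frame branch
from_path <- read_sample_input(tmp) # file path branch
```

The point of the design is that the type check comes before any file-based inspection, so path-only logic such as checking the second line of a file for a "#q2:types" row only runs when the input really is a path.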