claraqin / neonMicrobe

Processing NEON soil microbe marker gene sequence data into ASV tables.
GNU Lesser General Public License v3.0

Pipeline testing on Windows #26

Open KaiZhuPhD opened 3 years ago

KaiZhuPhD commented 3 years ago
> devtools::session_info()
- Session info ----------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 4.0.3 (2020-10-10)
 os       Windows 10 x64              
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                        
 collate  English_United States.1252  
 ctype    English_United States.1252  
 tz       America/Los_Angeles         
 date     2020-11-04                  

- Packages --------------------------------------------------------------------------------------------------
 package     * version date       lib source        
 assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.0.3)
 backports     1.1.10  2020-09-15 [1] CRAN (R 4.0.3)
 BiocManager   1.30.10 2019-11-16 [1] CRAN (R 4.0.3)
 callr         3.5.1   2020-10-13 [1] CRAN (R 4.0.3)
 cli           2.1.0   2020-10-12 [1] CRAN (R 4.0.3)
 crayon        1.3.4   2017-09-16 [1] CRAN (R 4.0.3)
 desc          1.2.0   2018-05-01 [1] CRAN (R 4.0.3)
 devtools      2.3.2   2020-09-18 [1] CRAN (R 4.0.3)
 digest        0.6.27  2020-10-24 [1] CRAN (R 4.0.3)
 ellipsis      0.3.1   2020-05-15 [1] CRAN (R 4.0.3)
 fansi         0.4.1   2020-01-08 [1] CRAN (R 4.0.3)
 fs            1.5.0   2020-07-31 [1] CRAN (R 4.0.3)
 glue          1.4.2   2020-08-27 [1] CRAN (R 4.0.3)
 knitr         1.30    2020-09-22 [1] CRAN (R 4.0.3)
 magrittr      1.5     2014-11-22 [1] CRAN (R 4.0.3)
 memoise       1.1.0   2017-04-21 [1] CRAN (R 4.0.3)
 pkgbuild      1.1.0   2020-07-13 [1] CRAN (R 4.0.3)
 pkgload       1.1.0   2020-05-29 [1] CRAN (R 4.0.3)
 prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.0.3)
 processx      3.4.4   2020-09-03 [1] CRAN (R 4.0.3)
 ps            1.4.0   2020-10-07 [1] CRAN (R 4.0.3)
 R6            2.5.0   2020-10-28 [1] CRAN (R 4.0.3)
 remotes       2.2.0   2020-07-21 [1] CRAN (R 4.0.3)
 rlang         0.4.8   2020-10-08 [1] CRAN (R 4.0.3)
 rprojroot     1.3-2   2018-01-03 [1] CRAN (R 4.0.3)
 rstudioapi    0.11    2020-02-07 [1] CRAN (R 4.0.3)
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.0.3)
 testthat      3.0.0   2020-10-31 [1] CRAN (R 4.0.3)
 usethis       1.6.3   2020-09-17 [1] CRAN (R 4.0.3)
 withr         2.3.0   2020-09-22 [1] CRAN (R 4.0.3)
 xfun          0.19    2020-10-30 [1] CRAN (R 4.0.3)

[1] C:/Users/kai.zhu/Documents/R/win-library/4.0
[2] C:/Program Files/R/R-4.0.3/library
KaiZhuPhD commented 3 years ago

Error messages from running ./testing/download-neon-data-metadataworkaround.Rmd:

> source("./code/params.R")
Error in system2("which", args = "cutadapt", stdout = TRUE) : 
  '"which"' not found

> reorganized_files <- organizeRawSequenceData(fn, meta)
Error in organizeRawSequenceData(fn, meta) : 
  All sequencer run ID values are NA. Cannot proceed with reorganizing this download batch.
                              Try a different `fn` or `metadata`.

Also suggest better organized "data" folder.

KaiZhuPhD commented 3 years ago

Error messages from running ./testing/process-16s-sequences-to-seqtabs.Rmd

> source("./code/params.R")
Error in system2("which", args = "cutadapt", stdout = TRUE) : 
  '"which"' not found

The above error causes loading params.R to fail, so I can't proceed with the following steps.

claraqin commented 3 years ago

Hi Kai,

Thanks for posting about this. A number of people have had problems with calling the which command via system2, so I've removed it in the most recent commit. This does mean, though, that you will have to find the location of cutadapt on your own file system and enter it manually as the value of CUTADAPT_PATH in params.R.
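For readers hitting the same `'"which"' not found` error: base R's `Sys.which()` looks up executables on the PATH without shelling out to `which`, so it also works on Windows. The following is a minimal sketch of how a lookup with a manual fallback could work. The helper name `find_executable` and the fallback path are hypothetical and not part of this repo; only `CUTADAPT_PATH` and `params.R` come from the thread.

```r
# Hypothetical helper (not part of this repo): locate an executable with
# base R's Sys.which(), which does not depend on a `which` command.
find_executable <- function(name, fallback = NA_character_) {
  found <- unname(Sys.which(name))   # "" when nothing is found on the PATH
  if (nzchar(found)) {
    return(normalizePath(found, winslash = "/"))
  }
  # Fall back to a manually supplied location, if it exists.
  if (!is.na(fallback) && file.exists(path.expand(fallback))) {
    return(normalizePath(path.expand(fallback), winslash = "/"))
  }
  NA_character_
}

# Assumed usage in params.R; the fallback path is only an illustration.
CUTADAPT_PATH <- find_executable("cutadapt", fallback = "~/.local/bin/cutadapt")
if (is.na(CUTADAPT_PATH)) {
  message("cutadapt not found; set CUTADAPT_PATH manually in params.R")
}
```

If the lookup returns `NA`, the manual step described above still applies: find cutadapt on your file system and enter its full path yourself.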

claraqin commented 3 years ago

Regarding this error:

> reorganized_files <- organizeRawSequenceData(fn, meta)
Error in organizeRawSequenceData(fn, meta) : 
  All sequencer run ID values are NA. Cannot proceed with reorganizing this download batch.
                              Try a different `fn` or `metadata`.

Could you please print head(fn) and head(meta)?

Also, regarding this:

Also suggest better organized "data" folder.

Which data folder are you referring to? The data subdirectory shouldn't be relevant in the steps that are being tested.

KaiZhuPhD commented 3 years ago

Suggestions:

No other issues in ./testing/download-neon-data-metadataworkaround.Rmd.

KaiZhuPhD commented 3 years ago

In testing process-its-sequences-to-seqtabs.Rmd, I got the following error message:

trimPrimerITS: No files found at specified location(s) within C:/Users/kai.zhu/Documents/GitHub/NEON_soil_microbe_processing/NEON/raw_sequence/ITS/2_trimmed_mid. Check file path, or post_samplename_pattern argument(s).
trimPrimerITS: No files found at specified location(s) within C:/Users/kai.zhu/Documents/GitHub/NEON_soil_microbe_processing/NEON/raw_sequence/ITS/2_trimmed. Check file path, or post_samplename_pattern argument(s).
Error in rval[sapply(rval, is.character)] : invalid subscript type 'list'
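
The final part of this message, `invalid subscript type 'list'`, matches a common base R pitfall: `sapply()` returns an empty list rather than an empty logical vector when its input has length zero. The sketch below reproduces that behavior, assuming `rval` was an empty list because no files were found. That is an assumption; the thread does not show the code that defines `rval`.

```r
# Stand-in for an empty result; the real `rval` is not shown in the thread.
rval <- list()

# sapply() on a zero-length input returns list(), not logical(0).
idx <- sapply(rval, is.character)

# Subsetting with a list reproduces the error text from the message above.
err <- tryCatch(rval[idx], error = function(e) conditionMessage(e))

# vapply() always returns the declared type, so the subset is safe.
idx_safe <- vapply(rval, is.character, logical(1))
kept <- rval[idx_safe]   # an empty list, no error
```

Under that assumption, the subscript error is a downstream symptom of the "No files found" messages, not a separate problem.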

claraqin commented 3 years ago

Hi Kai,

  1. Did you manage to address this issue?
> reorganized_files <- organizeRawSequenceData(fn, meta)
Error in organizeRawSequenceData(fn, meta) : 
  All sequencer run ID values are NA. Cannot proceed with reorganizing this download batch.
                              Try a different `fn` or `metadata`.
  2. Regarding this issue:
trimPrimerITS: No files found at specified location(s) within C:/Users/kai.zhu/Documents/GitHub/NEON_soil_microbe_processing/NEON/raw_sequence/ITS/2_trimmed_mid. Check file path, or post_samplename_pattern argument(s).
trimPrimerITS: No files found at specified location(s) within C:/Users/kai.zhu/Documents/GitHub/NEON_soil_microbe_processing/NEON/raw_sequence/ITS/2_trimmed. Check file path, or post_samplename_pattern argument(s).
Error in rval[sapply(rval, is.character)] : invalid subscript type 'list'

The most likely cause of this, I think, is that your R session wasn't able to find cutadapt. Could you check the error logs to see whether this error was preceded by "error in running command" and "sh: ~/.local/bin/cutadapt: No such file or directory"? (If so, you will need to update CUTADAPT_PATH in params.R.)
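
One way to test this before rerunning the pipeline is to check the configured path directly from R. This is a minimal sketch: the helper name `check_tool_path` is hypothetical and not part of this repo, and it relies only on base R plus cutadapt's `--version` flag.

```r
# Hypothetical check (not part of this repo): does the configured path exist,
# and does the program at that path actually run?
check_tool_path <- function(path) {
  expanded <- path.expand(path)   # resolves a leading "~" to the home directory
  if (!file.exists(expanded)) {
    return(list(exists = FALSE, runs = FALSE, output = character(0)))
  }
  # Capture stdout and stderr; treat any R-level error as a failed run.
  out <- tryCatch(
    suppressWarnings(system2(expanded, "--version", stdout = TRUE, stderr = TRUE)),
    error = function(e) structure(character(0), status = 1L)
  )
  # system2() attaches a "status" attribute only on a non-zero exit code.
  runs <- is.null(attr(out, "status")) && length(out) > 0
  list(exists = TRUE, runs = runs, output = out)
}

# Example with the path from the error message quoted above.
result <- check_tool_path("~/.local/bin/cutadapt")
```

If `result$exists` is `FALSE`, the path in params.R points at nothing on this machine, which would be consistent with the "No such file or directory" message.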

claraqin commented 3 years ago

The main barrier to using Windows computers to run the processing pipeline was the lack of support for Cutadapt. This is only relevant for the ITS pipeline, and this may be addressed by the Docker container that @rbartelme will soon add to the repo.

I'm leaving this issue open in case any Windows users want to confirm that either (1) the 16S pipeline still works for them, or (2) the Docker container works for them.