mschubert / clustermq

R package to send function calls as jobs on LSF, SGE, Slurm, PBS/Torque, or each via SSH
https://mschubert.github.io/clustermq/
Apache License 2.0

using backticks in a function causes an error #202

Status: closed by nick-youngblut 4 years ago

nick-youngblut commented 4 years ago

I'm running clustermq on a function that includes:

rename('P' = `Pr(>F)`)

...which generates the error:

(Error #1) unused argument (P = `Pr(>F)`)

If I remove the rename line from the function, the function then works correctly.

SessionInfo:

R version 4.0.0 (2020-04-24)
Platform: x86_64-conda_cos6-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS/LAPACK: /ebio/abt3_projects/Georg_animal_feces/envs/phyloseq-physig/lib/libopenblasp-r0.3.9.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] uuid_0.1-4        clustermq_0.8.9   phyloseq_1.32.0   LeyLabRMisc_0.1.6
 [5] doParallel_1.0.15 iterators_1.0.12  foreach_1.5.0     RRPP_0.6.0       
 [9] ape_5.4           ggplot2_3.3.1     tidyr_1.1.0       dplyr_1.0.0      

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6         lattice_0.20-41      prettyunits_1.1.1   
 [4] Biostrings_2.56.0    digest_0.6.25        RhpcBLASctl_0.20-137
 [7] IRdisplay_0.7.0      R6_2.4.1             plyr_1.8.6          
[10] repr_1.1.0           stats4_4.0.0         evaluate_0.14       
[13] pillar_1.4.4         progress_1.2.2       zlibbioc_1.34.0     
[16] rlang_0.4.6          data.table_1.12.8    vegan_2.5-6         
[19] S4Vectors_0.26.0     Matrix_1.2-18        splines_4.0.0       
[22] stringr_1.4.0        igraph_1.2.5         munsell_0.5.0       
[25] compiler_4.0.0       pkgconfig_2.0.3      BiocGenerics_0.34.0 
[28] base64enc_0.1-3      multtest_2.44.0      rzmq_0.9.7          
[31] mgcv_1.8-31          htmltools_0.4.0      biomformat_1.16.0   
[34] tidyselect_1.1.0     tibble_3.0.1         IRanges_2.22.1      
[37] codetools_0.2-16     permute_0.9-5        crayon_1.3.4        
[40] withr_2.2.0          MASS_7.3-51.6        grid_4.0.0          
[43] nlme_3.1-148         jsonlite_1.6.1       gtable_0.3.0        
[46] lifecycle_0.2.0      magrittr_1.5         scales_1.1.1        
[49] stringi_1.4.6        XVector_0.28.0       reshape2_1.4.4      
[52] ellipsis_0.3.1       generics_0.0.2       vctrs_0.3.1         
[55] IRkernel_1.1         Rhdf5lib_1.10.0      tools_4.0.0         
[58] ade4_1.7-15          Biobase_2.48.0       glue_1.4.1          
[61] purrr_0.3.4          hms_0.5.3            survival_3.1-12     
[64] colorspace_1.4-1     rhdf5_2.32.0         cluster_2.1.0       
[67] pbdZMQ_0.3-3   

mschubert commented 4 years ago

Can you please provide a minimal reproducible example?

For instance, the following works for me:

fx = function(x) dplyr::rename(x, 'P' = `Pr(>F)`)
clustermq::Q(fx, x=list(tibble::tibble(`Pr(>F)` = 5)), n_jobs=1)
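
One possible explanation, offered as a guess rather than a diagnosis: both plyr and dplyr export a `rename`, and `plyr::rename(x, replace, ...)` takes its replacements as a single named vector with no `...` argument. A function of that shape rejects a dplyr-style `'P' = ...` argument with this kind of message. The sketch below uses a hypothetical base-R stand-in, `rename_no_dots`, so that it runs without either package; it is not code from plyr, dplyr, or clustermq.

```r
# Hypothetical stand-in (not from plyr or dplyr): like plyr::rename(), it takes
# the replacements as one named vector and has no `...` argument.
rename_no_dots <- function(x, replace) {
  names(x)[match(names(replace), names(x))] <- unname(replace)
  x
}

df <- data.frame(check.names = FALSE, `Pr(>F)` = 0.05)

# dplyr-style call: `P` matches no formal argument, so R stops during
# argument matching, before `Pr(>F)` is ever looked up.
msg <- tryCatch(
  rename_no_dots(df, 'P' = `Pr(>F)`),
  error = function(e) conditionMessage(e)
)
print(msg)

# Call style this stand-in expects: old name = new name.
renamed <- rename_no_dots(df, c(`Pr(>F)` = "P"))
print(names(renamed))
```

If this is what happened, calling `dplyr::rename()` with the namespace prefix, as in the example above, avoids depending on which package was attached last.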

nick-youngblut commented 4 years ago

I can't seem to reproduce the issue, mainly because I've been changing the code I was using, and now I can't get the function upstream of the rename to work. I'm trying to run RRPP::lm.rrpp, which works just fine with lapply or plyr::llply (with doParallel), but when I run it with clustermq, I always get:

(Error #1) NA/NaN argument
(#1) no non-missing arguments to max; returning -Inf

I've tried so many things to try to get it working, but nothing helps.

The reported error gives little detail, so it's hard to determine what the problem is. The entire error:

Error in summarize_result(job_result, n_errors, n_warnings, cond_msgs, : 3/3 jobs failed (3 warnings). Stopping.
(Error #1) NA/NaN argument
(#1) no non-missing arguments to max; returning -Inf
(Error #2) NA/NaN argument
(#2) no non-missing arguments to max; returning -Inf
(Error #3) NA/NaN argument
(#3) no non-missing arguments to max; returning -Inf
Traceback:

1. Q(.rrpp_diet, tree = host_tree_l %>% head, const = list(otu = otu, 
 .     taxon = taxa[1], iter = iter), n_jobs = 50, job_size = 1, 
 .     pkgs = c("plyr", "dplyr", "tidyr", "doParallel", "ape", "RRPP"), 
 .     template = tmpl)
2. Q_rows(fun = fun, df = df, const = const, export = export, pkgs = pkgs, 
 .     seed = seed, memory = memory, template = template, n_jobs = n_jobs, 
 .     job_size = job_size, rettype = rettype, fail_on_error = fail_on_error, 
 .     workers = workers, log_worker = log_worker, chunk_size = chunk_size, 
 .     timeout = timeout, max_calls_worker = max_calls_worker, verbose = verbose)
3. master(qsys = workers, iter = df, rettype = rettype, fail_on_error = fail_on_error, 
 .     chunk_size = chunk_size, timeout = timeout, max_calls_worker = max_calls_worker, 
 .     verbose = verbose)
4. summarize_result(job_result, n_errors, n_warnings, cond_msgs, 
 .     min(submit_index) - 1, fail_on_error)
5. stop(msg, ". Stopping.\n", detail)

nick-youngblut commented 4 years ago

What I don't understand is why I don't get a more descriptive error in the qsub job log file. An example of one of the jobs:

R version 4.0.0 (2020-04-24) -- "Arbor Day"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-conda_cos6-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> clustermq:::worker("tcp://rick:7060")
2020-06-22 19:47:42.428020 | Master: tcp://rick:7060
2020-06-22 19:47:42.452442 | WORKER_UP to: tcp://rick:7060
2020-06-22 19:47:42.479853 | > DO_SETUP (0.019s wait)
2020-06-22 19:47:42.480455 | token from msg: babod

Attaching package: ‘dplyr’

The following objects are masked from ‘package:plyr’:

    arrange, count, desc, failwith, id, mutate, rename, summarise,
    summarize

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Loading required package: foreach
Loading required package: iterators
Loading required package: parallel
2020-06-22 19:47:43.980770 | > DO_CHUNK (0.001s wait)
2020-06-22 19:47:44.187067 | completed 1 in 0.20s [user], 0.00s [system], 0.21s [elapsed]
2020-06-22 19:47:44.197363 | > WORKER_STOP (0.000s wait)
2020-06-22 19:47:44.197916 | shutting down worker
2020-06-22 19:47:44.277696 |
Total: 1 in 1.30s [user], 0.08s [system], 1.75s [elapsed]
>
>

My template:

#!/bin/bash
#$ -N {{ job_name }}                    # job name
#$ -pe parallel {{ cores | 1 }}         # job threads
#$ -l h_rt={{ job_time | 00:59:00 }}    # job time
#$ -l h_vmem={{ job_mem | 7G }}         # job memory
#$ -t 1-{{ n_jobs }}                    # submit jobs as array
#$ -j y                                 # combine stdout/error in one file
#$ -o {{ log_file | /dev/null }}        # output log file
#$ -cwd                                 # use pwd as work dir
#$ -V                                   # use environment variables

. ~/.bashrc
conda activate {{ conda | py3 }}

export OMP_NUM_THREADS={{ omp.threads | 1 }}
export OPENBLAS_NUM_THREADS={{ blas.threads | 1 }}
export MKL_NUM_THREADS={{ mkl.threads | 1 }}

#ulimit -v $(( 1024 * {{ memory | 4096 }} ))
CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker("{{ master }}")'

mschubert commented 4 years ago

I would suggest you:

"What I don't understand is why I don't get a more descriptive error in the qsub job log file. An example of one of the jobs"

Worker logs only keep track of processing. They are not concerned with errors your R code produces; rather, an error in an evaluation is a result that is reported back. This is why you see (Error #1) NA/NaN argument. Unfortunately, you cannot get a traceback for it.
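
Since only the error message comes back, one workaround is to attach the call stack to the message inside the function itself. The following is a minimal sketch in base R, not a clustermq feature: `risky_step` is a hypothetical stand-in for the failing code (for example, the model-fitting call), and `fx` is the function that would be handed to the workers.

```r
# Hypothetical stand-in for the code that fails on the worker.
risky_step <- function(x) stop("NA/NaN argument")

fx <- function(x) {
  withCallingHandlers(
    risky_step(x),
    error = function(e) {
      # The handler runs before the stack unwinds, so sys.calls() still
      # contains the frames that led to the error.
      stack <- vapply(
        as.list(sys.calls()),
        function(cl) deparse(cl, nlines = 1L)[1],
        character(1)
      )
      # Re-signal the error with the call stack appended to its message.
      stop(
        paste0(conditionMessage(e), "\nCall stack:\n",
               paste(stack, collapse = "\n")),
        call. = FALSE
      )
    }
  )
}

# Local check: the message now names the call that failed.
res <- tryCatch(fx(1), error = function(e) conditionMessage(e))
cat(res, "\n")
```

The handler is written inline in `fx` instead of in a wrapper closure, so the function does not rely on an enclosing environment being available on the worker. In real use, `risky_step` would also have to exist there (for instance, as a function from one of the packages passed via `pkgs`).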

nick-youngblut commented 4 years ago

Thanks for the clarification! I understand the need for a reprex. I'm just having a hard time creating one. It seems somehow specific to my dataset. I'll keep looking.

mschubert commented 4 years ago

Ok! Please reopen if you find one.