hbc / bcbioRNASeq

R package for bcbio RNA-seq analysis.
https://bioinformatics.sph.harvard.edu/bcbioRNASeq
GNU Affero General Public License v3.0

What does the ```|>``` symbol mean in the new ```Functional Annotation``` rmarkdown template? #176

Closed by jxshi 3 years ago

jxshi commented 3 years ago

Hi,

I recently upgraded bcbioRNASeq to the latest version and noticed significant changes in the rmarkdown template for Functional Annotation. What does the |> symbol mean in the following code? I searched for the symbol but could not find an explanation.

kegg_code <-
    organism |>
    search_kegg_organism("scientific_name") |>
    getElement("kegg_code")
assert(
    hasLength(kegg_code, n = 1L),
    isMatchingRegex(pattern = "^[a-z]{3}$", x = kegg_code),
    isSubset(params$go_class, c("BP", "CC", "MF"))
)

I rendered the new rmarkdown template, but it failed with the following error:

Attaching package: 'AnnotationDbi'

The following object is masked from 'package:clusterProfiler':

    select

  |................                                                      |  23%
  ordinary text without R code

  |..................                                                    |  26%
label: check-db-codes
Quitting from lines 144-153 (mpncomb_PMF_vs_control_FuncAnno_BP.Rmd)
Error in parse(text = x, srcfile = src) : <text>:2:15: unexpected '>'
1: kegg_code <-
2:     organism |>
                 ^
Calls: <Anonymous> ... <Anonymous> -> parse_all -> parse_all.character -> parse

Execution halted

Could you take a look, please? Thank you for your time!

Best, Jianxiang

mjsteinbaugh commented 3 years ago

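Hi @jxshi, sure, I'll take a look. The |> symbol is the new native pipe in R, introduced in R 4.1.0, and it works much like the magrittr pipe %>%. Older versions of R cannot parse it, which is what the unexpected '>' error above is reporting. Try it out!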
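For anyone else landing here, a minimal sketch of what the native pipe does, using base R only. The `lookup` list is a hypothetical stand-in for the value that `search_kegg_organism()` returns in the template; the version check at the top is an illustrative guard, not something the template itself is known to include.

```r
# The native pipe needs R >= 4.1.0; older parsers stop at the '>' character.
stopifnot(getRversion() >= "4.1.0")

# Hypothetical stand-in for the lookup result in the template: a named list
# with a "kegg_code" element (the real code calls search_kegg_organism()).
lookup <- list(scientific_name = "Homo sapiens", kegg_code = "hsa")

# x |> f(y) is parsed as f(x, y), so these two forms are equivalent.
piped <- lookup |> getElement("kegg_code")
nested <- getElement(lookup, "kegg_code")

# Pipes chain left to right, replacing inside-out nesting.
chained <- c(1, 4, 9) |> sqrt() |> sum()

stopifnot(identical(piped, nested), chained == 6)
```

By the same rule, the template's `organism |> search_kegg_organism("scientific_name") |> getElement("kegg_code")` is read as `getElement(search_kegg_organism(organism, "scientific_name"), "kegg_code")`.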

jxshi commented 3 years ago

Hi @mjsteinbaugh ,

Thank you for clarifying. I updated R to 4.1.1 and the |> symbol was interpreted correctly. However, I was still unable to knit the rmarkdown file. It failed with another error:

label: get-deseqresults
→ phenotype_PMF_vs_control (shrunken LFC)
  |.........................                                             |  36%
  ordinary text without R code

  |...........................                                           |  38%
label: results
→ Dropping genes without an adjusted P value.
ℹ 113 differentially expressed genes (alpha < 0.05)
Quitting from lines 179-198 (mpncomb_PMF_vs_control_FuncAnno_BP.Rmd)
Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'x' in selecting a method for function 'rownames': object 'sig_res_df' not found
Calls: <Anonymous> ... eval -> eval -> rownames -> .handleSimpleError -> h

Execution halted

I checked the rmarkdown template and changed sig_res_df to sig_res, which got past that chunk. However, another error then appeared.

label: ranked-list
! Dropping 32827 genes without an Entrez identifier.
→ Averaging 'stat' value for 294 genes: 27328, 2953, 7554, 51622, 55871, 55747, 79937, 283971, 8294, 11013, 801, 25832, 90557, 3020, 653.....
→ Averaging 'stat' value for 296 genes: 27328, 2953, 7554, 51622, 55871, 55747, 79937, 283971, 8294, 11013, 801, 25832, 90557, 3020, 653.....
→ Averaging 'stat' value for 296 genes: 27328, 2953, 7554, 51622, 55871, 55747, 79937, 283971, 8294, 11013, 801, 25832, 90557, 3020, 653.....
→ Averaging 'stat' value for 296 genes: 27328, 2953, 7554, 51622, 55871, 55747, 79937, 283971, 8294, 11013, 801, 25832, 90557, 3020, 653.....
→ Averaging 'stat' value for 296 genes: 27328, 2953, 7554, 51622, 55871, 55747, 79937, 283971, 8294, 11013, 801, 25832, 90557, 3020, 653.....
→ Averaging 'stat' value for 296 genes: 27328, 2953, 7554, 51622, 55871, 55747, 79937, 283971, 8294, 11013, 801, 25832, 90557, 3020, 653.....
→ Averaging 'stat' value for 294 genes: 27328, 2953, 7554, 51622, 55871, 55747, 79937, 283971, 8294, 11013, 801, 25832, 90557, 3020, 653.....
→ Averaging 'stat' value for 297 genes: 27328, 2953, 7554, 51622, 55871, 55747, 79937, 283971, 8294, 11013, 801, 25832, 90557, 3020, 653.....
→ Averaging 'stat' value for 296 genes: 27328, 2953, 7554, 51622, 55871, 55747, 79937, 283971, 8294, 11013, 801, 25832, 90557, 3020, 653.....
→ Averaging 'stat' value for 297 genes: 27328, 2953, 7554, 51622, 55871, 55747, 79937, 283971, 8294, 11013, 801, 25832, 90557, 3020, 653.....
  |...........................................                           |  62%
  ordinary text without R code

  |.............................................                         |  64%
label: gse-go
preparing geneSet collections...
GSEA analysis...
Quitting from lines 315-350 (mpncomb_PMF_vs_control_FuncAnno_BP.Rmd) 
Error in mcfork(detached) : 
  unable to fork, possible reason: Cannot allocate memory
Calls: <Anonymous> ... bploop.lapply -> .send_to -> .send_to -> <Anonymous> -> mcfork

Execution halted

I checked the available memory with free -g, and there was plenty free.

              total        used        free      shared  buff/cache   available
Mem:            251          12         195           3          42         230
Swap:            63           0          63

Thank you very much!

Best, Jianxiang

mjsteinbaugh commented 3 years ago

@jxshi Would it be possible to post a link to the object you're using in the post above? I haven't been able to reproduce this issue with any of the objects I've tested so far. I'm wondering if switching from BiocParallel to the future package may help in this case, but I can't test it easily.
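The traceback above ends in `mcfork` by way of `bploop.lapply`, which suggests a forking BiocParallel backend. One way to test whether forking is the culprit is to register a serial backend before the failing chunk runs. This is only a sketch: it assumes BiocParallel is installed and that the code in the gse-go chunk picks up the registered default backend, neither of which has been verified for this template.

```r
# Assumption: BiocParallel is installed; the guard keeps the sketch harmless
# if it is not.
if (requireNamespace("BiocParallel", quietly = TRUE)) {
    # Register a serial backend as the default, so bpparam() returns a
    # SerialParam instead of a forking backend.
    BiocParallel::register(BiocParallel::SerialParam())
    backend <- class(BiocParallel::bpparam())[[1L]]
} else {
    backend <- NA_character_
}
print(backend)
```

Placed in the setup chunk of the Rmd, this would trade speed for avoiding `mcfork` entirely. If the chunk then renders, the problem lies in process forking rather than in the data.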

jxshi commented 3 years ago

Hi @mjsteinbaugh,

I updated the future package to the development version with the following command, and the template then rendered successfully.

remotes::install_github("HenrikBengtsson/future", ref="develop")

Cheers!

Best, Jianxiang

mjsteinbaugh commented 3 years ago

@jxshi I haven't changed any of the code to use the future package yet. I'm just wondering whether it might be more robust here than BiocParallel.