nf-core / airrflow

B-cell and T-cell Adaptive Immune Receptor Repertoire (AIRR) sequencing analysis pipeline using the Immcantation framework
https://nf-co.re/airrflow
MIT License

Bulk RNAseq pcr_target_locus specification #311

Closed loipf closed 5 months ago

loipf commented 6 months ago

Description of the bug

Thanks for the great tool, it offers a really wide range of analyses in a convenient way! I would like to analyze bulk (untargeted) blood RNAseq data using your pipeline. To do this, I ran MiXCR in RNAseq mode to generate AIRR files, which I fed into the airrflow pipeline.

However, I am unsure what to use as "pcr_target_locus" in the sample sheet for assembled input, since I do not have a specific target locus. How important is this parameter, and will different tools be run and different output be generated depending on it?

Specifying "pcr_target_locus" as "IG" works fine, but will I miss the TCR analysis? When I specified both "IG" and "TR" in two lines for each sample, I got the error attached at the end.

Could you advise whether bulk "normal" RNAseq data is suitable for your pipeline at all, or would you recommend against it? Thanks in advance.

Command used and terminal output

nextflow run nf-core/airrflow -r 3.2.0 -profile docker --mode assembled --input samplesheet.tsv --outdir airr_output

-[nf-core/airrflow] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_AIRRFLOW:AIRRFLOW:CLONAL_ANALYSIS:DOWSER_LINEAGES (subject2)'

Caused by:
  Process `NFCORE_AIRRFLOW:AIRRFLOW:CLONAL_ANALYSIS:DOWSER_LINEAGES (subject2)` terminated with an error exit status (1)

Command executed:

  Rscript -e "enchantr::enchantr_report('dowser_lineage', \
                                          report_params=list('input'='subject2__clone-pass.tsv', \
                                          'exec'='/usr/local/share/igphyml/src/igphyml', \
                                          'outdir'=getwd(), \
                                          'nproc'=10,\
                                          'log'='subject2_dowser_command_log' ,'build'='igphyml','minseq'=5,'tips'='c_call','traits'='c_call'))"

      cp -r enchantr subject2_dowser_report && rm -rf enchantr

      echo "NFCORE_AIRRFLOW:AIRRFLOW:CLONAL_ANALYSIS:DOWSER_LINEAGES": > versions.yml
      Rscript -e "cat(paste0('  enchantr: ',packageVersion('enchantr'),'
  '))" >> versions.yml

Command exit status:
  1

Command output:
  1/18                   
  2/18 [global-options]  
  3/18                   
  4/18 [input-parameters]
  5/18 [unnamed-chunk-1] 
  6/18                   
  7/18 [read-repertoires]
  8/18                   
  9/18 [formatClones]    

Command error:

  processing file: _main.Rmd
  1/18                   
  2/18 [global-options]  
  3/18                   
  4/18 [input-parameters]
  5/18 [unnamed-chunk-1] 
  6/18                   
  7/18 [read-repertoires]
  8/18                   
  9/18 [formatClones]    

  Quitting from lines 153-206 [formatClones] (_main.Rmd)
  Error:
  ! multiple heavy chain locus found.
  Warning messages:
  1: replacing previous import 'data.table::last' by 'dplyr::last' when loading 'enchantr' 
  2: replacing previous import 'data.table::first' by 'dplyr::first' when loading 'enchantr' 
  3: replacing previous import 'data.table::between' by 'dplyr::between' when loading 'enchantr' 
  Execution halted

Relevant files

samplesheet.tsv.zip

System information

Nextflow version: 23.10.1
Hardware: Desktop
Executor: local
Container engine: Docker
OS: Ubuntu 20.04.6 LTS
Version of nf-core/airrflow: 3.2.0

ggabernet commented 6 months ago

Hi @loipf ,

thanks for trying out nf-core/airrflow, and I'm glad it can be useful for your research! Unfortunately, it is not yet possible to run the IG and TR analyses together. The pcr_target_locus parameter specifies which reference data to use, either the BCR or the TCR set. One needs to run the pipeline twice, once with the BCR samples and once with the TCR samples. Let me know whether this works for you!
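
For a bulk RNAseq file that mixes both receptor types, one way to prepare the two runs is to split each AIRR rearrangement table by locus first. The following is only a minimal sketch, not part of airrflow: it assumes the MiXCR export is a tab-separated AIRR file with a `locus` column and/or a `v_call` column whose values start with the locus name (for example `IGHV1-2*02`). The function names `locus_family` and `split_by_family` are hypothetical helpers made up for this illustration.

```python
import csv
import io

# AIRR locus names grouped by receptor family.
FAMILIES = {"IG": ("IGH", "IGK", "IGL"), "TR": ("TRA", "TRB", "TRD", "TRG")}


def locus_family(row):
    """Return 'IG', 'TR', or None for one rearrangement row."""
    # Prefer the `locus` column; fall back to the first three letters of v_call.
    locus = (row.get("locus") or row.get("v_call") or "")[:3].upper()
    for family, loci in FAMILIES.items():
        if locus in loci:
            return family
    return None


def split_by_family(in_handle, out_handles):
    """Copy rows from in_handle to out_handles['IG'] / out_handles['TR'].

    Returns (rows written per family, rows skipped because no family matched).
    """
    reader = csv.DictReader(in_handle, delimiter="\t")
    writers = {}
    for family, handle in out_handles.items():
        writer = csv.DictWriter(
            handle,
            fieldnames=reader.fieldnames,
            delimiter="\t",
            lineterminator="\n",
            extrasaction="ignore",
        )
        writer.writeheader()
        writers[family] = writer

    counts = {family: 0 for family in out_handles}
    skipped = 0
    for row in reader:
        family = locus_family(row)
        if family in writers:
            writers[family].writerow(row)
            counts[family] += 1
        else:
            skipped += 1
    return counts, skipped


# Small in-memory demonstration; real use would pass open file handles instead.
demo = (
    "sequence_id\tlocus\tv_call\n"
    "s1\tIGH\tIGHV1-2*02\n"
    "s2\tTRB\tTRBV7-9*01\n"
    "s3\t\tIGKV1-5*03\n"
)
ig_out, tr_out = io.StringIO(), io.StringIO()
counts, skipped = split_by_family(io.StringIO(demo), {"IG": ig_out, "TR": tr_out})
print(counts, skipped)
```

The IG files would then go into one sample sheet with pcr_target_locus set to IG and the TR files into a second sheet with pcr_target_locus set to TR, and the pipeline would be launched once per sheet with separate --outdir values.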