Oshlack / Clinker

Gene Fusion Visualiser
MIT License

bpipe error when running on test #1

Closed lachlansimpson closed 7 years ago

lachlansimpson commented 7 years ago

I installed this locally and bpipe failed with an error.

Clinker: git co of master
bpipe: 0.9.9.3
samtools: 1.4.1
R: 3.4.0
python: 2.7.5

Error:

Command    : python /config/binaries/clinker/1.2/Clinker/fusion/main.py -in caller/bcr_abl1.csv -out results -pos 1 2 3 4 -del c -genome 19 -competitive false
Exit Code  : 1
Output     : 

        ==============================================================

                Fusion Super Transcript Generator

                A fusion visualiser.

        ==============================================================

        ==============================================================

        Create fusion superTranscriptome:

        Traceback (most recent call last):
          File "/config/binaries/clinker/1.2/Clinker/fusion/main.py", line 53, in <module>
            fusions = fusiontools.createFusionList(fusion_results, pos, gene_list_location, st_genes, True, delimiter)
          File "/config/binaries/clinker/1.2/Clinker/fusion/fusiontools.py", line 437, in createFusionList
            chr_1, bp_1, chr_2, bp_2 = breakpointLocation(pos, columns)
          File "/config/binaries/clinker/1.2/Clinker/fusion/fusiontools.py", line 293, in breakpointLocation
            col_bp_1 = pos[1]
        IndexError: list index out of range

MSG: vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
MSG:     NOTE: Pipeline completed as process 26028.  Trailing lines follow.
MSG:     Use bpipe log -n <lines> to see more lines
MSG: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

==============================================================

Create fusion superTranscriptome:

Traceback (most recent call last):
  File "/config/binaries/clinker/1.2/Clinker/fusion/main.py", line 53, in <module>
    fusions = fusiontools.createFusionList(fusion_results, pos, gene_list_location, st_genes, True, delimiter)
  File "/config/binaries/clinker/1.2/Clinker/fusion/fusiontools.py", line 437, in createFusionList
    chr_1, bp_1, chr_2, bp_2 = breakpointLocation(pos, columns)
  File "/config/binaries/clinker/1.2/Clinker/fusion/fusiontools.py", line 293, in breakpointLocation
    col_bp_1 = pos[1]
IndexError: list index out of range
ERROR: stage generate_fst failed: Command in stage generate_fst failed with exit status = 1 : 

python /config/binaries/clinker/1.2/Clinker/fusion/main.py -in caller/bcr_abl1.csv -out results -pos 1 2 3 4 -del c -genome 19 -competitive false 

========================================= Pipeline Failed ==========================================

Command in stage generate_fst failed with exit status = 1 : 

python /config/binaries/clinker/1.2/Clinker/fusion/main.py -in caller/bcr_abl1.csv -out results -pos 1 2 3 4 -del c -genome 19 -competitive false

Use 'bpipe errors' to see output from failed commands.

breons commented 7 years ago

Hi there, thanks for trying Clinker!

I'm very sorry, I stuffed up the python command in the documentation. Please use:

python ../fusion/main.py -in caller/bcr_abl1.csv -out results -pos 1,2,3,4 -del c -genome 19 -competitive false

The -pos parameter should be comma delimited, not space.
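
The traceback above is consistent with the -pos value being read as one string and split on commas, so that "-pos 1 2 3 4" leaves a one-element list and pos[1] fails. A minimal sketch of that behaviour, assuming a hypothetical helper called parse_pos (this is not Clinker's actual parsing code, which may differ):

```python
def parse_pos(value):
    """Split a comma-delimited column spec such as "1,2,3,4" into integers.

    Hypothetical helper for illustration only; Clinker's own parsing may differ.
    """
    pos = [int(field) for field in value.split(",") if field.strip()]
    if len(pos) != 4:
        # Fail early with a readable message instead of an IndexError later on.
        raise ValueError(
            "expected 4 comma-delimited column numbers, got %r" % value
        )
    return pos
```

With this sketch, parse_pos("1,2,3,4") returns four column numbers, while parse_pos("1") (what a space-separated argument would effectively pass in) raises a ValueError straight away.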

Let me know how you go :).

breons commented 7 years ago

On second thought, just looking at this test... it hasn't been updated to run the latest version. I'll update it now and will let you know when to update.

lachlansimpson commented 7 years ago

No problem. I actually ran into this using the bpipe test:

bpipe test.groovy fastq/*.fastq.gz

breons commented 7 years ago

I've updated both the bpipe and manual test to use the new version and have updated the documentation. Just update your current version and give it another go.

Cheers.

lachlansimpson commented 7 years ago

I get a new error:

[root@vmpr-res-head-node test]# bpipe errors

============================== Found 1 failed commands from run 12126 ==============================

============================================ Command 4 =============================================

Command    : STAR --runMode genomeGenerate --runThreadN 16 --genomeDir results/genome --genomeFastaFiles results/reference/fst_reference.fasta --limitGenomeGenerateRAM 1100000000 --genomeSAindexNbases 5
Exit Code  : 127
Output     : 

        bash: line 1: STAR: command not found

breons commented 7 years ago

So Clinker requires an aligner for that particular step (I'll add it to the example wiki page as well, I think it's only in the main). The default is STAR (https://github.com/alexdobin/STAR), which is required by the test and whenever you use the bpipe pipeline for other data.

If you install that, you should be good to finish.

Thanks again!

edit. It wasn't there, I'll add it in.
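
Exit code 127 with "command not found" means the shell could not find STAR on the PATH. A small sketch for checking the external tools before starting the pipeline; the function name missing_tools is an assumption made for this example, and the tool list in the usage note is only what this thread mentions:

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on the current PATH."""
    return [name for name in tools if shutil.which(name) is None]
```

For example, missing_tools(["STAR", "samtools", "bpipe", "Rscript"]) returns whichever of those executables are not visible in the current environment, which is worth checking after loading a module or activating a conda environment.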

breons commented 7 years ago

To install STAR in your activated conda environment, just paste this in:

conda config --add channels conda-forge
conda config --add channels defaults
conda config --add channels r
conda config --add channels bioconda

conda install -c bioconda star=2.5.3a

lachlansimpson commented 7 years ago

Ah, ok.

We don't use conda, and our researchers are using HiSat2.

I'm trying to install as a module, which we use for stability, versioning and reproducibility, so will need to make a few changes.

Will try to send some feedback through.

breons commented 7 years ago

Sure thing, we have run the tool on conda and with modules so you shouldn't have a problem. Let me know how you go!

lachlansimpson commented 7 years ago

This is the error I'm getting now

==================================== Stage prepare_plot (test) =====================================
bash: line 1: /config/binaries/clinker/1.2/Clinker/plotit/fst_plot_prep.sh: Permission denied
ERROR: stage prepare_plot failed: Command in stage prepare_plot failed with exit status = 1 : 

/config/binaries/clinker/1.2/Clinker/plotit/fst_plot_prep.sh BCR:ABL1 /config/binaries/clinker/1.2/Clinker/test/results/alignment/test/BCR_ABL1 BCR_ABL1 results/annotation results/alignment/test results/reference

breons commented 7 years ago

Oh sure, you might just need to change the permission of Clinker/plotit/fst_plot_prep.sh to allow the pipeline to execute that file. Try a chmod 755 fst_plot_prep.sh.
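
For a scripted install, the same permission change can be made from Python. A minimal sketch, assuming a POSIX file system; make_executable is a name chosen for this example:

```python
import os


def make_executable(path):
    """Set permissions to rwxr-xr-x, the same bits as `chmod 755`."""
    os.chmod(path, 0o755)
    # Return the resulting permission bits so the caller can verify them.
    return os.stat(path).st_mode & 0o777
```

Calling make_executable on the installed copy of fst_plot_prep.sh should let any user who can read the central install directory run the script.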

lachlansimpson commented 7 years ago

I should explain. You have very much built this for a researcher to install in their home directory.

I work for 500 researchers and would prefer to install it somewhere central so that they can all use it, and the cluster we use can see it.

So there are various parts of the software that don't work in that (our) type of system just yet:

These are the two things I'm trying to work out so that my users can use clinker :)

breons commented 7 years ago

The way we have set it up on our server is as a module with module dependencies (samtools, STAR, bpipe, java, etc.). That way researchers from any department with access to that server can execute it from a global source.

Effectively you just plonk clinker somewhere accessible by everybody and when they activate the module you just add the workflow/clinker.pipe to their path.

Then they should just need to run this snippet:

bpipe -p out="/path/to/clinker/output" -p caller="/path/to/jaffa_results.csv" -p col="3,4,5,6" -p del="c" -p genome="19" -p print="true" -p fusions="BCR:ABL1" -p pdf_width="9" -p pdf_height="16" -p sizing="1,3,1,2,4,2" -p competitive="false" /path/to/clinker/clinker.pipe /path/to/*.fastq.gz

Where out is wherever the user would like their clinker output to go.

lachlansimpson commented 7 years ago

Also: STAR needs to be version 2.5.3a for the test. I ran it against what we already had installed (2.5.2b) and it didn't work.

lachlansimpson commented 7 years ago

==================================== Stage prepare_plot (test) =====================================

BCR:ABL1
------------------------------------------
filtering BAM file for fusion of interest
filtering BAM file for reads with overhangs < 15 (noise reduction)
Creating ancillilary files
Index BAM files

===================================== Stage plot_fusion (test) =====================================
[1] "Plotting: BCR:ABL1"
[1] "------------------------------------------------------"
[1] "Libraries and ancillary files loaded. Creating Tracks."
  V4
1  2
2  4
Error in open.connection(con, open) : cannot open the connection
Calls: prepare ... queryForConnection -> connectionForResource -> open -> open.connection
In addition: Warning message:
In open.connection(con, open) :
  cannot open file 'results/alignment/test//coverage_rpm.bedgraph': No such file or directory
Execution halted
ERROR: stage plot_fusion failed: Command in stage plot_fusion failed with exit status = 1 : 

Rscript /config/binaries/clinker/1.2/Clinker/plotit/fst_plot.r results BCR:ABL1 results/alignment/test/BCR_ABL1 9 16 1,3,1,2,4,2 results/alignment/test test 

========================================= Pipeline Failed ==========================================

One or more parallel stages aborted. The following messages were reported:

-------------------------------------- plot_fusion  ( test )  --------------------------------------

Command in stage plot_fusion failed with exit status = 1 : 

Rscript /config/binaries/clinker/1.2/Clinker/plotit/fst_plot.r results BCR:ABL1 results/alignment/test/BCR_ABL1 9 16 1,3,1,2,4,2 results/alignment/test test

----------------------------------------------------------------------------------------------------

lachlansimpson commented 7 years ago

Ah! Great - thanks for that description, that's what I needed. Sorry, I'm not actually a bioinformatician/biologist. Just a regular sysadmin working in a hospital.

breons commented 7 years ago

No worries! Happy to help. I will also document this. Thanks for being a code explorer :).

What was the STAR error? We are running 2.5.2a.

That final error is weird. Is results/alignment/test/coverage_rpm.bedgraph being created, or is the script just having trouble finding it?

breons commented 7 years ago

I should mention that bpipe will "remember" what the last stage was due to the output, so it is probably safer to remove the results folder completely before executing again.
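
A small cleanup sketch for that step. It removes the results folder and, as an assumption, the .bpipe state directory that bpipe keeps in the working directory; clean_previous_run is a name made up for this example:

```python
import os
import shutil


def clean_previous_run(workdir, names=("results", ".bpipe")):
    """Delete leftover output/state directories from an earlier bpipe run.

    Returns the names that were actually removed.
    """
    removed = []
    for name in names:
        target = os.path.join(workdir, name)
        if os.path.isdir(target):
            shutil.rmtree(target)
            removed.append(name)
    return removed
```

Running clean_previous_run(".") from the test directory before each retry should make bpipe start from the first stage again rather than resume from stale outputs.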

lachlansimpson commented 7 years ago

This was the previous fail that installing the newer star worked for

====================================================================================================
|                              Starting Pipeline at 2017-05-31 11:04                               |
====================================================================================================

======================================== Stage generate_fst ========================================

====================================== Stage star_genome_gen =======================================

===================================== Stage star_align (test) ======================================

===================================== Stage index_bams (test) ======================================

==================================== Stage prepare_plot (test) =====================================
bash: line 1: /config/binaries/clinker/1.2/Clinker/plotit/fst_plot_prep.sh: Permission denied
ERROR: stage prepare_plot failed: Command in stage prepare_plot failed with exit status = 1 : 

/config/binaries/clinker/1.2/Clinker/plotit/fst_plot_prep.sh BCR:ABL1 /config/binaries/clinker/1.2/Clinker/test/results/alignment/test/BCR_ABL1 BCR_ABL1 results/annotation results/alignment/test results/reference

========================================= Pipeline Failed ==========================================

One or more parallel stages aborted. The following messages were reported:

------------------------------------- prepare_plot  ( test )  --------------------------------------

Command in stage prepare_plot failed with exit status = 1 : 

/config/binaries/clinker/1.2/Clinker/plotit/fst_plot_prep.sh BCR:ABL1 /config/binaries/clinker/1.2/Clinker/test/results/alignment/test/BCR_ABL1 BCR_ABL1 results/annotation results/alignment/test results/reference

----------------------------------------------------------------------------------------------------

breons commented 7 years ago

It seems to have passed the STAR stages without issue, but ran into a permissions issue on the bash script. If you change the fst_plot_prep.sh permissions, delete the results folder and start again with the older star version (if you still have it), I reckon that should work!

lachlansimpson commented 7 years ago

Yes - after I deleted the .bpipe and the results folder I actually got new errors with star 2.5.3a. Have re-deleted the directories, reloaded the older star and started again.

This is the new error:

==================================== Stage prepare_plot (test) =====================================

BCR:ABL1
------------------------------------------
filtering BAM file for fusion of interest
filtering BAM file for reads with overhangs < 15 (noise reduction)
Creating ancillilary files
Index BAM files

===================================== Stage plot_fusion (test) =====================================
[1] "Plotting: BCR:ABL1"
[1] "------------------------------------------------------"
[1] "Libraries and ancillary files loaded. Creating Tracks."
  V4
1  2
2  4
Error in open.connection(con, open) : cannot open the connection
Calls: prepare ... queryForConnection -> connectionForResource -> open -> open.connection
In addition: Warning message:
In open.connection(con, open) :
  cannot open file 'results/alignment/test//coverage_rpm.bedgraph': No such file or directory
Execution halted
ERROR: stage plot_fusion failed: Command in stage plot_fusion failed with exit status = 1 : 

Rscript /config/binaries/clinker/1.2/Clinker/plotit/fst_plot.r results BCR:ABL1 results/alignment/test/BCR_ABL1 9 16 1,3,1,2,4,2 results/alignment/test test 

========================================= Pipeline Failed ==========================================

One or more parallel stages aborted. The following messages were reported:

-------------------------------------- plot_fusion  ( test )  --------------------------------------

Command in stage plot_fusion failed with exit status = 1 : 

Rscript /config/binaries/clinker/1.2/Clinker/plotit/fst_plot.r results BCR:ABL1 results/alignment/test/BCR_ABL1 9 16 1,3,1,2,4,2 results/alignment/test test

----------------------------------------------------------------------------------------------------

Use 'bpipe errors' to see output from failed commands.

lachlansimpson commented 7 years ago

The bedgraph file doesn't exist?

breons commented 7 years ago

Is there a Signal.UniqueMultiple.str1.out.bg file in the results/alignment/test/ folder? Or nothing like that?

breons commented 7 years ago

Could you post the contents of your clinker.pipe file?

breons commented 7 years ago

Sorry, the test.groovy.

lachlansimpson commented 7 years ago

Yes there is:

[root@vmpr-res-head-node test]# ls  results/alignment/test/
Aligned.sortedByCoord.out.bam      Log.final.out     Signal.UniqueMultiple.str1.out.bg  _STARtmp
Aligned.sortedByCoord.out.bam.bai  Log.out           Signal.Unique.str1.out.bg
BCR_ABL1                           Log.progress.out  SJ.out.tab

lachlansimpson commented 7 years ago

my clinker.pipe is just the git checked out clinker.pipe.

I'm just running step 3 from "Using Bpipe (preferred method)" here https://github.com/Oshlack/Clinker/wiki/2.-Example

breons commented 7 years ago

OK! That's good, so STAR is working as intended judging by that folder layout.

Are the following two lines in your test.groovy (lines 98-99)?

exec "mv $output.dir/Signal.UniqueMultiple.str1.out.bg $output.dir/coverage_rpm.bedgraph"
exec "mv $output.dir/Signal.Unique.str1.out.bg $output.dir/coverage_rpm_unique.bedgraph"

My local repository got nuked after a pull request yesterday, so I think that you might have an older version.

lachlansimpson commented 7 years ago

oh, my test.groovy is also the git checked out version

I've not made any changes.

lachlansimpson commented 7 years ago

yes, they are both there.

breons commented 7 years ago

Well that adds to the mystery...

If you rename "Signal.UniqueMultiple.str1.out.bg" to "coverage_rpm.bedgraph" and rerun the pipeline without deleting the results folder, does it carry on to the end?

I'm wondering whether that mv command is what's failing. I can remove those commands and use the other file name; the rename is just a bit cleaner.
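
The manual rename can be scripted as a workaround. A minimal sketch, using the file names from the two mv commands quoted earlier in this thread; rename_coverage_files is a name chosen for this example and is not part of Clinker:

```python
import os

# Mapping taken from the two `mv` commands quoted from test.groovy.
RENAMES = {
    "Signal.UniqueMultiple.str1.out.bg": "coverage_rpm.bedgraph",
    "Signal.Unique.str1.out.bg": "coverage_rpm_unique.bedgraph",
}


def rename_coverage_files(alignment_dir):
    """Rename STAR signal files to the bedgraph names the plot step looks for."""
    renamed = []
    for old, new in RENAMES.items():
        src = os.path.join(alignment_dir, old)
        if os.path.isfile(src):
            os.replace(src, os.path.join(alignment_dir, new))
            renamed.append(new)
    return renamed
```

For the test data this would be called as rename_coverage_files("results/alignment/test"); files that are already renamed or missing are simply skipped.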

breons commented 7 years ago

Going on the above, I've updated the repo to add similar commands into the R script; the name change does have to occur due to Gviz's requirements. It seems to be working fine, so try a pull, delete the results and .bpipe folders, and hopefully it goes along smoothly.

lachlansimpson commented 7 years ago

That works - went through without issue.

Cheers!

lachlansimpson commented 7 years ago

When the output says

===================================== Stage plot_fusion (test) =====================================
[1] "Plotting: BCR:ABL1"
[1] "------------------------------------------------------"
[1] "Libraries and ancillary files loaded. Creating Tracks."
[1] "Tracks created, printing PDF."
[1] "PDF created."

======================================== Pipeline Succeeded ========================================
13:10:04 MSG:  Finished at Wed May 31 13:10:04 AEST 2017
13:10:04 MSG:  Output is results/genome/Genome

What is "results/genome/Genome" and where is the pdf? Is it the file in results/plots/test?

breons commented 7 years ago

Fantastic! Glad I could help.

The pdf is located in results/plots/<sample name>/<pdf name>

That genome/Genome is an output placeholder that bpipe uses to make sure that a stage has been run. I need to find a way to silence it (it's not necessary for it to be shown to the user); I'll put that in the next release.
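
Since the final bpipe message does not point at the plots, a short sketch for listing them may help. It assumes the results/plots/<sample name>/ layout described above; find_plots is a name made up for this example:

```python
import glob
import os


def find_plots(results_dir="results"):
    """List PDFs under <results_dir>/plots/<sample name>/, sorted by path."""
    pattern = os.path.join(results_dir, "plots", "*", "*.pdf")
    return sorted(glob.glob(pattern))
```

After a successful test run, find_plots("results") should include whatever PDFs were written under results/plots/test.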

breons commented 7 years ago

Also, thanks for being patient :).

lachlansimpson commented 7 years ago

No problems - thanks to you for your patience!

breons commented 7 years ago

Hi again,

I've made an update that will make Clinker easier to install/use in an environment such as yours.

Basically you just set an environment variable called "CLINKERDIR" that points to the clinker directory (i.e. export CLINKERDIR="/path/to/clinkerdir"). Users can then just run the snippet below and it should all just click:

bpipe -p out="/path/to/clinker/output" -p caller="/path/to/jaffa_results.csv" -p col="3,4,5,6" -p del="c" -p genome="19" -p print="true" -p fusions="BCR:ABL1" -p pdf_width="9" -p pdf_height="16" -p sizing="1,3,1,2,4,2" -p competitive="false" $CLINKERDIR/workflow/clinker.pipe /path/to/*.fastq.gz

Hopefully that helps!
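
For a central install, a thin wrapper can assemble that command from CLINKERDIR. A rough sketch, assuming the -p parameter names shown in the snippet above; clinker_command is a hypothetical helper and not something Clinker ships:

```python
import os


def clinker_command(fastq_files, params, clinker_dir=None):
    """Build the bpipe argument list for clinker.pipe.

    `params` maps bpipe parameter names (out, caller, col, ...) to values.
    """
    clinker_dir = clinker_dir or os.environ.get("CLINKERDIR")
    if not clinker_dir:
        raise RuntimeError("CLINKERDIR is not set")
    cmd = ["bpipe"]
    for key, value in params.items():
        # Each parameter becomes a separate `-p name=value` pair.
        cmd += ["-p", "%s=%s" % (key, value)]
    cmd.append(os.path.join(clinker_dir, "workflow", "clinker.pipe"))
    cmd.extend(fastq_files)
    return cmd
```

The returned list can be handed to subprocess.run as-is. Because no shell is involved, the FASTQ files need to be expanded beforehand (for example with glob.glob) rather than passed as a *.fastq.gz pattern.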