UCSC-Treehouse / treehouse-fusion

Treehouse fusion calling pipeline

Select fusions where either gene is in gene-list #1

Open jpfeil opened 5 years ago

jpfeil commented 5 years ago

The pipeline filters out any fusion where one of the two genes is not in the gene list. This may miss some fusions of interest.
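A minimal sketch of the difference between the two rules, assuming fusion names use the `GENE1--GENE2` form seen elsewhere in this thread. The function names and the one-gene example list are hypothetical and are not taken from the pipeline's code.

```python
def split_fusion(name):
    """Split a 'GENE1--GENE2' fusion name into its two gene symbols."""
    left, sep, right = name.partition("--")
    if not sep or not left or not right:
        raise ValueError(f"not a GENE1--GENE2 fusion name: {name!r}")
    return left, right


def keep_if_either(name, gene_list):
    """Proposed rule: keep the fusion when EITHER partner is in the gene list."""
    left, right = split_fusion(name)
    return left in gene_list or right in gene_list


def keep_if_both(name, gene_list):
    """Stricter rule, shown for contrast: require BOTH partners in the list."""
    left, right = split_fusion(name)
    return left in gene_list and right in gene_list


# Made-up gene list for illustration only; the real list is described in
# the gene_list_readme.md linked in this thread.
example_gene_list = {"EWSR1"}
for fusion in ["EWSR1--PATZ1", "BRD4--RFX1"]:
    print(fusion, keep_if_either(fusion, example_gene_list))
# prints:
# EWSR1--PATZ1 True
# BRD4--RFX1 False
```

With this example list, `keep_if_both("EWSR1--PATZ1", example_gene_list)` returns `False`, which is the kind of miss the issue describes.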

hbeale commented 5 years ago

Thanks, Jacob! Will you please also add the following genes to the gene list?

aggregatedCancerGenes_2018-01-04_12.20.15PM.txt

Details on the gene list are here: https://github.com/UCSC-Treehouse/analysis-methods/blob/master/gene_lists/gene_list_readme.md

jpfeil commented 5 years ago

@e-t-k I want to run the updated pipeline on a few samples. Can you please copy these fastq files to /scratch/jpfeil/fusion? Thanks!

TH01_0122_S01 TH01_0129_S01 TH01_0132_S01

e-t-k commented 5 years ago

@jpfeil per discussion, TH34_1455_S01 fastqs have been copied to: /scratch/ekephart/fusion/TH34_1455_S01 on razzmatazz.prism

I did not have permissions to write directly to your /scratch/jpfeil/fusion dir. So you may mv my copy to your dir instead.

jpfeil commented 5 years ago

The latest version of the pipeline finds the TH34_1455_S01 EWSR1--PATZ1 fusion.

jpfeil commented 5 years ago

@e-t-k The digest for the latest version is sha256:9e5ce87104287205f3ece4773296b219c71974d48f1e0b92de2fc629168479a2

e-t-k commented 5 years ago

@jpfeil could you double-check that the SHA is correct? I don't see that https://hub.docker.com/r/ucsctreehouse/fusion has been updated recently; and I'm unable to pull the image by that sha:

$ docker run --rm ucsctreehouse/fusion@sha256:9e5ce87104287205f3ece4773296b219c71974d48f1e0b92de2fc629168479a2
Unable to find image 'ucsctreehouse/fusion@sha256:9e5ce87104287205f3ece4773296b219c71974d48f1e0b92de2fc629168479a2' locally
docker: Error response from daemon: manifest for ucsctreehouse/fusion@sha256:9e5ce87104287205f3ece4773296b219c71974d48f1e0b92de2fc629168479a2 not found.
See 'docker run --help'.

jpfeil commented 5 years ago

Sorry, @e-t-k I pushed to the wrong docker hub. Try this one:

docker run --rm ucsctreehouse/fusion@sha256:633adf491aac8c216df2855e47a2ffd55c9af6c5f646ae0944a4273f33caffe0

e-t-k commented 5 years ago

@jpfeil Thanks for the new SHA.

I've just done a test run on the pipelines' test FASTQs and it errored out. The key line seems to be:

ERROR: didn't find at least 1000 BAM records properly ordered along a single scaffold. at /opt/trinityrnaseq-Trinity-v2.4.0/util/support_scripts/ensure_coord_sorted_sam.pl

Full log: fusion-log-error.txt

Note that previously, these test files didn't have any fusions that passed the gene list, so FusionInspector was skipped entirely.
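To make the "skipped entirely" behaviour concrete, here is a rough sketch of such a guard. It assumes the gene-list-filtered candidate file is tab-separated with header lines starting with `#`; both function names are hypothetical, and the pipeline's actual check may look different.

```python
def count_fusion_candidates(path):
    """Count data rows: non-empty lines that do not start with '#'."""
    count = 0
    with open(path) as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                count += 1
    return count


def should_run_inspector(path):
    """Only run the downstream inspection step when a candidate survived."""
    return count_fusion_candidates(path) > 0
```

Under this guard, a sample whose filtered file holds only a header never reaches FusionInspector, so any failure inside that step can only show up once at least one fusion passes the gene list.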

Is this something I can resolve in the fab wrapper, or do you need to add a check to the script? And let me know if you need access to any of the intermediate files.

jpfeil commented 5 years ago

Thanks, @e-t-k. Are you using the --run_fusion_inspector flag? Try removing it.

e-t-k commented 5 years ago

@jpfeil Yes, I am using --run_fusion_inspector. But if I remove it from the pipelines Makefile, then we won't get FusionInspector results at all for any sample; are those actually not important for you?

Some more info: The test sample has 1 fusion in star-fusion.fusion_candidates.final.in_genelist.abridged, BRD4--RFX1

Docker run command:

        docker run --rm \
                -v $(shell pwd)/outputs:/data/outputs \
                -v $(shell pwd)/samples:/data/samples \
                -v $(shell pwd)/references:/data/references \
                ucsctreehouse/fusion@sha256:633adf491aac8c216df2855e47a2ffd55c9af6c5f646ae0944a4273f33caffe0 \
                        --left_fq $(R1) \
                        --right_fq $(R2) \
                        --output_dir outputs/fusions \
                        --CPU `nproc` \
                        --genome_lib_dir references/STARFusion-GRCh38gencode23 \
                        --run_fusion_inspector

e-t-k commented 5 years ago

@jpfeil After some thought, here's my proposal. What do you think of:

The downside of this is that if star or FusionInspector fail for a "legitimate" reason, it will be less obvious and will be more effort to debug; but I haven't seen much evidence of that happening in all the samples we've run previously.

jpfeil commented 5 years ago

@e-t-k I think it's cleaner if the fusion pipeline fails gracefully instead of blowing up. I'll modify the code to write the error message to a log file.
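One way this could look, as a sketch only: wrap the external step, and on failure append the details to a log file instead of raising. The name `run_step` and the log path argument are hypothetical; this is not the code that went into the image.

```python
import subprocess


def run_step(cmd, error_log):
    """Run cmd; on failure, append details to error_log and return False."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except OSError as exc:
        # The command could not be started at all (e.g. executable missing).
        with open(error_log, "a") as handle:
            handle.write(f"{cmd!r} could not be started: {exc}\n")
        return False

    if result.returncode != 0:
        # Record the exit status and whatever the tool wrote to stderr.
        with open(error_log, "a") as handle:
            handle.write(f"{cmd!r} exited with status {result.returncode}\n")
            handle.write(result.stderr)
        return False

    return True
```

The caller can then decide what to do with a `False` return, for example finish the run with the results gathered so far while leaving the log file as a visible sign that a step failed.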

e-t-k commented 5 years ago

@jpfeil Perfect; that way it will continue to be obvious if it does blow up. So just let me know whenever you have the new SHA and I'll take it from there :-)

jpfeil commented 5 years ago

@e-t-k I'm not able to reproduce the error, but I added code to save an error log instead of raising the error. Let me know if this version causes the same problem:

sha256:827aa24b9e3711d56544c9df11dc990c4cf9cd7fca7bd84cca481c0463ea7434