pmelsted / pizzly

Fast fusion detection using kallisto
BSD 2-Clause "Simplified" License

Filtering criteria & downstream analysis #9

Closed leiendeckerlu closed 7 years ago

leiendeckerlu commented 7 years ago

Hi there,

MattBashton commented 7 years ago

For what it's worth, I have implemented a simple downstream script in R here that flattens the JSON, annotates gene locations, and calculates the distance between the two genes (when they are on the same chromosome). I don't explicitly filter the output, but you can of course sort the final tab-delimited output by the splitcount or paircount columns as you see fit.

leiendeckerlu commented 7 years ago

Very cool, will definitely check that out! Thanks Matt!

pmelsted commented 7 years ago

There are two approaches to filtering. One is to set a minimum read count. Pizzly already does this by requiring either two read pairs or one split read supporting a fusion junction. You can also run the flattening script in the latest version to get a TSV table that is easier to filter on.
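
As a rough illustration of the read count approach, here is a minimal Python sketch that applies stricter thresholds to a flattened table. It assumes the TSV has `paircount` and `splitcount` columns (the names mentioned earlier in this thread); the `geneA`/`geneB` columns, the gene names, the counts, and the function name `filter_by_support` are made up for the example and are not part of pizzly.

```python
import csv
import io

# Made-up example table; only the paircount/splitcount column names
# come from this thread, everything else is a placeholder.
EXAMPLE_TSV = (
    "geneA\tgeneB\tpaircount\tsplitcount\n"
    "GENE1\tGENE2\t12\t4\n"
    "GENE3\tGENE4\t2\t0\n"
    "GENE5\tGENE6\t0\t1\n"
)

def filter_by_support(tsv_text, min_pairs=2, min_splits=1):
    """Keep rows supported by at least min_pairs read pairs
    OR at least min_splits split reads.

    The defaults mirror the rule described above (two pairs or one
    split read); pass larger values to filter more strictly.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = []
    for row in reader:
        pairs = int(row["paircount"])
        splits = int(row["splitcount"])
        if pairs >= min_pairs or splits >= min_splits:
            kept.append(row)
    return kept

# Stricter, arbitrarily chosen thresholds for illustration.
strict = filter_by_support(EXAMPLE_TSV, min_pairs=5, min_splits=3)
for row in strict:
    print(row["geneA"], row["geneB"], row["paircount"], row["splitcount"])
```

For a real file, read its contents with `open(path).read()` and pass the text in; the thresholds themselves are a judgment call that depends on sequencing depth.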

The other approach is to run kallisto to quantify the fusion transcripts and select those that have decent TPM support. An example pipeline is in the Snakefile in the test directory.
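
To make the TPM approach concrete, here is a small Python sketch that picks fusion transcripts out of a kallisto `abundance.tsv` (columns `target_id`, `length`, `eff_length`, `est_counts`, `tpm`). It is not the Snakefile's pipeline, just one way the selection step could look. The transcript IDs, the abundance values, the `min_tpm` cutoff of 1.0, and the function name `select_by_tpm` are all assumptions for illustration.

```python
import csv
import io

# Made-up abundance table in kallisto's abundance.tsv layout.
EXAMPLE_ABUNDANCE = (
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "fusion_tx_1\t1500\t1321\t40\t8.5\n"
    "fusion_tx_2\t900\t721\t1\t0.3\n"
    "normal_tx_1\t2000\t1821\t100\t20.0\n"
)

def select_by_tpm(abundance_text, fusion_ids, min_tpm=1.0):
    """Return {target_id: tpm} for fusion transcripts at or above min_tpm.

    fusion_ids is the set of transcript IDs that are fusion candidates,
    e.g. collected from the headers of the fusion FASTA (assumption).
    min_tpm is an arbitrary cutoff, not a recommended value.
    """
    reader = csv.DictReader(io.StringIO(abundance_text), delimiter="\t")
    selected = {}
    for row in reader:
        tpm = float(row["tpm"])
        if row["target_id"] in fusion_ids and tpm >= min_tpm:
            selected[row["target_id"]] = tpm
    return selected

supported = select_by_tpm(EXAMPLE_ABUNDANCE, {"fusion_tx_1", "fusion_tx_2"})
print(supported)
```

What counts as "decent" TPM support is left open here; the cutoff is exposed as a parameter so it can be tuned per dataset.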