rajewsky-lab / find_circ2

Find circRNAs in RNA-seq data.
GNU General Public License v3.0

first version of synthetic read generation #1

Closed mschilli87 closed 8 years ago

mschilli87 commented 8 years ago

It's horribly slow, but it works. I manually checked many artificial reads against the genome browser, and sequence, coordinates & annotation matched in all cases. So far I have only used Evgenia's data (so C. elegans).

Since I cannot share these data publicly, just run ln -s /data/rajewsky/home/mschilli/repo/projects/celegans_evgenia/EA_cel10T.EA_cel10T.WBcel235_81.ribosomal_transcripts.unmapped.wbcel235.sorted.unmapped.anchors.wbcel235.circs.bed.gz wbcel235.circs.bed.gz before running make synthetic_reads.R1.fa.gz.

@marvin-jens: If you don't want a 2nd reference in the tests, maybe we can discuss what to base it on to avoid the overhead in the test runs. I know that for unit tests it has to be much faster & cleaner. I could improve the current code, or I could spend some time (next week?) redoing this using byo to honor your legacy and learn to appreciate what you did there. ;)