thomasvangurp / epiGBS

Code for working with epiGBS data
MIT License

Joined reads do not map with STAR #6

Open thomasvangurp opened 7 years ago

thomasvangurp commented 7 years ago

Using RNA-STAR as a bisulfite read mapper is very fast, even on moderate hardware. The bisulfite-mapping strategy was inspired by bwa-meth: both the reference and the reads are bisulfite converted, and after mapping the original nucleotides are put back in place. This strategy works quite well. However, the options for joined reads still need to be tweaked, since no joined reads map in the run below:
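For readers unfamiliar with the bwa-meth style approach, here is a minimal sketch of the idea, not the actual epiGBS code. It assumes the usual convention that the Watson strand is converted C to T and the Crick strand G to A. The function names (`convert_watson`, `convert_crick`, `call_methylation`) and the toy sequences are hypothetical.

```python
def convert_watson(seq):
    """In-silico bisulfite conversion for the Watson strand: every C becomes T."""
    return seq.upper().replace("C", "T")


def convert_crick(seq):
    """In-silico conversion for the Crick strand: every G becomes A."""
    return seq.upper().replace("G", "A")


def call_methylation(reference, original_read):
    """Compare the ORIGINAL read with the reference once the mapping is known.

    Assumes an ungapped, full-length Watson alignment starting at position 0
    (a simplification for illustration). Returns (position, is_methylated)
    for every C in the reference.
    """
    calls = []
    for pos, (ref_base, read_base) in enumerate(zip(reference, original_read)):
        if ref_base == "C" and read_base in ("C", "T"):
            # A retained C was protected from conversion, so it was methylated.
            calls.append((pos, read_base == "C"))
    return calls


# Toy data (hypothetical): the C at position 4 is methylated, the others are not.
reference = "ACGTCCGA"
read = "ATGTCTGA"

# Both sequences are identical after conversion, so the mapper sees a perfect match.
assert convert_watson(reference) == convert_watson(read)

# After mapping, the original nucleotides are put back and compared.
calls = call_methylation(reference, read)
print(calls)
```

The mapper only ever sees the converted sequences. The methylation information lives in the original read, which is why it has to be restored after alignment.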


```
running:    STAR --runThreadN 5 --genomeDir /Volumes/5tb-deena/C101HW16120598/Lolium/AseI_NsiI/output_mapping/STAR_joined_watson --readFilesIn /tmp/tmp_LQLKoSTAR/watson_joinedZmU3H5.fastq  /tmp/tmp_LQLKoSTAR/watson_joinedy32Bz8.fastq --outSAMattributes NM MD AS --outSAMtype SAM --outFileNamePrefix /Volumes/5tb-deena/C101HW16120598/Lolium/AseI_NsiI/output_mapping/joined_watson --outReadsUnmapped Fastx --scoreGapATAC -2 --scoreGapNoncan -2 --outFilterMismatchNoverLmax 0.95--outFilterMatchNminOverLread 0.9 --scoreGap -4  --alignEndsType EndToEnd --alignSoftClipAtReferenceEnds No --outSAMorder PairedKeepInputOrder --outFilterMultimapNmax 1 --scoreInsOpen -1
stdout:
Jul 11 15:41:38 ..... Started STAR run
Jul 11 15:41:38 ..... Loading genome
Jul 11 15:41:38 ..... Started mapping
Jul 11 15:41:55 ..... Finished successfully

finished:   run STAR for watsontrand on joined reads

now starting:   write final log of STAR to normal log
running:    cat /Volumes/5tb-deena/C101HW16120598/Lolium/AseI_NsiI/output_mapping/joined_watsonLog.final.out 
stdout:
                                 Started job on |   Jul 11 15:41:38
                             Started mapping on |   Jul 11 15:41:38
                                    Finished on |   Jul 11 15:41:55
       Mapping speed, Million of reads per hour |   19.44

                          Number of input reads |   91794
                      Average input read length |   286
                                    UNIQUE READS:
                   Uniquely mapped reads number |   0
                        Uniquely mapped reads % |   0.00%
                          Average mapped length |   0.00
                       Number of splices: Total |   0
            Number of splices: Annotated (sjdb) |   0
                       Number of splices: GT/AG |   0
                       Number of splices: GC/AG |   0
                       Number of splices: AT/AC |   0
               Number of splices: Non-canonical |   0
                      Mismatch rate per base, % |   nan%
                         Deletion rate per base |   0.00%
                        Deletion average length |   0.00
                        Insertion rate per base |   0.00%
                       Insertion average length |   0.00
                             MULTI-MAPPING READS:
        Number of reads mapped to multiple loci |   0
             % of reads mapped to multiple loci |   0.00%
        Number of reads mapped to too many loci |   0
             % of reads mapped to too many loci |   0.00%
                                  UNMAPPED READS:
       % of reads unmapped: too many mismatches |   0.27%
                 % of reads unmapped: too short |   11.62%
                     % of reads unmapped: other |   88.11%
                                  CHIMERIC READS:
                       Number of chimeric reads |   0
                            % of chimeric reads |   0.00%

finished:   write final log of STAR to normal log```
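
One thing worth checking: in the logged command, the value `0.95` runs straight into the next option name (`0.95--outFilterMatchNminOverLread`). It is not clear from the log whether the command was really issued that way or whether only the log line was assembled without a space, so this is a sanity check and not a diagnosis. A minimal sketch of such a check follows, using a shortened excerpt of the command; `find_fused_options` is a hypothetical helper.

```python
import shlex


def find_fused_options(command):
    """Return tokens in which '--' appears after the first character.

    A well-formed option token starts with '--'; a token such as
    '0.95--someOption' suggests a missing space. Paths that legitimately
    contain '--' would also be flagged, so treat hits as hints only.
    """
    return [tok for tok in shlex.split(command) if tok.find("--", 1) != -1]


# Shortened excerpt of the command from the log above.
command = (
    "STAR --runThreadN 5 --scoreGapNoncan -2 "
    "--outFilterMismatchNoverLmax 0.95--outFilterMatchNminOverLread 0.9 "
    "--scoreGap -4"
)

fused = find_fused_options(command)
print(fused)
```

If the check reports a fused token, rebuilding the argument list with explicit separators (for example, passing a list of arguments to `subprocess` instead of concatenating strings) would rule this out as a factor.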