hsinnan75 / MapCaller

MapCaller – An efficient and versatile approach for short-read alignment and variant detection in high-throughput sequenced genomes
MIT License

Different mapping with different -t threads? #49

Open tseemann opened 4 years ago

tseemann commented 4 years ago

Is MapCaller deterministic? If I run it with a different number of threads, I get different mappings, and the more threads, the more mappings (approximately).

    -t        Result
-----------------------------------------------------
     1        812958 ( 90.05%) reads are mapped properly.
     2        813000 ( 90.06%) reads are mapped properly.
     3        813028 ( 90.06%) reads are mapped properly.
     4        813036 ( 90.06%) reads are mapped properly.
     5        813062 ( 90.07%) reads are mapped properly.
     6        813074 ( 90.07%) reads are mapped properly.
     7        813095 ( 90.07%) reads are mapped properly.
     8        813118 ( 90.07%) reads are mapped properly.
     9        813125 ( 90.07%) reads are mapped properly.
    10        813137 ( 90.07%) reads are mapped properly.
    11        813141 ( 90.07%) reads are mapped properly.
    12        813153 ( 90.08%) reads are mapped properly.
    13        813155 ( 90.08%) reads are mapped properly.
    14        813140 ( 90.07%) reads are mapped properly.
    15        813174 ( 90.08%) reads are mapped properly.
    16        813193 ( 90.08%) reads are mapped properly.
....
    67        813160 ( 90.08%) reads are mapped properly.
    68        813157 ( 90.08%) reads are mapped properly.
    69        813180 ( 90.08%) reads are mapped properly.
    70        813134 ( 90.07%) reads are mapped properly.
    71        813207 ( 90.08%) reads are mapped properly.
    72        813171 ( 90.08%) reads are mapped properly.
hsinnan75 commented 4 years ago

MapCaller is deterministic when it runs with a single thread. MapCaller rescues unpaired alignments once it has collected enough data (>1000 unique paired alignments) to estimate the actual fragment size. However, when you run MapCaller with a different number of threads, some paired-end reads cannot be rescued because MapCaller has not yet collected enough data to estimate the fragment size at the moment those reads are processed. In the extreme case, if all reads were mapped at the same time, no unique paired-end alignments would be available to estimate the fragment size, and there would be no rescued alignments at all.
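
To illustrate the mechanism described above, here is a minimal Python sketch of a "collect, then estimate, then rescue" gate. It is not MapCaller's code: the class `FragmentSizeEstimator`, the function `can_rescue`, and the 3-standard-deviation window are all hypothetical, and only the ">1000 unique paired alignments" threshold comes from the comment above. It shows why a read processed before the threshold is reached cannot be rescued, which is how the processing order (and hence the thread count) can change the result.

```python
from statistics import mean, stdev

MIN_UNIQUE_PAIRS = 1000  # threshold quoted in the comment above


class FragmentSizeEstimator:
    """Toy model (hypothetical): accumulate insert sizes of unique pairs."""

    def __init__(self, min_pairs=MIN_UNIQUE_PAIRS):
        self.min_pairs = min_pairs
        self.sizes = []

    def add_unique_pair(self, insert_size):
        # Record the insert size of one uniquely mapped read pair.
        self.sizes.append(insert_size)

    def ready(self):
        # An estimate exists only after MORE than min_pairs observations.
        return len(self.sizes) > self.min_pairs

    def estimate(self):
        # Return (mean, standard deviation), or None if data is insufficient.
        if not self.ready():
            return None
        return mean(self.sizes), stdev(self.sizes)


def can_rescue(estimator, candidate_insert_size, n_sd=3.0):
    """Assumed rule: rescue only if the size fits the current estimate."""
    est = estimator.estimate()
    if est is None:
        return False  # too early: no fragment-size estimate yet
    mu, sd = est
    return abs(candidate_insert_size - mu) <= n_sd * sd
```

In this model, the same unpaired read is rejected if it is seen before the estimate exists and may be accepted afterwards, so any change in how reads are distributed over threads can shift the number of rescued pairs slightly.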

tseemann commented 4 years ago

Is this parameter relevant?
-size sequencing fragment size [500]

hsinnan75 commented 4 years ago

MapCaller can predict the fragment size during read mapping. If you specify the fragment size, MapCaller can use that value to remove ambiguous alignments before it has predicted the fragment size itself.
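
As a rough illustration of how a user-supplied size could be used before any estimate exists, the Python sketch below filters candidate pair placements by their distance from an expected fragment size. The function `filter_by_expected_size`, the `(position, insert_size)` tuple layout, and the `tolerance` parameter are assumptions made for this example; only the default of 500 comes from the `-size` help text quoted above, and MapCaller's actual filtering rule may differ.

```python
def filter_by_expected_size(candidates, expected_size=500, tolerance=0.5):
    """Keep candidate placements whose insert size is near expected_size.

    candidates: list of (position, insert_size) tuples (hypothetical layout).
    tolerance:  allowed relative deviation from expected_size (assumed value).
    """
    lo = expected_size * (1 - tolerance)
    hi = expected_size * (1 + tolerance)
    # Drop placements whose insert size falls outside [lo, hi].
    return [c for c in candidates if lo <= c[1] <= hi]


# Two candidate placements for one read pair: one plausible, one far away.
candidates = [(100, 480), (90000, 12000)]
kept = filter_by_expected_size(candidates)
```

Here only the placement with an insert size of 480 survives, so the pair is no longer ambiguous.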