Ahhgust / STRaitRazor

MIT License

The output format of STRaitRazor v3 is different from STRaitRazorv2s #1

Closed ichobits closed 6 years ago

ichobits commented 7 years ago

Condition: The same raw data (FASTQ format) was used for both versions.

Software: STRaitRazorv2s and STRaitRazorv3 (GitHub)

Commands:

STRaitRazorv2s: ./STRaitRazor.pl -dir ./project/results/ -fastq ./project/data/R708-A506_S14_L001_R1.fastq -fastq ./project/data/R708-A506_S14_L001_R2.fastq -sampleNum PE_R708-A506_S14_L001_ALL -typeselection ALL -locusConfig /project/scripts/customloci.config

STRaitRazorv3: ./str8rzr -c Forenseq.config R708-A506_S14_L001_R1.fastq R708-A506_S14_L001_R2.fastq > test_F.txt

Results: STRaitRazorv2s:

marker name DoC seq Reverse Complement Repeats
CSF1PO:9 56 bases AAGATAGATAGATTAGATAGATAGATAGATAGATAGATAGATAGATAGATAGGAAG CTTCCTATCTATCTATCTATCTATCTATCTATCTATCTATCTAATCTATCTATCTT 37
CSF1PO:10 60 bases AAGATAGATAGATTAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGGAAG CTTCCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTAATCTATCTATCTT 839

Amelogenin:Y 69 bases 1
CSF1PO:9 56 bases 37
CSF1PO:10 60 bases 839

STRaitRazorv3

marker name DoC seq ??? ???
CSF1PO:9 56 bases CTTCCTATCTATCTATCTATCTATCTATCTATCTATCTATCTAATCTATCTATCTT 0 37
CSF1PO:10 60 bases CTTCCTATCTATCTATCTATCTATCTATCTATCTATCTATCTATCTAATCTATCTATCTT 0 839

Now: The STRR-v2 output file (allsequences.txt) imports correctly into "STRait Razor v2s Analysis.xlsm" (the workbook shipped with v2). But when I import the STRR-v3 output file (test_F.txt) into the same workbook, an error window appears.

Is the STRR-v3 output format compatible with the STRR-v2 format?

Ahhgust commented 6 years ago

I'm very sorry for getting back to you so late on this!

Yes, the format changed: the two counts reported are the numbers of reads on the + and - strands (relative to the config) for a given haplotype. You can turn this off with the -n flag, which I believe creates output that is backwards compatible with the previous versions, though this is lossy (that is, you lose the per-strand information). A better solution is to use v3 together with the v3 workbook (see the first link on the splash page; it is also available here: https://www.dropbox.com/s/t3n0d2h6od0qek2/STRait%20Razor%20Analysis%20v3.xlsm?dl=1 ).
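To make the "two counts" concrete, here is a minimal Python sketch that reads a v3-style line and collapses the + and - strand counts into a single total. It is not part of STRait Razor and is not claimed to reproduce the -n output or the v2s file layout. The names V3Record, parse_v3_line and total_reads are hypothetical, and the six whitespace-separated fields (marker:allele, length, the word "bases", sequence, + count, - count) are an assumption based only on the lines quoted in this thread.

```python
from typing import NamedTuple


class V3Record(NamedTuple):
    marker_allele: str  # e.g. "CSF1PO:9"
    length: int         # haplotype length in bases
    sequence: str       # haplotype sequence as reported
    plus_reads: int     # reads on the + strand (relative to the config)
    minus_reads: int    # reads on the - strand


def parse_v3_line(line: str) -> V3Record:
    """Parse one v3-style line; the six-field layout is an assumption."""
    fields = line.split()
    if len(fields) != 6 or fields[2] != "bases":
        raise ValueError(f"unexpected line layout: {line!r}")
    marker_allele, length, _unit, sequence, plus, minus = fields
    return V3Record(marker_allele, int(length), sequence, int(plus), int(minus))


def total_reads(record: V3Record) -> int:
    """Collapse both strand counts into one total; strand detail is lost."""
    return record.plus_reads + record.minus_reads
```

For the CSF1PO:9 line quoted above (counts 0 and 37), total_reads returns 37, which matches the single count v2s reported for that allele in this particular sample.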

By the way, the underlying algorithm itself changed as well, so the read counts need not correspond perfectly between, say, v2s and v3 (though in practice the difference is small).

My apologies for letting this correspondence slip through the cracks! And if you encounter any other issues please let me know.

-August