Closed chanhee22kim closed 1 year ago
Hi @chanhee22kim,
TL;DR: Yes, it's safe to ignore these warnings unless your sequencing data is VERY deep. The skipping exists to reduce memory requirements.
STRling is set to skip over tandem repeat regions supported by more than 65,535 reads (uint16.high.int). This prevents a memory blowout. Unless you are doing incredibly high-depth sequencing, most of the genome should be well under this threshold, but we do expect to see some of these warnings in most samples. It is common for some regions to have very high depth, for example segmental duplications, transposable elements, telomeres and centromeres. Typically, any variant call in extremely high-depth regions such as these is suspect, so STRling doesn't try to make a call in them, and it should be safe to skip over them for most applications.
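For anyone wondering where 65,535 comes from: it is the largest value an unsigned 16-bit integer can hold, which is what uint16.high.int evaluates to in Nim. The snippet below is only a Python illustration of that threshold check, not STRling's actual code; `UINT16_MAX` and `should_skip_cluster` are hypothetical names made up for this example.

```python
# Illustration only: not STRling's actual code (STRling is written in Nim).
UINT16_MAX = 2**16 - 1  # largest value an unsigned 16-bit integer can hold


def should_skip_cluster(n_reads: int, limit: int = UINT16_MAX) -> bool:
    """Return True when a cluster has more reads than the limit allows."""
    return n_reads > limit


print(UINT16_MAX)                   # 65535
print(should_skip_cluster(65_535))  # False: exactly at the limit
print(should_skip_cluster(65_536))  # True: one read over the limit
```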
I understood what you mentioned. Thank you for your kind response.
Thank you very much!
Hello,
I ran the command below for joint calling, which merges several bin files:
cat ../AD_WGS_batch1-7_STRling.txt | xargs -L 2000 strling merge -f ../../resources/chm13v2.0.fa --output-prefix ~/WGS/AD_STR/outputs/joint_bin/ > ~/WGS/AD_STR/outputs/joint_bin/str_joint_log.txt 2>&1
The command finished with logs below.
Is it okay to continue despite the skipping warnings, or should I look into other options to resolve them?
I used 1,824 samples with strling.
Thank you for providing a great tool.
Best regards, Chan
[log]
strling version: 0.5.2
[strling] read 815645 STR reads from file: WGS_0001.bin
[strling] read 501777 STR reads from file: WGS_0002.bin
...
[strling] read 102666 STR reads from file: WGS_1836.bin
[strling] read 113928 STR reads from file: WGS_1837.bin
[strling] read 123117 STR reads from file: WGS_1838.bin
More than 65535 reads in cluster with first read:(tid: 3, position: 169960672, repeat: ['G', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,READ2,DUP, split: right, mapping_quality: 47, repeat_count: 70, align_length: 70, qname: "20") skipping
More than 65535 reads in cluster with first read:(tid: 24, position: 10263, repeat: ['G', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,MREVERSE,READ2, split: none, mapping_quality: 43, repeat_count: 150, align_length: 150, qname: "428") skipping
More than 65535 reads in cluster with first read:(tid: 24, position: 15757, repeat: ['G', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,MREVERSE,READ2, split: none, mapping_quality: 60, repeat_count: 150, align_length: 150, qname: "113") skipping
More than 65535 reads in cluster with first read:(tid: 1, position: 181396416, repeat: ['A', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,READ2, split: none_right, mapping_quality: 0, repeat_count: 124, align_length: 150, qname: "49") skipping
More than 65535 reads in cluster with first read:(tid: 3, position: 169960503, repeat: ['A', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,PROPER_PAIR,MREVERSE,READ1, split: none, mapping_quality: 54, repeat_count: 128, align_length: 150, qname: "1138") skipping
More than 65535 reads in cluster with first read:(tid: 22, position: 149830784, repeat: ['G', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,PROPER_PAIR,REVERSE,READ1, split: none, mapping_quality: 60, repeat_count: 142, align_length: 150, qname: "754") skipping
More than 65535 reads in cluster with first read:(tid: 6, position: 86499722, repeat: ['G', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,MREVERSE,READ2, split: none, mapping_quality: 57, repeat_count: 144, align_length: 150, qname: "233") skipping
More than 65535 reads in cluster with first read:(tid: 24, position: 15398, repeat: ['C', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,READ2, split: none, mapping_quality: 60, repeat_count: 150, align_length: 150, qname: "97") skipping
More than 65535 reads in cluster with first read:(tid: 24, position: 15965, repeat: ['C', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,READ2, split: none, mapping_quality: 60, repeat_count: 150, align_length: 150, qname: "71") skipping
More than 65535 reads in cluster with first read:(tid: 6, position: 86500001, repeat: ['C', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,READ2, split: none, mapping_quality: 60, repeat_count: 122, align_length: 150, qname: "1715") skipping
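To see which regions were skipped, one option is to pull the tid, position and repeat unit out of each warning in the log. The following is a minimal Python sketch that assumes the warning format is exactly as shown in the log above; `summarize_skipped` and `count_by_tid` are hypothetical helper names, not part of STRling. It also assumes that tid is the index of the contig in the reference, which is not confirmed in this thread.

```python
import re
from collections import Counter
from typing import List, Tuple

# Pattern written against the warning text shown in the log above.
WARNING_RE = re.compile(
    r"More than 65535 reads in cluster with first read:"
    r"\(tid: (?P<tid>\d+), position: (?P<position>\d+), "
    r"repeat: \[(?P<repeat>[^\]]*)\]"
)


def summarize_skipped(log_text: str) -> List[Tuple[int, int, str]]:
    """Return (tid, position, repeat unit) for every skipped-cluster warning."""
    skipped = []
    for m in WARNING_RE.finditer(log_text):
        # Keep only the real bases; the '\x00' entries are padding.
        unit = "".join(re.findall(r"'([ACGTN])'", m["repeat"]))
        skipped.append((int(m["tid"]), int(m["position"]), unit))
    return skipped


def count_by_tid(skipped: List[Tuple[int, int, str]]) -> Counter:
    """Count skipped clusters per tid."""
    return Counter(tid for tid, _, _ in skipped)


# Two warnings copied from the log above (raw string keeps '\x00' literal).
sample = r"""
More than 65535 reads in cluster with first read:(tid: 3, position: 169960672, repeat: ['G', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,READ2,DUP, split: right, mapping_quality: 47, repeat_count: 70, align_length: 70, qname: "20") skipping
More than 65535 reads in cluster with first read:(tid: 24, position: 10263, repeat: ['G', '\x00', '\x00', '\x00', '\x00', '\x00'], flag: PAIRED,MREVERSE,READ2, split: none, mapping_quality: 43, repeat_count: 150, align_length: 150, qname: "428") skipping
"""

skipped = summarize_skipped(sample)
print(skipped)  # [(3, 169960672, 'G'), (24, 10263, 'G')]
print(count_by_tid(skipped))
```

In practice you would read the whole log file (str_joint_log.txt in the command above) into a string and pass it to `summarize_skipped`. If the same few locations account for all the warnings, that fits the explanation that only a handful of very high-depth regions are being skipped.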