The regions that contain STR are unknown.

FAFUshiyan commented 2 years ago

Hi Li, This software requires a REPEAT_region.bed file, but I currently do not know which regions contain STRs. Is it not possible to use this software in that case?

fangli80 commented 2 years ago

Hello ShiYan, Currently, NanoRepeat requires a reference sequence (consensus sequence) and the repeat location (the repeat count in the reference sequence can differ from that in the sample). If you have the reference sequence but don't know where the repeat is, you can use Tandem Repeats Finder to scan the reference sequence and get the repeat region.

If you don't have the reference sequence (i.e., the consensus sequence is unknown), you can use a genome assembler (e.g., wtdbg2, Flye, metaFlye, or Canu) to generate the consensus sequence and then use Tandem Repeats Finder to scan that sequence to get the repeat region.

In future versions, we will integrate these functions, but for now, please do these steps manually.
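
As a rough illustration of the manual step, here is a minimal Python sketch that turns Tandem Repeats Finder output into BED-like rows. It assumes the tool was run so that it writes its `.dat` file, where each repeat record has 15 whitespace-separated fields (start, end, period size, copy number, ..., consensus motif, full repeat sequence) and each sequence is introduced by a `Sequence:` line. The function name `trf_dat_to_bed`, the `max_period` cutoff, and the four-column output (chrom, 0-based start, end, motif) are assumptions for this example, not part of NanoRepeat; please check the NanoRepeat README for the exact BED columns it expects.

```python
def trf_dat_to_bed(dat_text, max_period=6):
    """Convert the text of a TRF .dat file into (chrom, start, end, motif) rows.

    Hypothetical helper: max_period=6 keeps only short motifs (STR-like);
    raise it if longer repeat units are of interest.
    """
    rows = []
    chrom = None
    for line in dat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Sequence:":
            # Header line such as "Sequence: chr4 optional description"
            chrom = fields[1] if len(fields) > 1 else None
            continue
        # Repeat records start with two integers and have 15 fields
        if chrom is None or len(fields) < 15:
            continue
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        start, end = int(fields[0]), int(fields[1])
        period = int(fields[2])
        motif = fields[13]
        if period > max_period:
            continue
        # TRF coordinates are 1-based inclusive; BED is 0-based, half-open
        rows.append((chrom, start - 1, end, motif))
    return rows


def rows_to_bed_text(rows):
    """Format rows as tab-delimited BED lines."""
    return "".join(f"{c}\t{s}\t{e}\t{m}\n" for c, s, e, m in rows)
```

The resulting text can be written to a file and inspected by hand before passing it to NanoRepeat; overlapping or redundant records from Tandem Repeats Finder are not merged by this sketch.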

Best, Li

FAFUshiyan commented 2 years ago

Thank you very much for your reply.

WGLab / NanoRepeat
