zhoudreames closed this issue 1 year ago
@mobinasri I am so sorry to disturb you. Can you help me? Thanks!
@zhoudreames Sorry for the late response. In general it is fine to have an empty duplication BED file. But it seems you only used the whole-genome coverage distribution, which may lack sensitivity in some cases. The reason is explained in this link: https://github.com/mobinasri/flagger/blob/main/docs/flagger/README.md#1-window-specific-models
It is recommended to perform all the correction steps. One of them is splitting the assembly into windows of length 5-10 Mb and then running fit_gmm.py on each window separately.
This can help the model detect false duplications when the duplication rate is low and not visible in the whole-genome coverage distribution.
It is easier to use the WDL files (Flagger_Preprocess and Flagger.wdl) provided for this purpose.
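For illustration only, the windowing idea can be sketched in a few lines of Python. This is not Flagger's own code, and the function name `make_windows`, the contig names, and the contig lengths are all made up for the example. It only shows how an assembly could be cut into fixed-size windows (5 Mb here) so that a model can then be fitted on each window separately.

```python
def make_windows(contig_lengths, window_size=5_000_000):
    """Yield (contig, start, end) tuples as BED-style half-open intervals.

    contig_lengths: dict mapping contig name -> length in bases
                    (for example, parsed from a .fai index).
    window_size:    window length in bases; 5 Mb is the lower end of the
                    5-10 Mb range mentioned above.
    """
    for contig, length in contig_lengths.items():
        for start in range(0, length, window_size):
            # The last window of a contig may be shorter than window_size.
            yield contig, start, min(start + window_size, length)


# Hypothetical contig lengths, for demonstration only.
contig_lengths = {"contig_1": 12_000_000, "contig_2": 3_000_000}

windows = list(make_windows(contig_lengths))
for contig, start, end in windows:
    print(f"{contig}\t{start}\t{end}")
```

Note that short contigs and the tail of each contig end up in windows smaller than 5 Mb in this sketch. How the actual pipeline handles such cases may differ, so the WDL workflows remain the safer route.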
Following your Flagger pipeline, I finally obtained the four files: haploid.bed (99.55%), error.bed (0.013%), collapsed.bed (0.437%), and duplicated.bed (0%). The duplicated.bed file is completely empty. I don't know what the problem is. How should I deal with it? Thanks. This is my code: