abyzovlab / CNVpytor

a python extension of CNVnator -- a tool for CNV analysis from depth-of-coverage by mapped reads
MIT License

deletions far more than duplications #213

Closed ZYongQi closed 6 months ago

ZYongQi commented 6 months ago

I used CNVpytor to call CNVs with a bin size of 1000. Because I have no mask file, I filtered the output with a Python script using these parameters: size > 1000, q0 ≥ 0, pN < 0.5, dG ≥ 100000. I got the .gff file from NCBI and did the annotation myself. The problem is that deletions far outnumber duplications among the detected CNVs (about 100:1). I consulted several papers for reference, and I suspect the filtering parameters are the cause. Could you please give me some advice?

Best wishes! Thank you for any suggestions.
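For readers who want to reproduce this kind of check, here is a minimal sketch of the filtering described above, with a count of deletions and duplications before and after. It assumes the call table is tab-separated with the columns listed in `COLUMNS` (type, region, size, level, four e-values, q0, pN, dG); that layout, the function names, and the default thresholds are assumptions for illustration and are not taken from the original script.

```python
import csv
import io
from collections import Counter

# Assumed column order of the call table (tab-separated, no header row).
COLUMNS = ["cnv_type", "region", "size", "level",
           "e_val1", "e_val2", "e_val3", "e_val4", "q0", "pN", "dG"]


def read_calls(handle):
    """Parse tab-separated call lines into dicts with numeric filter fields."""
    calls = []
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines, comment lines, and rows with too few columns.
        if not row or row[0].startswith("#") or len(row) < len(COLUMNS):
            continue
        rec = dict(zip(COLUMNS, row))
        for key in ("size", "q0", "pN", "dG"):
            rec[key] = float(rec[key])
        calls.append(rec)
    return calls


def passes(rec, min_size=1000, min_q0=0.0, max_pn=0.5, min_dg=100_000):
    """Apply the thresholds quoted in the question: size, q0, pN, dG."""
    return (rec["size"] > min_size
            and rec["q0"] >= min_q0
            and rec["pN"] < max_pn
            and rec["dG"] >= min_dg)


def type_counts(calls):
    """Count calls per CNV type, e.g. deletion vs duplication."""
    return Counter(rec["cnv_type"] for rec in calls)


# Usage (the file name is hypothetical):
# with open("calls.tsv") as fh:
#     calls = read_calls(fh)
# kept = [c for c in calls if passes(c)]
# print(type_counts(calls), type_counts(kept))
```

Comparing the counts before and after filtering shows whether the skew toward deletions is already present in the raw calls or is introduced by the thresholds.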

arpanda commented 6 months ago

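You may consider the following strategies:

  • Adjust the bin size by increasing it, for example, from 1k to 5k.
  • Set the minimum size filter to three times the bin size (e.g., 15000 for a 5k bin).
  • Evaluate and potentially adjust the Q0 filtering parameter.
  • Identify repeat and gap regions manually, then filter out these regions.
  • If you have many samples, exclude regions exhibiting CNV in the majority of the samples, for example, more than 90%.

Thank you, Arijit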

ZYongQi commented 6 months ago

You may consider the following strategies:

  • Adjust the bin size by increasing it, for example, from 1k to 5k.
  • Set the minimum size filter to three times the bin size (e.g., 15000 for a 5k bin).
  • Evaluate and potentially adjust the Q0 filtering parameter.
  • Identify repeat and gap regions manually, then filter out these regions.
  • If you have many samples, exclude regions exhibiting CNV in the majority of the samples, for example, more than 90%.

Thank you, Arijit
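Three of the strategies above (minimum size as a multiple of the bin size, masking gap and repeat regions, and dropping regions called in most samples) can be sketched roughly as follows. This is only an illustration: calls are represented as hypothetical `(chrom, start, end, cnv_type)` tuples, masked regions as `(chrom, start, end)` tuples, and "same region" is taken to mean any overlap of same-type calls, which is a simplifying assumption (a reciprocal-overlap criterion may suit real data better).

```python
def min_call_size(bin_size, factor=3):
    """Minimum call size as a multiple of the bin size (factor 3 as suggested)."""
    return factor * bin_size


def overlaps(start, end, other_start, other_end):
    """True if two closed intervals share at least one position."""
    return start <= other_end and other_start <= end


def drop_masked(calls, masked):
    """Remove calls that overlap any masked (gap/repeat) interval.

    calls:  list of (chrom, start, end, cnv_type)
    masked: list of (chrom, start, end), e.g. collected manually
    """
    return [
        c for c in calls
        if not any(c[0] == m[0] and overlaps(c[1], c[2], m[1], m[2])
                   for m in masked)
    ]


def drop_recurrent(calls_by_sample, max_fraction=0.9):
    """Remove calls seen (same type, overlapping) in too many samples.

    calls_by_sample: dict mapping sample name -> list of call tuples.
    A call is dropped when the fraction of samples carrying an
    overlapping call of the same type exceeds max_fraction.
    """
    n_samples = len(calls_by_sample)
    kept = {}
    for sample, calls in calls_by_sample.items():
        kept[sample] = []
        for c in calls:
            # Count samples (including this one) with a matching call.
            hits = sum(
                any(c[0] == o[0] and c[3] == o[3]
                    and overlaps(c[1], c[2], o[1], o[2])
                    for o in others)
                for others in calls_by_sample.values()
            )
            if hits / n_samples <= max_fraction:
                kept[sample].append(c)
    return kept
```

The pairwise comparison is quadratic in the number of calls, which is fine for a quick check but would need an interval index for large cohorts.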

I'll try it soon. Thank you sincerely for your valuable advice!