PacificBiosciences / HiFiCNV

Copy number variant caller and depth visualization utility for PacBio HiFi reads

Use CHM13v2.0.fa as reference #27

Closed DayTimeMouse closed 3 months ago

DayTimeMouse commented 4 months ago

Hi,

I want to use CHM13v2.0 as the reference. How do I get expected_cn.XXXX.bed and cnv.excluded_regions.XXXX.bed for it? UCSC does not provide these auxiliary data files.

The VCF passed to --maf was generated by DeepVariant. Should I use only the sites with FILTER=PASS?

Best wishes.

holtjma commented 4 months ago

Hello @DayTimeMouse,

> I want to use CHM13v2.0 as the reference. How do I get expected_cn.XXXX.bed and cnv.excluded_regions.XXXX.bed for it? UCSC does not provide these auxiliary data files.

The expected CN files we provide currently just annotate the PAR regions on X and Y and give the expected copy number for each sample type (XX or XY). I'm not sure where this information is available for CHM13, but if you have access to it, you should be able to make one with relative ease.
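
As a rough illustration of the point above, here is a minimal Python sketch that tiles a sex chromosome into PAR and non-PAR intervals and writes them as BED rows. It assumes a four-column layout (chrom, start, end, expected copy number) with 0-based, half-open coordinates; check that against the hg38 files shipped with HiFiCNV before relying on it. The function names and the coordinates in the usage note are hypothetical placeholders, not real CHM13 values, and the copy numbers to assign depend on how the PARs are represented in your reference.

```python
def expected_cn_rows(chrom, chrom_len, par_intervals, par_cn, non_par_cn):
    """Tile one chromosome into (chrom, start, end, cn) rows.

    par_intervals: list of (start, end) pairs, 0-based half-open,
    non-overlapping. Everything outside them gets non_par_cn.
    """
    rows = []
    pos = 0
    for start, end in sorted(par_intervals):
        if not (pos <= start < end <= chrom_len):
            raise ValueError(f"bad or overlapping interval: {(start, end)}")
        if start > pos:
            # Gap before this PAR is non-PAR sequence.
            rows.append((chrom, pos, start, non_par_cn))
        rows.append((chrom, start, end, par_cn))
        pos = end
    if pos < chrom_len:
        # Trailing non-PAR sequence up to the chromosome end.
        rows.append((chrom, pos, chrom_len, non_par_cn))
    return rows


def to_bed(rows):
    """Format rows as tab-separated BED text (assumed 4-column layout)."""
    return "".join(f"{c}\t{s}\t{e}\t{cn}\n" for c, s, e, cn in rows)
```

For example, `to_bed(expected_cn_rows("chrX", 1000, [(0, 100), (900, 1000)], 2, 1))` produces three rows for a toy 1000 bp chromosome with a PAR at each end. Real chromosome lengths can be read from the reference's .fai index, and the PAR boundaries have to come from a CHM13 annotation source.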

The excluded regions are more difficult. We generated those by looking at a cohort that was run on hg38, as described here: https://github.com/PacificBiosciences/HiFiCNV/blob/main/docs/aux_data.md. If you want one for CHM13, you'll likely need to create it yourself.
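
One possible way to build such a file from your own cohort is sketched below. This is a hypothetical heuristic, not the procedure documented in aux_data.md: it bins one chromosome, counts how many samples have a CNV call touching each bin, and reports runs of bins hit in at least `min_fraction` of samples as candidate excluded intervals. The function name, bin size, and threshold are all assumptions to tune.

```python
def recurrent_bins(calls_per_sample, chrom_len, bin_size, min_fraction):
    """Return merged (start, end) intervals hit by calls in many samples.

    calls_per_sample: one list of (start, end) CNV calls per sample,
    0-based half-open, all on the same chromosome.
    """
    if not calls_per_sample:
        raise ValueError("need at least one sample")
    n_bins = -(-chrom_len // bin_size)  # ceiling division
    counts = [0] * n_bins
    for calls in calls_per_sample:
        hit = set()  # count each bin at most once per sample
        for start, end in calls:
            last = min(n_bins, -(-end // bin_size))
            hit.update(range(start // bin_size, last))
        for b in hit:
            counts[b] += 1

    threshold = min_fraction * len(calls_per_sample)
    intervals = []
    run_start = None
    for b, count in enumerate(counts):
        if count >= threshold and run_start is None:
            run_start = b
        elif count < threshold and run_start is not None:
            intervals.append((run_start * bin_size, b * bin_size))
            run_start = None
    if run_start is not None:
        # Run extends to the chromosome end; clip to its true length.
        intervals.append((run_start * bin_size, chrom_len))
    return intervals
```

Running this per chromosome on calls made without any exclusion file and writing the intervals as three-column BED would give a first draft, which should then be reviewed by hand.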

> The VCF passed to --maf was generated by DeepVariant. Should I use only the sites with FILTER=PASS?

Either should work; this is just an auxiliary track that you may find useful for interpretation.

Matt

holtjma commented 3 months ago

Closing due to inactivity. Feel free to re-open if you have follow-up questions.