locus file problem - Githubissues

cwnag-c commented 1 year ago

Hello, I used the "ldblock" program to create breakpoints. It found 2425 breakpoints, and I generated 2425 blocks using the "make.blocks()" function. However, this result is not consistent with the "LAVA_s2500_m25_f1_w200.blocks" file, which contains 2495 blocks. I used the 1,000 Genomes phase 3 reference data. In which step did I make a mistake? I would be very grateful for your help in resolving this issue. The command and log are as follows:

command:

    /content/drive/MyDrive/other/lava-partitioning-main/ldblock \
      /content/drive/MyDrive/LAVA/g1000_eur/g1000_eur \
      -min-size 2500 \
      -out /content/drive/MyDrive/LAVA/eur

log:

    Reading /content/drive/MyDrive/LAVA/g1000_eur/g1000_eur.fam... found 503 individuals in data
    Reading /content/drive/MyDrive/LAVA/g1000_eur/g1000_eur.bim... found 22665064 SNPs (out of 22665064)
    Preparing file /content/drive/MyDrive/LAVA/g1000_eur/g1000_eur.bed...

    Computing correlations...
      window = 200
      MAF threshold = 0.01
      retained 8809808 SNPs after filtering

    Computing break points...
      minimum size = 2500
      minimum proportion = 0.1
      metric margin = 0.01
      metric maximum = 0.25
      found 2425 break points

Thank you, ChaoWang

cadeleeuw commented 1 year ago

Hi,

The ldblock program is meant to be run per chromosome, so you first need to split the reference data into separate PLINK data sets, one for each chromosome, and then run ldblock on them one at a time. Once all chromosomes are processed, this should give you the same block solution with 2495 blocks.

Best, Christiaan
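
For readers following along, the per-chromosome workflow described in the reply above might be sketched roughly as follows. This is only an illustration, not the maintainers' script: it assumes PLINK 1.9 is available on the PATH as `plink`, that only autosomes 1 to 22 are wanted, and that `-min-size 2500` (copied from the command in the question) is the only ldblock option needed. The names `REF`, `OUT_DIR`, `LDBLOCK`, `DRY_RUN` and the `chrN` output prefixes are hypothetical. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical locations; adjust to your own setup.
REF=${REF:-g1000_eur}          # PLINK prefix of the full reference data
OUT_DIR=${OUT_DIR:-per_chr}    # directory for per-chromosome files
LDBLOCK=${LDBLOCK:-./ldblock}  # path to the ldblock binary

# Print the command when DRY_RUN=1 (the default), execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

split_and_partition() {
  local chr
  run mkdir -p "$OUT_DIR"
  for chr in $(seq 1 22); do   # assumption: autosomes only
    # Extract one chromosome into its own PLINK data set.
    run plink --bfile "$REF" --chr "$chr" --make-bed --out "$OUT_DIR/chr$chr"
    # Run ldblock on that chromosome only (option taken from the thread).
    run "$LDBLOCK" "$OUT_DIR/chr$chr" -min-size 2500 -out "$OUT_DIR/chr$chr"
  done
}

split_and_partition
```

Running it with `DRY_RUN=0` executes the commands instead of printing them. The per-chromosome break point output would then still need to be turned into blocks with "make.blocks()" and combined across chromosomes, which this sketch does not cover.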

cwnag-c commented 1 year ago

It works, thanks very much! @cadeleeuw

josefin-werme / LAVA

locus file problem #46