Closed by gunns2 7 months ago
Hi Sophie,
Thank you for reaching out and for using our software! While I'm not familiar with the detailed structure of the Hail .bgz format, I can provide information about the plink binary4 file format, which is relatively straightforward. In the plink binary4 format, each number in a "full" LD matrix is stored using 4 bytes, exactly as it is stored in memory.
I've uploaded a C++ script for converting a "full" LD matrix to the binary4 format. You can find it here: utilities/convert2bin.cpp
Once compiled, you can use it as follows:
zcat $path_to_the_gz_full_LD_matrix | ./convert2bin $path_to_the_output_file
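For readers who would rather not compile the C++ utility, the layout described above can be sketched in Python. This is a minimal sketch, not the code in utilities/convert2bin.cpp. It assumes that each 4-byte value is a single-precision float in the machine's native byte order, that values are written row by row, and that the file has no header. The function names (`read_text_matrix`, `mirror_upper`, `write_bin4`) are hypothetical and chosen only for illustration. `mirror_upper` further assumes the upper-triangular input has already been loaded as a square matrix whose lower triangle can be overwritten.

```python
import struct


def read_text_matrix(path):
    """Read a whitespace-delimited text matrix into a list of rows of floats."""
    with open(path) as fh:
        return [[float(x) for x in line.split()] for line in fh if line.strip()]


def mirror_upper(rows):
    """Return a full symmetric matrix built from the upper triangle of `rows`.

    Assumption: `rows` is square and only entries with column >= row are trusted.
    """
    n = len(rows)
    full = [list(row) for row in rows]
    for i in range(n):
        for j in range(i + 1, n):
            full[j][i] = full[i][j]
    return full


def write_bin4(rows, out_path):
    """Write a square matrix as raw 4-byte floats, row by row, with no header.

    Assumption: native byte order ("=" in the struct format) matches what the
    reader of the file expects.
    """
    n = len(rows)
    with open(out_path, "wb") as out:
        for row in rows:
            if len(row) != n:
                raise ValueError("LD matrix must be square")
            out.write(struct.pack(f"={n}f", *row))
```

With a full text matrix in a hypothetical file `locus1.ld.txt`, a call such as `write_bin4(read_text_matrix("locus1.ld.txt"), "locus1.ld.bin")` should produce a file of exactly n × n × 4 bytes for n variants, which is a quick sanity check on the output.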
If you have any further questions or need assistance, please feel free to ask. I'm here to help!
My best
Kai Yuan
Hi,
This worked great, thank you so much!
Sophie
Hello,
Thanks so much for another great stat gen package! I'm hoping to run SuSiEx with an LD reference taken from the precomputed UKBB LD block matrices. I already have LD matrices for the loci I want to analyze in regular text format, as well as in the upper triangular Hail .bgz format. I think what makes the most sense is to convert the LD files that I have into the same format as the plink binary4 files that SuSiEx utilizes.
Do you have any insight into how these binary4 files are formatted and how best to go about converting? This seems like it should be possible, but I'm running into some issues trying to figure out how to convert the files.
Thanks so much!
Sophie