Closed DSuveges closed 9 months ago
It works. When a double-sized window is used to capture the locus, we get roughly double the number of variants (the per-locus increase ranges from about 1.4x to 3.5x), while the list of semi-indices stays identical:
+----------------+-----------+----------+------------------+
|variantId |short_locus|long_locus|increase |
+----------------+-----------+----------+------------------+
|10_63267850_A_T |342 |882 |2.5789473684210527|
|20_45916409_C_T |506 |1032 |2.039525691699605 |
|8_18415790_G_C |959 |1904 |1.9854014598540146|
|16_30907166_C_G |111 |274 |2.4684684684684686|
|7_73520180_A_G |142 |240 |1.6901408450704225|
|12_124002131_A_G|401 |844 |2.1047381546134662|
|12_57398797_C_T |188 |475 |2.526595744680851 |
|19_44911194_T_C |257 |675 |2.6264591439688716|
|2_164656581_T_C |454 |865 |1.9052863436123348|
|15_42391589_G_A |278 |768 |2.762589928057554 |
|6_31297713_T_C |1970 |2909 |1.4766497461928934|
|13_73991363_A_G |667 |1311 |1.9655172413793103|
|8_10826419_G_C |687 |1476 |2.148471615720524 |
|3_136207780_G_T |222 |444 |2.0 |
|2_226234464_C_T |553 |946 |1.7106690777576854|
|11_61802358_C_T |318 |652 |2.050314465408805 |
|8_125478730_A_T |479 |891 |1.860125260960334 |
|1_230169566_G_A |559 |1003 |1.7942754919499107|
|8_19986711_A_G |667 |1430 |2.143928035982009 |
|16_56970977_G_A |490 |911 |1.8591836734693878|
|19_19296909_T_C |245 |505 |2.061224489795918 |
|4_87109109_G_T |365 |760 |2.0821917808219177|
|2_21002409_C_T |549 |1164 |2.120218579234973 |
|22_38150026_T_C |318 |602 |1.8930817610062893|
|6_31979683_G_T |498 |1767 |3.5481927710843375|
|10_93079885_G_A |526 |1048 |1.9923954372623573|
|11_116778201_G_C|573 |1111 |1.9389179755671901|
|2_27508073_T_C |218 |471 |2.1605504587155964|
|15_58438954_G_C |726 |1250 |1.721763085399449 |
|1_62560271_G_T |314 |972 |3.0955414012738856|
|6_32543895_T_A |1474 |2663 |1.8066485753052917|
|15_43953733_A_T |273 |393 |1.4395604395604396|
|5_56565959_C_T |699 |1289 |1.844062947067239 |
|2_28121418_G_A |370 |829 |2.2405405405405405|
|5_157052312_G_C |449 |1003 |2.2338530066815143|
|7_72664689_C_T |169 |384 |2.272189349112426 |
+----------------+-----------+----------+------------------+
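For reference, the `increase` column is `long_locus` divided by `short_locus`. A minimal sketch in plain Python, using three rows copied from the table above; the `rows` list and the `increase` helper are illustrative names, not part of the codebase:

```python
# Three rows copied from the table above: (variantId, short_locus, long_locus).
rows = [
    ("10_63267850_A_T", 342, 882),
    ("3_136207780_G_T", 222, 444),
    ("6_31979683_G_T", 498, 1767),
]


def increase(short_locus: int, long_locus: int) -> float:
    """Ratio of variant counts between the long and the short window."""
    return long_locus / short_locus


ratios = {variant_id: increase(short, long) for variant_id, short, long in rows}

for variant_id, ratio in ratios.items():
    print(f"{variant_id}\t{ratio:.2f}")
```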
Currently, when distance-based clumping is applied to a summary statistics dataset, the same distance is also used to collect the surrounding single-point associations for pattern-based colocalisation. It would be better to make this more flexible, e.g. apply clumping with a +/-500 kbp window and collect the locus within +/-250 kbp of each semi-index.
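To illustrate the idea of decoupling the two distances, here is a rough sketch in plain Python. It is not the actual implementation: the `Association` class, the `distance_clump` and `collect_locus` functions, the simplified greedy clumping rule, the `5e-8` significance threshold and the toy data are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    chromosome: str
    position: int
    p_value: float


def distance_clump(associations, clump_distance, p_threshold=5e-8):
    """Simplified greedy clumping (assumed rule, for illustration only).

    Significant associations are visited from most to least significant.
    One becomes a semi-index unless an already selected semi-index lies
    on the same chromosome within `clump_distance` base pairs.
    """
    significant = sorted(
        (a for a in associations if a.p_value <= p_threshold),
        key=lambda a: a.p_value,
    )
    leads = []
    for candidate in significant:
        if all(
            candidate.chromosome != lead.chromosome
            or abs(candidate.position - lead.position) > clump_distance
            for lead in leads
        ):
            leads.append(candidate)
    return leads


def collect_locus(associations, lead, locus_distance):
    """All single-point associations within `locus_distance` of a semi-index."""
    return [
        a
        for a in associations
        if a.chromosome == lead.chromosome
        and abs(a.position - lead.position) <= locus_distance
    ]


# Toy data, made up for this example.
assocs = [
    Association("1", 1_000_000, 1e-12),
    Association("1", 1_200_000, 1e-9),
    Association("1", 1_400_000, 0.03),
    Association("1", 900_000, 0.2),
    Association("1", 2_000_000, 1e-10),
    Association("2", 1_000_000, 1e-8),
]

# Clump with +/-500 kbp, then collect each locus with an independent window.
leads = distance_clump(assocs, clump_distance=500_000)
short_loci = {lead: collect_locus(assocs, lead, 250_000) for lead in leads}
long_loci = {lead: collect_locus(assocs, lead, 500_000) for lead in leads}
```

Because the clumping distance is the same in both runs, the semi-indices are identical and only the number of variants collected around each of them changes, which is the behaviour reported in the table above.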