bvilhjal / ldpred

# gm_ld_radius option / suggested edits #37

Open ilarsf opened 5 years ago

ilarsf commented 5 years ago

Hi Bjarni,

Thank you for developing LDpred and for continuing to improve it. I was experimenting with the `--gm_ld_radius` option of `coord_genotypes.py` after interpolating the genetic map for my LD reference with https://github.com/joepickrell/1000-genomes-genetic-maps/blob/master/scripts/interpolate_maps.py

I ran into some issues and applied the following code changes:

  1. Line 1295 in `coord_genotypes.py`: changed `genetic_map.append(l[0])` to `genetic_map.append(float(l[2]))`. Previously the script stored the marker ID rather than the actual cM value. (Without the `float()`, the value was stored as a string, which caused another issue.)

  2. Line 56 in `ld.py`: changed `while stop_i > 0 and max_cm < curr_cm + gm_ld_radius:` to `while stop_i > 0 and stop_i + 1 < len(gm) and max_cm < curr_cm + gm_ld_radius:` to avoid issues with the last variant of a genome.

  3. Line 63 in `ld.py`: changed `curr_ws = stop_i - start_i` to `curr_ws = stop_i - start_i + 1` to accommodate variants that have no other variant within the given `gm_ld_radius`.

  4. Line 82 in `ld.py`: changed `avg_window_size = sp.mean(window_sizes)` to `avg_window_size = int(sp.mean(window_sizes))` so that the window size is an integer, as required.
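
To show the intent of these four changes in one place, here is a minimal standalone sketch. It is not the actual LDpred code: the function names `parse_genetic_map` and `window_sizes_from_map` are hypothetical, the input is assumed to be whitespace-separated lines with the cM value in the third column and sorted by position, and `statistics.mean` stands in for `sp.mean` so that the example needs only the standard library.

```python
from statistics import mean


def parse_genetic_map(lines):
    """Return cM values as floats (hypothetical helper, not LDpred's API).

    Assumes each line is "<marker_id> <position> <cM>", so the cM value is
    column index 2 and must be converted from string to float (change 1).
    """
    return [float(line.split()[2]) for line in lines if line.strip()]


def window_sizes_from_map(gm, gm_ld_radius):
    """Count, for each variant, the variants within +/- gm_ld_radius cM.

    Assumes `gm` is sorted in ascending order. The count includes the
    variant itself, so an isolated variant gets a window size of 1.
    """
    n = len(gm)
    sizes = []
    start_i = 0
    stop_i = 0
    for curr_cm in gm:
        # Advance the left edge past variants that are too far upstream.
        while gm[start_i] < curr_cm - gm_ld_radius:
            start_i += 1
        # Advance the right edge, checking the bound first so the last
        # variant never indexes past the end of the map (change 2).
        while stop_i + 1 < n and gm[stop_i + 1] <= curr_cm + gm_ld_radius:
            stop_i += 1
        # Inclusive count of variants in the window (change 3).
        sizes.append(stop_i - start_i + 1)
    return sizes


# Toy input: three nearby variants and one isolated variant.
example_lines = [
    "rs1 1000 0.0",
    "rs2 2000 0.5",
    "rs3 3000 0.9",
    "rs4 9000 3.0",
]
gm = parse_genetic_map(example_lines)
window_sizes = window_sizes_from_map(gm, gm_ld_radius=1.0)
# Truncate the mean to an integer window size (change 4).
avg_window_size = int(mean(window_sizes))
print(window_sizes, avg_window_size)  # [3, 3, 3, 1] 2
```

In this toy example the isolated last variant gets a window size of 1 rather than 0, and the average is an integer.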

It seems to run fine with these changes.

Is there anything else I need to consider when I use the --gm_ld_radius option?

Thank you! Lars

bvilhjal commented 5 years ago

Thanks a lot Lars, I will try to incorporate these changes into the code in the coming weeks.