get_lead_snps region_size parameter

totajuliusd / topr

topr is a collection of plotting functions for visualizing and exploring genetic association results. Association results from multiple phenotypes can be viewed simultaneously, over the entire genome (Manhattan plot) or in the more detailed regional view.

Other

49 stars 13 forks source link

CHR	POS	PVAL
chr1	1000	0.00001
chr1	2500	0.035
chr1	50000	0.8
chr1	97000	0.000001
chr1	101000	0.00002

CHR

POS

PVAL

chr1

1000

0.00001

chr1

2500

0.035

chr1

50000

0.8

chr1

97000

0.000001

chr1

101000

0.00002

Now all of the sudden `get_lead_snps` will return this:	CHR	POS	PVAL
chr1	97000	0.000001
chr1	101000	0.00002

Now all of the sudden get_lead_snps will return this:

CHR

POS

PVAL

chr1

97000

0.000001

chr1

101000

0.00002

Hi and thank you for your message. I am aware of the issue you are pointing out, but for the work I have done so far, this functionality has sufficed with some workarounds.

I usually use the manhattan function to get a quick overview of the association peaks and their nearest genes. When I encounter cases like you describe, where two markers that are very close to one another get labelled, I usually solve it by increasing the region_size argument. If that doesnt solve it, I create a separate list containing the markers I want to label to have more control over what gets displayed. For example,

lead.snps <- get_lead_snps() %>% annotate_with_nearest_gene()
#remove or add snps to the lead.snps list
#and then plot
manhattan(list(CD_UKBB,  lead.snps), annotate=c(1e-100, 1e-9), color=c("darkblue","darkblue"))

Note that I set the first value in the vector I pass to the annotate argument to 1e-100, so that nothing gets annotated in the CD_UKBB datasets. I then assign the same color (darkblue) to both datasets so all datapoints look the same.

Having said that, package contributions that add and improve functionality are always more than welcome!

totajuliusd / topr

get_lead_snps region_size parameter #65