Closed kscott-1 closed 5 months ago
Hi and thank you for your message. I am aware of the issue you are pointing out, but for the work I have done so far, this functionality has sufficed with some workarounds.
I usually use the manhattan function to get a quick overview of the association peaks and their nearest genes. When I encounter cases like the one you describe, where two markers that are very close to one another both get labelled, I usually solve it by increasing the region_size argument. If that doesn't solve it, I create a separate list containing the markers I want to label, which gives me more control over what gets displayed. For example:
lead.snps <- CD_UKBB %>% get_lead_snps() %>% annotate_with_nearest_gene()
# remove or add SNPs to the lead.snps list
# and then plot
manhattan(list(CD_UKBB, lead.snps), annotate=c(1e-100, 1e-9), color=c("darkblue","darkblue"))
Note that I set the first value in the vector I pass to the annotate argument to 1e-100, so that nothing gets annotated in the CD_UKBB dataset. I then assign the same color (darkblue) to both datasets so that all data points look the same.
Having said that, package contributions that add and improve functionality are always more than welcome!
Hi there, great package you've built here. I have a bit of an issue with how the region_size parameter is designed to work. The way it is set up, the lead SNPs result solely from fixed blocks of the region size across the genome. I do not think this is the best approach, because it allows lead SNPs to be any arbitrary number of bp away from each other. If region_size=100000, with block 1 -> pos=1-100000 and block 2 -> pos=100001-200000, the function will look for SNPs passing the significance threshold and then keep the lead marker in each block. The problem arises when, say, you had results like these:
get_lead_snps will return this:
If I specify a region size of 100000 bp, I would not want any markers within +/-50000 bp of either side of a resultant marker to also be output. This is a direct problem when annotating a Manhattan plot, because the annotate function directly calls the get_lead_snps function with this region_size logic.
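To make the difference concrete, here is a minimal base-R sketch that contrasts fixed-block selection (as the behaviour is described above) with a distance-based alternative. It is not the package's code: the data frame, its column names (CHROM, POS, P), and the helper functions lead_by_block and lead_by_distance are all hypothetical and exist only for illustration.

```r
# Hypothetical significant markers on one chromosome (made-up values).
snps <- data.frame(
  CHROM = "chr1",
  POS   = c(40000, 99000, 101000, 260000),
  P     = c(1e-9, 1e-12, 1e-15, 1e-10)
)

# Fixed blocks: keep the marker with the smallest P in each
# region_size-wide block (block 1 = 1-100000, block 2 = 100001-200000, ...).
lead_by_block <- function(df, region_size) {
  block <- paste(df$CHROM, (df$POS - 1) %/% region_size)
  leads <- do.call(rbind, lapply(split(df, block), function(b) b[which.min(b$P), ]))
  leads <- leads[order(leads$CHROM, leads$POS), ]
  rownames(leads) <- NULL
  leads
}

# Distance-based: take the most significant remaining marker, drop every
# marker within region_size / 2 bp of it on the same chromosome, repeat.
lead_by_distance <- function(df, region_size) {
  remaining <- df[order(df$P), ]
  leads <- remaining[0, ]
  while (nrow(remaining) > 0) {
    top <- remaining[1, ]
    leads <- rbind(leads, top)
    near <- remaining$CHROM == top$CHROM &
      abs(remaining$POS - top$POS) <= region_size / 2
    remaining <- remaining[!near, ]
  }
  leads <- leads[order(leads$CHROM, leads$POS), ]
  rownames(leads) <- NULL
  leads
}

block_leads    <- lead_by_block(snps, 100000)
distance_leads <- lead_by_distance(snps, 100000)

block_leads$POS     # 99000 101000 260000 (two leads only 2000 bp apart)
distance_leads$POS  # 40000 101000 260000 (all leads more than 50000 bp apart)
```

With fixed blocks, the markers at 99000 and 101000 fall on opposite sides of a block boundary, so both are returned even though they almost certainly tag the same peak. The distance-based version keeps only the stronger of the two and guarantees that no two leads on the same chromosome sit within region_size / 2 of each other.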