bmansfeld / QTLseqr

QTLseqr is an R package for QTL mapping using NGS Bulk Segregant Analysis
64 stars 42 forks source link

trouble with plotQTLstats #28

Closed thuet206 closed 4 years ago

thuet206 commented 4 years ago

Hello!

I'm having some issues with the plotQTLStats function. I have used QTLseqr in the past on other datasets and haven't had any issues. I prepared my current dataset using the exact same scripts, however this time plotQTLStats is throwing me an error.

When I input:

f4 <- plotQTLStats(SNPset = G_filt_five_hr, var = "Gprime", plotThreshold = TRUE, q = 0.01) f4

I get the output:

Error in f(..., self = self) : Breaks and labels are different lengths In addition: Warning message: In x/limits[i] : longer object length is not a multiple of shorter object length

Any idea what may be causing this?

Thanks for the help, Tanner

bmansfeld commented 4 years ago

Hi Tanner, This seems to be an issue with the format_genomic() function that changes the formatting of the x-axis ticks in the plot. I'm not exactly sure why it's doing that - could be some strange discrepancy with the length of a specific chrom or something.

One thing to try is to isolate it using the subset parameter and see if it happens on all chroms or just a specific one.

The other thing to do is to just manually use ggplot to plot the data. Everything needed to plot the results should exist in the df you passed to he functions.

Let me know if you have any other questions.

Ben

thuet206 commented 4 years ago

Hi Ben,

Sorry for the slow reply. We have been scrambling to convert our teaching curriculum into an online format due to covid-19 related class cancellations.

Using the subset parameter I have been able to identify the problem chromosome. Interestingly enough I can get the plotQTLstats function to work when subsetting the problem chromosome by itself. For some reason it only breaks when trying to run all of the chromosomes together.

I have a theory, but do not know how to test it. The problem chromosome happens to have the largest QTL peak in my dataset. The peak is about 3x the size of the next largest peak. I have a hunch that this is somehow responsible for the error. The problem chromosome is an intermediate sized chromosome so I don't think its strictly an x-axis size problem. Do you have any ideas? I attempted to begin running through this manually with ggplot but I'm afraid this may be beyond my current R capabilities.

Thanks again for the help, Tanner

bmansfeld commented 4 years ago

hey Tanner. I'm sure it's been hectic - Stay healthy and safe!

Yeah it's strange it happens only on all of the Chrms together.. I wonder what it is...

try plotting with this: ggplot(data = G_filt_five_hr) + geom_line(aes(x = POS, y = Gprime)) Just as a basic plotting function, to see what happens. We can make it pretty later... Let me know how it worked,

thuet206 commented 4 years ago

Thanks Ben,

The manual ggplot code is more straightforward than I had imagined. I could get that to run, however I am not sure what to make of the output. It looks like all 16 chromosomes are possibly stacked on each other here:

image

bmansfeld commented 4 years ago

sorry my bad... wasn't thinking and didn't run the code before sending it...

you need to add the facets function which separates the plots by chromosome:

# Makes the plot:
p <- ggplot(data = G_filt_five_hr) + 
   geom_line(aes(x = POS, y = Gprime)) +
   facet_grid(~ CHROM, scales = "free_x", space = "free_x")

# prints the plot:
p

In order to add the gprime threshold you should first run the FDR threshold function and then extract the corresponding Gprime for that fdr (this should be available for use in the most recent version of QTLseqr so you might need to update or copy the code for the function from the github and then run in prior to using it): fdrT <- getFDRThreshold(G_filt_five_hr$pvalue, alpha = 0.01) and save that number then add GprimeT <- G_filt_five_hr[which(G_filt_five_hr$pvalue == fdrT), "Gprime"]

Finally add a horizontal line to your plots for that Gprime value:

p + geom_hline(
                aes(yintercept = GprimeT),
                color = "red",
                size = 1,
                alpha = 0.4
            )

This should work... If you find some time in all this craziness, I really recommend that you learn some ggplot. A good starting point will be using this data and messing around with the plots. it can be frustrating but also fun

thuet206 commented 4 years ago

Awesome, that worked. Now I can simply repeat those steps for the Δsnp-index graphs as well. If I find myself with some downtime I will definitely dive deeper into learning ggplot. Its been on my to-do list for a while.

Thank you Ben for all the help,

Tanner