Closed adf-ncgr closed 3 years ago
The cowpea QTL markers from https://legumeinfo.org/data/public/Vigna_unguiculata/mixed.qtl.KF1G/ seem to match up with this file. Is that correct?
Note that all the start and end positions are equal, while they may be different in the GFF file we use for GWAS data (which seems backward),
> The cowpea QTL markers from https://legumeinfo.org/data/public/Vigna_unguiculata/mixed.qtl.KF1G/ seem to match up with this file. Is that correct?
yes I believe so
> Note that all the start and end positions are equal, while they may be different in the GFF file we use for GWAS data (which seems backward),
that's the gene annotation file; the mrk file is the one that has the SNPs (which have start=end because they are in fact "single nucleotide polymorphisms"). You must be using the same mrk file already for the GWAS SNPs (in addition to the annotation file for the gene display).
let me know if I misunderstood.
That is correct. I was confusing the mrk file with the annotations file, now it makes sense.
Some QTL marker files have a Distinction column with values like Flanking, Peak, or (blank), some do not.
From the QTL file specification, it looks like we create a QTL as follows.
that sounds correct to me. and you can ignore "Distinction" although we could imagine that in most cases if Flanking markers are given those should be the extreme values that you end up selecting.
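To make the construction concrete, here is a minimal sketch in Python (not the actual ZBrowse code, which is R). It assumes a QTL range is simply the smallest start and largest end among its markers, with the Distinction column ignored as suggested above. The function name, the tuple layouts, and the sample positions are all made up for illustration; markers that are not found in the position table are skipped.

```python
from collections import defaultdict

def build_qtl_ranges(qtl_markers, marker_positions):
    """Return {(qtl_id, chrom): (start, end)} spanning each QTL's markers.

    qtl_markers: iterable of (qtl_id, marker_name) pairs (Distinction ignored).
    marker_positions: {marker_name: (chrom, start, end)}, e.g. read from the mrk GFF.
    """
    spans = defaultdict(list)
    for qtl_id, marker in qtl_markers:
        if marker not in marker_positions:
            continue  # marker absent from the GFF: skip it
        chrom, start, end = marker_positions[marker]
        spans[(qtl_id, chrom)].append((start, end))
    # The extreme marker positions define the QTL interval.
    return {
        key: (min(s for s, _ in pos), max(e for _, e in pos))
        for key, pos in spans.items()
    }

# Made-up marker positions, for illustration only (SNPs have start == end).
positions = {
    "m1": ("Vu05", 900_000, 900_000),
    "m2": ("Vu05", 2_400_000, 2_400_000),
    "m3": ("Vu05", 1_500_000, 1_500_000),
}
pairs = [("qtlA", "m1"), ("qtlA", "m2"), ("qtlA", "m3"), ("qtlA", "missing")]
ranges = build_qtl_ranges(pairs, positions)
print(ranges)
```

Grouping by chromosome as well as QTL id is a design choice of this sketch: it avoids building a nonsensical range if a QTL's markers were ever listed on more than one chromosome.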
How long (wide?) can a QTL be? Is 1-2 Mbp typical?
yes, they can be wide; several Mbp is not at all atypical
Typo in the QTL marker file
In the line
SC.Sanzi_x_Vita7 2_19309 Peak
the second separator is a space, it should be changed to a tab and re-gzipped.
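A quick way to catch this kind of separator typo is to count tab-separated columns per line. The sketch below is hypothetical Python, not part of the repository. It assumes three columns per line (adjust for files without a Distinction column), no header line, and that the first separator in the offending line is a tab, which the report above implies but does not state.

```python
import gzip

def malformed_lines(lines, expected_columns):
    """Return (line_number, line) for non-empty lines that do not split
    into expected_columns fields on tabs."""
    bad = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if line and len(line.split("\t")) != expected_columns:
            bad.append((number, line))
    return bad

def check_gzipped(path, expected_columns):
    """Run the same check on a gzipped marker file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return malformed_lines(handle, expected_columns)

sample = [
    "QTL1\tmarkerA\tFlanking\n",         # made-up well-formed line
    "SC.Sanzi_x_Vita7\t2_19309 Peak\n",  # the line above: space before "Peak"
]
bad = malformed_lines(sample, expected_columns=3)
print(bad)
```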
Give me a second, I'll take care of it tout de suite
No rush, I am working off local copies.
the rush is that I'll immediately forget!
Hmm looks like @sammyjava has selfish permissions on this folder/files, so I will have to punt to him anyway.
Also, the markers in
do not match those in the GFF file. Ignoring these for now.
A simple-minded approach to displaying the QTLs in the Whole Genome chart is to map the QTL range to the Support Intervals feature, which we do not normally use. This leads to some mislabeling in the trait names on the chart and the column names in the Data Table, but serves as a proof of concept. The Chromosome chart works the same way. The height of each QTL is arbitrary (1.0, 1.1, 1.2 ... as needed to not hide any).
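One possible way to pick the heights, sketched in Python (the actual code is R and may simply increment the height per QTL): greedily place each interval in the lowest "lane" that is free at its start position, then turn the lane number into a height of 1.0, 1.1, 1.2, and so on. The function name and the sample intervals are made up for illustration.

```python
def assign_heights(intervals, base=1.0, step=0.1):
    """Return (start, end, height) tuples so that overlapping intervals
    never share a height. Intervals are processed in sorted order."""
    lane_ends = []  # end position of the last interval placed in each lane
    placed = []
    for start, end in sorted(intervals):
        for lane, last_end in enumerate(lane_ends):
            if start > last_end:       # lane is free again: reuse it
                lane_ends[lane] = end
                break
        else:                          # every lane is occupied: open a new one
            lane = len(lane_ends)
            lane_ends.append(end)
        placed.append((start, end, round(base + step * lane, 10)))
    return placed

# Made-up QTL ranges, for illustration only.
qtls = [(100, 500), (300, 900), (350, 400), (600, 1200), (1000, 1500)]
heights = assign_heights(qtls)
print(heights)
```

Reusing lanes keeps the number of distinct heights small, so a chart with many QTLs does not grow taller than it needs to.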
Next, I will try to get an idea of how easy it will be to adapt the Support Intervals code, compared to adding new code (or even rewriting ZBrowse from scratch). Also, should GWAS points and QTL bars go on the same chart? Currently, each has its own dataset and therefore goes on a separate chart.
That's great- as far as I can tell from looking at the ZBrowse publication, this may in fact have been the use they were intending for the intervals: "Ability to plot both SNPs and genetic intervals. We wanted users to be able to combine the results of quantitative trait locus mapping techniques with GWAS results."
Is this at a point where you could either put on dev-legfedorg or just push to a branch on github so I could get a sense for how it behaves in "hands-on" mode?
It took a while to clean up, but I checked in the changes to master and merged to dev-legfedorg.
It was not able to create the cowpea QTL file from the remote QTL and GFF files, so I copied my local one over. (To do: figure out why.)
> It took a while to clean up, but I checked in the changes to master and merged to dev-legfedorg.
For what I guess was less than a day of effort I'd say the proof of concept is definitely on track; looks like linking in via the URL isn't quite working though. I was going to point Steven to an example of a cowpea QTL on Vu05 that matched some soybean GWAS on Gm18, but this link: http://dev.lis.ncgr.org:50003/shiny/ZBrowse/?tab=Chrom&datasets=Cowpea%20QTL&chr=Vu05&selected=907871&window=250000&datasets2=Soybean%20GWAS&chr2=Gm18&selected2=57035000&window2=250000&traits=Days%20to%20flowering;Flower%20color;Flowering%20time%20under%20long%20daylength%20at%20UCR-CES;Flowering%20time%20under%20short%20daylength%20at%20CVARS&genomicLinkage=true&neighbors=40&matched=20&intermediate=5&selectedGene=vigun.Vigun05g010900&relatedRegion=Gm18%2056.75-57.32%20Mbp
seems to take me to Vu01 instead. Let me know if you want me to file it as a separate issue...
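For debugging, it can help to decode what the link actually asks for. This is a small Python sketch using only the standard library; the URL is the one above, shortened to the dataset and position parameters (the traits and linkage parameters are left out).

```python
from urllib.parse import urlsplit, parse_qs

# First part of the link above; the remaining parameters are omitted here.
url = (
    "http://dev.lis.ncgr.org:50003/shiny/ZBrowse/?tab=Chrom"
    "&datasets=Cowpea%20QTL&chr=Vu05&selected=907871&window=250000"
    "&datasets2=Soybean%20GWAS&chr2=Gm18&selected2=57035000&window2=250000"
)

# parse_qs returns a list per key; each key appears once here, so keep the first value.
params = {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}

for key, value in params.items():
    print(f"{key} = {value}")
```

The decoded parameters show chr=Vu05 for the first dataset, so the link is requesting Vu05 and the jump to Vu01 happens on the application side.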
Confirmed, though the problem is not specific to QTL data. This should be a separate issue; I suspect it has to do with the need to reset the chromosome view when the user changes the organism (except that in this case the change is made programmatically through the URL).
> Also, the markers in
> do not match those in the GFF file. Ignoring these for now.
ignoring for now seems like the right call; for some reason the publication from which @sammyjava must have taken these seems to be using non-standard identifiers for the markers. I think I have a bead on what's going on and can follow up with the cowpea group to try to get some further clarity on it.
> It was not able to create the cowpea QTL file from the remote QTL and GFF files, so I copied my local one over. (To do: figure out why.)
This is due to running R 3.5.2 on dev-legfedorg (and production), but R 4.0 locally.
Fixed and checked in. (commit d1a502c...)
Thanks for figuring out the issue on the R versions- I'll note that I have no problem (conceptually) with upgrading the site to use R 4.0 if that will help prevent future snags (and has no other obvious drawbacks)
There is some evidence that upgrading to R 4.0 would break clicking on a SNP (see issue #7). So I would hold off for now.
By the way, R is now up to 4.0.3 "Bunny-Wunnies Freak Out".
Did you know that the R release nicknaming strategy derives from a legume-focused publication?
Is Charlie Brown any relation to Thomas Browne?
There's clearly some interest in the quincunx: https://pbs.twimg.com/media/DKw_1uLUMAAf1-X.jpg
> Hmm looks like @sammyjava has selfish permissions on this folder/files, so I will have to punt to him anyway.
I fixed this, right?
no, I think you fixed some gwas stuff elsewhere. I believe chgrp -R staff /usr/local/www/data/public/Vigna_unguiculata/mixed* /usr/local/www/data/public/Vigna_unguiculata/IT97K-499-35.gnm1.mrk.52FC will do the job needed for this one.
> no, I think you fixed some gwas stuff elsewhere. I believe chgrp -R staff /usr/local/www/data/public/Vigna_unguiculata/mixed* /usr/local/www/data/public/Vigna_unguiculata/IT97K-499-35.gnm1.mrk.52FC will do the job needed for this one.
Done.
QTLs are typically represented as fairly wide intervals (often multiple megabases) but are otherwise conceptually similar to GWAS results, in that they represent a statistical association of a genomic region with some phenotypic trait. The methodologies for GWAS and QTL experiments/analyses are different, but this makes them somewhat complementary in nature and suggests we could benefit by integrating QTL data for display, similar to what we currently do for GWAS only.
There may be some complications in terms of tabular displays (the column sets for GWAS and QTL are likely a bit different), but I am mostly thinking of the graphical displays at the moment. Something like an extra track that would display QTLs colored by traits as the GWAS SNPs are, but with range positions like genes have (no need for fwd/rev strand distinction for QTLs, but we'd want them to "tile" according to some simple algorithm).
placeholder issue for further discussion...