legumeinfo / legumemine

An InterMine which contains multiple legumes
GNU Lesser General Public License v3.0

Proposed GWAS data model #32

Closed sammyjava closed 4 years ago

sammyjava commented 4 years ago

Here is the data model that I propose for GWAS (already implemented in SoyMine, but not using LIS datastore files, which as yet don't exist). If this model is OK, I'll create compatible files for loading this model and upload them to the LIS private datastore. @cann0010 @adf-ncgr

<?xml version="1.0"?>
<classes>

  <!-- GWAS extends Annotatable to support publications -->  
  <class name="GWAS" extends="Annotatable" is-interface="true">
    <collection name="results" referenced-type="GWASResult" reverse-reference="study"/>
    <collection name="QTLs" referenced-type="QTL" reverse-reference="GWAS"/>
  </class>

  <!-- GWASResult connects a marker to a phenotype -->
  <class name="GWASResult" is-interface="true">
    <attribute name="pValue" type="java.lang.Double"/>
    <reference name="phenotype" referenced-type="Phenotype"/>
    <reference name="study" referenced-type="GWAS" reverse-reference="results"/>
    <reference name="marker" referenced-type="GeneticMarker" reverse-reference="gwasResults"/>
    <collection name="associatedGenes" referenced-type="Gene"/>
  </class>

  <!-- Phenotype associates a phenotype with its measured values -->
  <class name="Phenotype" extends="Annotatable" is-interface="true" term="http://semanticscience.org/resource/SIO_010056">
    <attribute name="name" type="java.lang.String" term="http://edamontology.org/data_3275"/>
    <collection name="gwasResults" referenced-type="GWASResult" reverse-reference="phenotype"/>
    <collection name="phenotypeValues" referenced-type="PhenotypeValue" reverse-reference="phenotype"/>
  </class>

  <!-- PhenotypeValue can be all sorts of measurements, and refer to a Strain -->
  <class name="PhenotypeValue" is-interface="true">
    <reference name="phenotype" referenced-type="Phenotype" reverse-reference="phenotypeValues"/>
    <reference name="strain" referenced-type="Strain"/>
    <attribute name="textValue" type="java.lang.String"/>
    <attribute name="numericValue" type="java.lang.Double"/>
    <attribute name="booleanValue" type="java.lang.Boolean"/>
  </class>

  <!-- GeneticMarker.type will often be "SNP", but could be another type -->
  <class name="GeneticMarker" extends="SequenceFeature" is-interface="true">
    <attribute name="type" type="java.lang.String"/>
    <collection name="gwasResults" referenced-type="GWASResult" reverse-reference="marker"/>
  </class>

</classes>
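
One property of a model like this that is easy to check mechanically is whether each `reverse-reference` is declared on both sides. The following is a minimal sketch, not part of InterMine: it uses only Python's standard library, the function name `check_reverse_references` is made up for illustration, and the embedded fragment is an abbreviated copy of the model above. Classes that live outside the fragment (`QTL`, `Gene`, the core classes) are skipped rather than reported. Whether InterMine itself rejects a one-sided declaration is not something this sketch claims.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the GWAS sub-model above (attributes omitted).
MODEL = """<classes>
  <class name="GWAS" extends="Annotatable" is-interface="true">
    <collection name="results" referenced-type="GWASResult" reverse-reference="study"/>
    <collection name="QTLs" referenced-type="QTL" reverse-reference="GWAS"/>
  </class>
  <class name="GWASResult" is-interface="true">
    <reference name="phenotype" referenced-type="Phenotype"/>
    <reference name="study" referenced-type="GWAS" reverse-reference="results"/>
    <reference name="marker" referenced-type="GeneticMarker" reverse-reference="gwasResults"/>
    <collection name="associatedGenes" referenced-type="Gene"/>
  </class>
  <class name="Phenotype" extends="Annotatable" is-interface="true">
    <collection name="gwasResults" referenced-type="GWASResult" reverse-reference="phenotype"/>
  </class>
  <class name="GeneticMarker" extends="SequenceFeature" is-interface="true">
    <collection name="gwasResults" referenced-type="GWASResult" reverse-reference="marker"/>
  </class>
</classes>"""


def check_reverse_references(model_xml):
    """Return a list of reverse-reference declarations that are not mirrored."""
    root = ET.fromstring(model_xml)
    classes = {cls.get("name") for cls in root.iter("class")}

    # Index every reference/collection by (class name, field name).
    fields = {}
    for cls in root.iter("class"):
        for field in cls:
            if field.tag in ("reference", "collection"):
                fields[(cls.get("name"), field.get("name"))] = field

    problems = []
    for (cls_name, field_name), field in sorted(fields.items()):
        reverse = field.get("reverse-reference")
        target = field.get("referenced-type")
        if reverse is None or target not in classes:
            continue  # nothing declared, or the target class is defined elsewhere
        back = fields.get((target, reverse))
        if back is None:
            problems.append(f"{cls_name}.{field_name}: {target}.{reverse} is not defined")
        elif (back.get("reverse-reference") != field_name
              or back.get("referenced-type") != cls_name):
            problems.append(f"{cls_name}.{field_name}: {target}.{reverse} does not point back")
    return problems


problems = check_reverse_references(MODEL)
for problem in problems:
    print(problem)
```

On this fragment the check reports one item: `Phenotype.gwasResults` names `GWASResult.phenotype` as its reverse, while `GWASResult.phenotype` declares no `reverse-reference` of its own.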

ekcannon commented 4 years ago

Hi Sam,

I'd love to solve this problem for MaizeGDB too and like your simple approach. One thing that jumped out when I looked at your GWAS schema is that I'm not sure how to get the specific phenotype value for the phenotype reported by a single GWAS result...unless you are just reporting that this result consists of saying that any associated markers and gene models affect this phenotype, regardless of its value.

My XML is a little rusty so I may be misreading the schema: you don't include a definition of 'Gene'. I assume it's just a string, but doesn't it need to be defined as such?

Also, formal phenotype/trait ontologies should be included in the phenotype record, as painful as they are to work with, perhaps as an optional field.

Ethy


From: Sam Hokin notifications@github.com Sent: Tuesday, March 24, 2020 8:56 AM To: legumeinfo/legumemine legumemine@noreply.github.com Cc: Subscribed subscribed@noreply.github.com Subject: [legumeinfo/legumemine] Proposed GWAS data model (#32)


sammyjava commented 4 years ago

Oh, there's all sorts of stuff in the InterMine data model that isn't in the specific XML for the GWAS sub-model that I showed. InterMine is highly relational. Gene is a subclass of SequenceFeature, with all sorts of extra attributes, references and collections.

Phenotype is a standalone object. It may have a name like "seed weight." It has nothing specifically to do with any GWAS but may have an ontology term associated with it, thus it extends "Annotatable." (Which is another class not shown in the GWAS data model because it's a core class.) When a GWAS says something about a phenotype, I have to create the Phenotype object, or use one that has already been created (this is "merging", the hardest part about building a mine, to avoid duplicates).

So a full mine data model is quite huge and you're welcome to look at it. There's a spot on every mine where you can drill through it:

https://mines.legumeinfo.org/soymine/tree.do

It's hierarchical, so most stuff falls under Annotatable, and then BioEntity and then SequenceFeature for genomic features.

As for phenotype ontologies, I already load Plant Ontology terms with the Phenotypes, by curating them myself, and I've included Soybean and Crop ontologies as well. You can associate any number of ontology terms with an Annotatable object. If you click on "Phenotype" on the above page, you'll go to the Query Builder for Phenotype, which shows all of its attributes, references and collections; one of those collections is Ontology Annotations.

  [ ] Phenotype
     -   [ ] Primary Identifier
     -   [+] Gwas Results
     -   [+] Ontology Annotations
     -   [+] Phenotype Values
     -   [+] Publications
     -   [+] QTLs

adf-ncgr commented 4 years ago

I seem to recall we ran into awkwardness in the interface due to this quasi-union representation of different types in PhenotypeValue:

    <attribute name="textValue" type="java.lang.String"/>
    <attribute name="numericValue" type="java.lang.Double"/>
    <attribute name="booleanValue" type="java.lang.Boolean"/>

Did we decide using some sort of inheritance scheme was impractical for some reason? I have a vague memory that the sorting of the matrix display was somehow involved in your dislike of that approach.

adf-ncgr commented 4 years ago

also, can you clarify whether the associatedGenes is something that:
- will be populated automagically (e.g. these are the nearest genes within <=5 kb of the marker), OR
- is just a place for GWAS authors to assert something about genes they think are candidates (for whatever reason)?

sammyjava commented 4 years ago

Re. PhenotypeValue I think I tried some other options and decided having three different attributes worked best. I show them all on the report pages. We can revisit that, it's an under-the-hood thing, not really impacting the file format.

sammyjava commented 4 years ago

I prefer not to make assumptions about "association" in general, I'm just a miner, so associatedGenes are simply listed in the file that provides the GWASResult and we bump responsibility up to the GWASser. There are many files with associated genes listed, who knows how those were determined; that's why we include the publication so you can yell at them not me.

The only associations I generate in post-processors are genes-QTLs from legfed-populate-gene-spanning-qtls, from marker-gene overlap, and I suppose I could do that with all markers including those on GWAS, but I'd call them "overlappingGenes" not "associatedGenes", since "associated" implies (at least to me) some sort of functional relationship. I know in my maize work I get tons of genes under a BSA QTL and presumably only one of them is the functional one, associated with the phenotype; the others are just linkage. But it'd be a post-processor anyway, not something done during loading. And we'd still have the opportunity to associate genes from the publication.

adf-ncgr commented 4 years ago

it might be nice to have a post-processor for proximal genes, though in many cases it will presumably be redundant with the marker overlap. looking forward to seeing the exemplar files and moving on.

ekcannon commented 4 years ago

Good to see some of the supporting data schema and the ontologies. Now I'll expose my ignorance about GWAS data: is a GWAS result associated with a phenotype, or with a phenotype+value?

adf-ncgr commented 4 years ago

with a set of phenotype values and a set of genotypes across the same set of lines ("strains").

ekcannon commented 4 years ago

Ah: the thing measured is one phenotype with one or more phenotype values.

So a new phenotype record with its collection of reported values would be created for a single GWASResult? In my first read of the schema, I thought the same phenotype record would be associated with GWASResults across multiple studies and the phenotype value collection would contain all possible values for that phenotype, not just the ones measured/detected in a particular study. If this were the case, then it wouldn't be possible to report the specific values that are relevant to a particular GWASResult.

sammyjava commented 4 years ago

Probably easiest at this point to show a current file (not a proposed file) with the data I currently load with my loader. In GWAS, a marker is associated with a segregating Phenotype (susceptible, seed weight, leaf length); PhenotypeValue is not supplied in the file; those were used to measure the segregation. But phenotypes do have associated values (true, 500g, 6cm) so the Phenotype class has a PhenotypeValue collection that can be used in other places in a mine. Because data in mines is used in lots of places.

Note that this file will be broken up in the datastore proposal, separating GWAS-specific stuff from separate marker-genome stuff. Note that p-value is missing; that is the case with a bunch of Soybase data. So this doesn't show on a Manhattan plot. We could decide to disallow "GWAS" that lacks p-values, since that's sorta the whole point of GWAS.

TaxonID 3847
Strain  Williams82
Name    KGK20171002.1
PlatformName    SoySNP50K
PlatformDetails iSelect BeadChip
NumberLociTested        33957
NumberGermplasmTested   374
Assembly        Wm82.a1
DOI     10.1534/g3.115.021774
#phenotype      ontology_identifier     marker  p_value chromosome      start   end
Ureide content  SOY:0002231     ss715580059             Gm01    50879523        50879523
Ureide content  SOY:0002231     ss715580069             Gm01    50933494        50933494
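
For anyone consuming a file shaped like the one above, here is a minimal parsing sketch. It assumes the file is tab-delimited (the tabs are not visible in the rendering above), that the key/value lines precede a single `#`-prefixed column header, and that an empty `p_value` field means the value is missing. The function name `parse_gwas_file` is made up for illustration, and the sample is a shortened copy of the file above.

```python
import csv
import io

# Shortened copy of the file above, with the assumed tab delimiters made explicit.
SAMPLE = (
    "TaxonID\t3847\n"
    "Strain\tWilliams82\n"
    "Name\tKGK20171002.1\n"
    "DOI\t10.1534/g3.115.021774\n"
    "#phenotype\tontology_identifier\tmarker\tp_value\tchromosome\tstart\tend\n"
    "Ureide content\tSOY:0002231\tss715580059\t\tGm01\t50879523\t50879523\n"
    "Ureide content\tSOY:0002231\tss715580069\t\tGm01\t50933494\t50933494\n"
)


def parse_gwas_file(text):
    """Split the file into a metadata dict and a list of per-marker rows."""
    metadata, columns, rows = {}, None, []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        if columns is None:
            if fields[0].startswith("#"):
                # The '#' line names the columns of the rows that follow.
                columns = [fields[0].lstrip("#")] + fields[1:]
            else:
                metadata[fields[0]] = fields[1] if len(fields) > 1 else ""
            continue
        row = dict(zip(columns, fields))
        # An empty p_value is kept as None rather than guessed.
        row["p_value"] = float(row["p_value"]) if row["p_value"] else None
        row["start"], row["end"] = int(row["start"]), int(row["end"])
        rows.append(row)
    return metadata, rows


metadata, rows = parse_gwas_file(SAMPLE)
# Rows without a p-value cannot be placed on a Manhattan plot.
plottable = [row for row in rows if row["p_value"] is not None]
print(metadata["Name"], len(rows), len(plottable))
```

Keeping the missing p-values as `None` makes it easy to either skip such rows in a plot or reject the whole file, whichever policy is chosen.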

sammyjava commented 4 years ago

@ekcannon I've never seen the underlying phenotype measurements reported in GWAS results, at least not so far. Maybe in supplemental data. Just the phenotype and association significance (p-value). And yes, a Phenotype is a standalone thing and a PhenotypeValue is a value it had in some measurement in some experiment done somewhere, not necessarily a GWAS.

StevenCannon-USDA commented 4 years ago

Sam - reporting some feature requests/questions from Rex (@maxglycine - just invited to legumeinfo)

Looking at the KGK20170808.1 record (at least as it gets rendered on the mine page):

GWAS features: can per-feature IDs be displayed (e.g. "Seed weight 5-g1"), or at least reported in the roll-over or table report? This would be to enable linking/correspondence with SoyBase.

The GWAS dataset title (KGK20170808.1): preferable would be a human-readable alias, e.g. "Zhang et al. 2016a" (In your example file above, would this alias be a new field - or would it be in a different table?)

Display: In the Manhattan-type plot, the X-axis is a bit difficult to interpret. Is it possible to indicate chromosome boundaries?

sammyjava commented 4 years ago

This issue is NOT about front-end displayers and widgets. It's about the underlying data model and datastore file format for GWAS data. Happy to add an attribute to GWAS. Front-end code should be addressed elsewhere.

adf-ncgr commented 4 years ago

Let's convert the other items to new issues, then. I wouldn't be surprised if there already is one about the Manhattan display. We need to facilitate receiving feature requests- I'll try to take care of this set.

StevenCannon-USDA commented 4 years ago

Fleshing out the comments yesterday, with respect to the example data for KGK20171002.1, it looks basically OK to me, with two suggestions* ** (and given the caveat that "this file will be broken up in the datastore proposal, separating GWAS-specific stuff from separate marker-genome stuff.").

sammyjava commented 4 years ago

I think you're talking about QTLs, which are a different beast. If a GWAS leads to QTL identification, those data would be loaded as QTLs which have marker associations. My plan for loading GWAS is to simply have markers and association p-values; ranges of association with identifiers is "downstream analysis" represented by the QTL model. Soybase is not a general case of GWAS in this way, it's a combination of GWAS and QTL analysis. Which is fine, it just means an extra QTL file needs to be created with the usual relationships to markers.

Remember, InterMine is EXTREMELY normalized. Everything is loaded as separate little chunks, and then the merging handles tying things together. A marker in a GWAS and a marker in a QTL are the same marker.
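
As a rough illustration of what "the same marker" means here, the sketch below keeps one object per (class, primary identifier) pair, so a marker seen in two files ends up as a single record. This is not InterMine's merging code, which is driven by its own configuration; the `ObjectStore` class and the `sources` field are made up for illustration. The marker and study names are taken from the SoyMine examples in this thread.

```python
class ObjectStore:
    """Toy store that merges objects by (class name, primary identifier)."""

    def __init__(self):
        self._objects = {}

    def get_or_create(self, class_name, primary_identifier):
        key = (class_name, primary_identifier)
        if key not in self._objects:
            # First sighting: create the record.
            self._objects[key] = {
                "class": class_name,
                "primaryIdentifier": primary_identifier,
                "sources": [],  # hypothetical provenance list
            }
        # Later sightings return the record created earlier.
        return self._objects[key]

    def __len__(self):
        return len(self._objects)


store = ObjectStore()
# The same marker is reported by two different studies.
for source in ("KGK20170711.1", "LBC20180516.2"):
    marker = store.get_or_create("GeneticMarker", "ss715635425")
    marker["sources"].append(source)

print(len(store), marker["sources"])
```

The failure mode to avoid is the opposite one: if the key is not applied consistently, each file creates its own copy and duplicates pile up.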

sammyjava commented 4 years ago

FWIW here's the current (SoyMine) QTL model which is NOT part of this issue, but placed here so you can see how it holds markers (and "spanned genes" from those markers, derived in post-processing). Easy to add another "GWAS" reference or collection to show the GWAS that led to the QTL and vice versa on the GWAS page. This SoyMine-specific spin of QTL holds some custom attributes imported from SoyBase.

<class name="QTL" extends="Annotatable" is-interface="true">
        <attribute name="description" type="java.lang.String"/>
        <attribute name="identifier" type="java.lang.String"/>
        <attribute name="analysisMethod" type="java.lang.String"/>
        <attribute name="secondaryIdentifier" type="java.lang.String"/>
        <attribute name="publicationLinkageGroup" type="java.lang.String"/>
        <attribute name="studyTreatment" type="java.lang.String"/>
        <attribute name="peak" type="java.lang.Double"/>
        <reference name="organism" referenced-type="Organism"/>
        <reference name="phenotype" referenced-type="Phenotype" reverse-reference="QTLs"/>
        <reference name="favorableAlleleSource" referenced-type="Strain" reverse-reference="favorableAlleleQTLs"/>
        <collection name="markers" referenced-type="GeneticMarker" reverse-reference="QTLs"/>
        <collection name="spannedGenes" referenced-type="Gene" reverse-reference="spanningQTLs"/>
        <collection name="mappingPopulations" referenced-type="MappingPopulation" reverse-reference="QTLs"/>
        <collection name="genotypingStudies" referenced-type="GenotypingStudy" reverse-reference="QTLs"/>
        <collection name="linkageGroupRanges" referenced-type="LinkageGroupRange"/>
</class>

sammyjava commented 4 years ago

BTW, a nice way to keep track of objects in InterMine is that they have a Sequence Ontology or some other Ontology identifier. Since a QTL has its own Ontology identifier, it is loaded as a standalone Object. And my style is to load objects independently for maximal interoperability. It's really easy to break out a big file into five individual normalized files, and before you know it, another mine only provides three of those, so you're happy that you did it that way.

StevenCannon-USDA commented 4 years ago

"If a GWAS leads to QTL identification, those data would be loaded as QTLs which have marker associations." - OK, I hadn't considered that. I can see the logic. In that case, I think we would want to somehow note that a record is "QTL-flavor GWAS" - i.e. that the association regions derive from GWAS data. That's because, in linking back to SoyBase, I think the target records will be different. At least SoyBase treats them as distinct data types. @maxglycine

sammyjava commented 4 years ago

That "marking" is implemented with relationships in InterMine. If a QTL has a GWAS reference, it was derived from that GWAS. This sort of thing abounds in the mines, there is almost no use for text fields other than boring attributes. If anything is connected to anything else it is done with a relationship (if one-to-one) or collection (if one-to-many).

sammyjava commented 4 years ago

You can also (often) have the reverse relationships as well. So I can add "QTLs" to the GWAS model above as the reverse relationship. Just done.

StevenCannon-USDA commented 4 years ago

OK, cool. Perhaps "done" with my input regarding GWAS representation.

adf-ncgr commented 4 years ago

question regarding QTL-flavor GWAS: will these be like the "old-style" QTLs defined by a single marker (presumably the marker with the best p-val) and a region that is guesstimated around it? or will these be like interval-mapping QTLs defined by peak marker + markers flanking the region (e.g. markers within some p-val threshold of the peak). I suppose that many publications might just give the peak SNP in which case it would have to be the former. I can't recall how we were dealing with the old-style QTLs in soymine, but I seem to think it had the effect that we didn't get spanned genes.

StevenCannon-USDA commented 4 years ago

I am pretty sure the latter - i.e. "like interval-mapping QTLs defined by peak marker + markers flanking the region (e.g. markers within some p-val threshold of the peak)" @maxglycine

sammyjava commented 4 years ago

@adf-ncgr There is only one kind of QTL, and it has a markers collection. You can have zero, one or many markers associated with a QTL. As for spanned genes, I just look for overlap of a marker and a gene. If the marker falls inside the gene's (full) region, I create a "spanned gene" associated with the QTL. But that's from a post-processor. One thing I don't do is include the artificial 200cM (or whatever it is) region around a single-marker or designated-peak QTL.

sammyjava commented 4 years ago

We really should move the QTL discussion to a new issue. We get tons of QTLs not loaded from GWAS.

adf-ncgr commented 4 years ago

right this was a question for soybase folks as to whether these would be one-marker QTL. "old-style" QTL was my shorthand for the artificial cM region that they did for single-marker ANOVA type studies from soybean studies of yesteryear

sammyjava commented 4 years ago

Ah, gotcha. Yeah let's try to be really specific and descriptive in these threads so I can understand them as well. :)

adf-ncgr commented 4 years ago

OK. revising slightly your specific description of spanned genes for clarity's sake (for others on the thread), I believe that you look for overlap of the interval defined by the markers collection with gene features; if size(markers collection) == 1 then it is simple overlap of the marker with a gene (if any).

sammyjava commented 4 years ago

Yes, thank you. It's the genes that are "spanned" by the full range of the markers. And if it's just one marker, it's the gene that marker resides inside of (if any).
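
The spanned-gene rule agreed on above can be sketched as a plain interval overlap. This is an illustration only, not the code of the actual post-processor: it assumes all markers of a QTL sit on one chromosome and that coordinates are inclusive. The `Feature` type, the `spanned_genes` function and the gene names and coordinates are hypothetical; the two marker positions come from the example file earlier in the thread.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    name: str
    chromosome: str
    start: int
    end: int


def spanned_genes(markers, genes):
    """Names of genes overlapping the full range covered by the markers."""
    if not markers:
        return []
    chromosomes = {m.chromosome for m in markers}
    if len(chromosomes) != 1:
        raise ValueError("this sketch assumes all markers share one chromosome")
    chromosome = chromosomes.pop()
    range_start = min(m.start for m in markers)
    range_end = max(m.end for m in markers)
    # With a single marker the range collapses to that marker's position,
    # so only a gene containing the marker can overlap it.
    return [
        g.name
        for g in genes
        if g.chromosome == chromosome and g.start <= range_end and g.end >= range_start
    ]


m1 = Feature("ss715580059", "Gm01", 50879523, 50879523)
m2 = Feature("ss715580069", "Gm01", 50933494, 50933494)
genes = [  # hypothetical genes, for illustration only
    Feature("geneA", "Gm01", 50870000, 50880000),
    Feature("geneB", "Gm01", 50900000, 50910000),
    Feature("geneC", "Gm01", 50940000, 50950000),
    Feature("geneD", "Gm02", 50879000, 50880000),
]

print(spanned_genes([m1], genes))      # single marker: only the gene it sits in
print(spanned_genes([m1, m2], genes))  # two markers: every gene in their range
```

No padding is added around the markers, in line with the stated preference not to invent an artificial region around a single-marker QTL.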

adf-ncgr commented 4 years ago

regarding @cann0010 "for linking back to Soybase" comment, do we currently record these accessions/identifiers as having a provenance so we know where to link to? most of our current external linking is (I believe) being mediated by the gene linkout service. I imagine that the External Links section (currently empty) of the page: https://mines.legumeinfo.org/soymine/portal.do?class=GWAS&externalids=KGK20170714.1 for example, will need a bit of extra directive to tell it where to plug that externalid into a URL? but we should probably make this explicit in the files we load, methinks.

adf-ncgr commented 4 years ago

> I am pretty sure the latter - i.e. "like interval-mapping QTLs defined by peak marker + markers flanking the region (e.g. markers within some p-val threshold of the peak)" @maxglycine

that would be ideal whenever possible.

sammyjava commented 4 years ago

I regard link-outs to be a front-end coding issue. If I have an identifier that can be turned into a SoyBase URL, I can do that. I'm not sure we need to load those into the datastore files.

sammyjava commented 4 years ago

@cann0010 By the way, as a matter of principle, I try very hard to avoid analysis assumptions in the mines, which are meant to be pure "data warehouses" and not purveyors of original analysis. So when you mention markers within some p-value threshold of the peak, that is something that I would expect to be handled in the creation of the datastore files. All I want to know is which markers are associated with which QTLs, and I reference the paper where the assumptions were made. That's the style of mines, they're really just "libraries" in principle, with massive amounts of provenance info.

adf-ncgr commented 4 years ago

> I regard link-outs to be a front-end coding issue. If I have an identifier that can be turned into a SoyBase URL, I can do that. I'm not sure we need to load those into the datastore files.

I was imagining we could be in a context where some externalIdentifiers were relevant to one data source, others to other data sources. how would you code your front end logic to be omniscient as to their provenance? And remember that you may not be the only consumer of these files so it would be nice to make the linking logic pretty brain-dead.

adf-ncgr commented 4 years ago

one other thing apropos of identifiers/links; in a recent AgBioData mtg a representative of the AGR project mentioned their use of "curies". This seems to be a W3C standard for basically making a URL-prefix namespace: https://www.w3.org/TR/2010/NOTE-curie-20101216/ maybe worth considering as something to help us in this regard
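
To make the CURIE idea concrete: a CURIE is a `prefix:reference` pair, and expanding it means looking up the IRI bound to the prefix and appending the reference. The sketch below is a minimal illustration under stated assumptions. The prefix table and its `example.org` URLs are placeholders, not real SoyBase or ontology endpoints, and `expand_curie` is a made-up helper name. The identifier `SOY:0002231` is the one used in the example file earlier in the thread.

```python
# Hypothetical prefix-to-IRI bindings; real ones would be agreed on per data source.
PREFIXES = {
    "SOY": "https://example.org/soy-ontology/",
    "soybase.gwas": "https://example.org/soybase/gwas/",
}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand 'prefix:reference' by concatenating the bound IRI and the reference."""
    prefix, separator, reference = curie.partition(":")
    if not separator:
        raise ValueError(f"not a CURIE (no colon): {curie!r}")
    if prefix not in prefixes:
        raise ValueError(f"unknown prefix: {prefix!r}")
    return prefixes[prefix] + reference


print(expand_curie("SOY:0002231"))
```

With a table like this, the provenance of an identifier travels with the identifier itself, and a change of URL scheme at the source only requires updating one binding.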

sammyjava commented 4 years ago

@adf-ncgr That's interesting (curies), pop that over to the datastore issues for further ponderings.

sammyjava commented 4 years ago

@adf-ncgr And yes, good point about brain-dead identifiers. But that, of course, requires that the URL be persistent beyond various SoyBase changes, say. I'm always a little concerned about hardcoding URLs to resources I have no control over.

maxglycine commented 4 years ago

OK, I think I now know how to comment. This thread is very difficult for me to follow in that I don't know the mines schema. So I won't be able to comment about how one could do what I think should be done. I will just give what I would expect as a user.

Now some SoyBase background.

In SoyBase there are two slightly different notions of a "QTL", specifically a "bi-parental QTL" and a "GWAS QTL". They have some different attributes. Both are based on a measurable or scoreable phenotype (purple vs white, Chlorosis score of 1-5, plant height). As you might imagine there are few universal values such as purple vs white flowers. Score values are supposed to be based on a published criterion but the interpretation of that criterion seems to vary. Even plant height is subject to interpretation. That being said, the absolute value of any measurement is relative to the comparison being made. Two short plants that are used in a comparison only have to have a measurable contrast. Then there is the issue of "p-value" and cutoffs. There are no universal cutoffs, there are just popular ones.

SoyBase Philosophy

GWAS QTL

All GWAS QTL are defined by one or more markers, either a SNP or other sequence-based marker.

Most GWAS QTL will be a single marker. This is because the authors just gave us a list of "significant" markers.

In some cases, the authors assert that a list of markers constitutes a single QTL. In other cases, the curators had some reason to believe that a list of markers constituted a single QTL. In those cases the format of the QTL name changes.

For single-marker QTLs the format is "phenotype" Number1-gNumber2, where Number1 = the number of the GWAS paper and Number2 = the number of the QTL in that study, numbered 1 to n, where n is the total number of significant QTL reported in the paper. Thus the first GWAS QTL from the 8th GWAS paper entered would be First flower 8-g1. We added the "g" so that the user could easily see it was a GWAS QTL.

Where either a curator or the authors determined that more than one of those significant markers really constituted the SAME QTL, we added a "dot" and a Number3. So if the 3rd QTL from the 8th GWAS paper was composed of 3 markers, the names given to those markers would be First flower 8-g3.1, First flower 8-g3.2 and First flower 8-g3.3.

Internal Identifiers

SoyBase never exposes the internal identifier to the public. It is uninformative to them and may confuse them. Instead we convert the internal identifier into some informative string like a gene name or QTL name. In the case of references (papers) we convert the internal identifier into a string that represents the "title" of the paper and its "source" (ie Smith et al. 2020 NAR 15(3):254-257). We harvest the paper data including the abstract because sometimes it is useful to see it before you go off and try to download the paper.

SoyMine

I note that there are no unique names for a GWAS QTL. They all say Seed weight or whatever the phenotype measured was. How would you refer to the 3rd Seed weight QTL on chromosome 3?

I note that when I click on any "GWAS Results Marker" section item of the query interface, I could not find positional information for any of the markers listed. Is that a bug or feature?

Longevity

To a first approximation, I would assume that SoyBase has the same longevity as LIS/LegFed so I would not feel too queasy linking to it. Since we moved to MySQL, the URLs have not changed format and I don't anticipate that they will. New "categories" may be added but existing "categories" will not change.

I think that covers most of the issues - maybe?
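
The GWAS QTL naming scheme described in this comment is regular enough to take apart with a pattern. The sketch below is an illustration based only on the examples given here ("First flower 8-g1", "First flower 8-g3.2") and is not SoyBase code. The function name `parse_gwas_qtl_name` and the dictionary keys are made up, and names that do not carry the "g" are simply treated as not being GWAS QTL names.

```python
import re

# "<phenotype> <paper number>-g<QTL number>[.<marker number>]"
GWAS_QTL_NAME = re.compile(
    r"^(?P<phenotype>.+) (?P<paper>\d+)-g(?P<qtl>\d+)(?:\.(?P<marker>\d+))?$"
)


def parse_gwas_qtl_name(name):
    """Split a GWAS QTL name into its parts, or return None if it does not match."""
    match = GWAS_QTL_NAME.match(name)
    if match is None:
        return None
    marker = match.group("marker")
    return {
        "phenotype": match.group("phenotype"),
        "paper": int(match.group("paper")),
        "qtl": int(match.group("qtl")),
        # Present only when several markers were grouped into one QTL.
        "marker": int(marker) if marker is not None else None,
    }


print(parse_gwas_qtl_name("First flower 8-g1"))
print(parse_gwas_qtl_name("First flower 8-g3.2"))
```

Grouping parsed names by (phenotype, paper, qtl) would recover which markers were asserted to form a single QTL.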

adf-ncgr commented 4 years ago

> @adf-ncgr That's interesting (curies), pop that over to the datastore issues for further ponderings.

not sure it is sufficiently well-formed to merit popping anywhere else, but feel free to do so if you think it will help

> @adf-ncgr And yes, good point about brain-dead identifiers. But that, of course, requires that the URL be persistent beyond various SoyBase changes, say. I'm always a little concerned about hardcoding URLs to resources I have no control over.

that's a point; perhaps having a mediation layer like the linkout service is appropriate; though I see @maxglycine has just indicated that Soybase URLs are going to be Cool URIs even though Soybean is a Warm Season Legume :)

sammyjava commented 4 years ago

> SoyMine: I note that there are no unique names for a GWAS QTL. They all say Seed weight or whatever the phenotype measured was. How would you refer to the 3rd Seed weight QTL on chromosome 3?

I think you're confusing a mine's QTL object with Phenotype object, very different things. A Phenotype is a thing that is defined in an Ontology. It has values like 6cm or true or 20g. A QTL is a particular locus at which that phenotype segregates in an experiment.

In the case of the current SoyMine, here's an example list of GWASResults:

Gwas          | Phenotype    | Marker         | p-value 
KGK20171027.1 | Plant height | Chr19_44323024 | 1.48e-27
KGK20171027.1 | Plant height | Chr19_45205190 | 1.48e-27
KGK20171027.1 | Plant height | Chr19_46340503 | 1.48e-27
KGK20170711.1 | Plant height | ss715635425    | 3e-15
LBC20180516.2 | Plant height | ss715635425    | 3e-15

Those significant associations with markers could be entered as "QTLs" but in any case, they have clear genomic, and genetic, position via the markers. "Plant height" is a Phenotype instance. It corresponds to SOY Ontology term SOY:0001365. (And, an example of failed merging: if you click on that phenotype you get 95 identical Plant height->SOY:0001365 records when there should only be one.)

Also, note that I'm using SoyBase as a data source for SoyMine, but I am not preserving all the data structures or even some labels, as I need the mines to use the same data model. For a GWAS, I'm looking for significantly associated markers, because that's what GWAS present, typically in a Manhattan plot. QTLs are normally found from mapping experiments, often bi-parental.

> I note that when I click on any "GWAS Results Marker" section item of the query interface, I could not find positional information for any of the markers listed. Is that a bug or feature?

Neither. Markers have both genomic and genetic locations in the mine. The genomic locations are found under "Chromosome" and "Chromosome location", as with all sequence features, and the genetic locations are found under "Linkage group positions". Here's an example:

DB identifier | Chromosome | Start    | End      | Linkage Group     | Position (cM)
AF186183      | Gm07       | 17567660 | 17567861 | GmComposite2003_M | 77.23