ga4gh-beacon / specification-v2

GA4GH Beacon v2 specification.
Apache License 2.0

Is this clarification still necessary? start[0]=start[1] XOR end[0]=end[1] #3

sdelatorrep closed 3 years ago

sdelatorrep commented 4 years ago

I wonder if this clarification is still necessary, now that in v2 start and end are arrays, and we don't have startMin, startMax, endMin and endMax any more:

single or double sided precise matches can be achieved by setting 
`start[0]` = `start[1]` XOR `end[0]` =  `end[1]`

It is found here:

    GenomicVariantFields:
      description: |
        All the required fields to query any kind of variant (e.g. SNP, DUP, 
        etc.).
      type: object
      required:
        - assemblyId
        - referenceName
        - start
      properties:
        assemblyId:
          $ref: '#/components/schemas/Assembly'
        referenceName:
          $ref: '#/components/schemas/Chromosome'
        start:
          description: |
            Precise or fuzzy start coordinate position(s), allele locus 
            (0-based, inclusive).
            * start only:
              - for single positions, e.g. the start of a specified sequence 
              alteration where the size is given through the specified 
              `alternateBases`
              - typical use are queries for SNV and small InDels
              - THIS IS NOT TRUE FOR RANGE QUERIES!!!! -> the use of "start" 
              without an "end" parameter requires the use of "referenceBases"
            * `start` and `end`:
              - special use case for exactly determined structural changes
            * use 2 values for querying imprecise positions (e.g. identifying 
            all structural variants starting anywhere between `start[0]` <-> 
            `start[1]`, and ending anywhere between `end[0]` <-> `end[1]`)
            * IS THIS NECESSARY???? -> single or double sided precise matches 
            can be achieved by setting `start[0]` = `start[1]` XOR `end[0]` = 
            `end[1]`
          type: array
          items:
            type: integer
            format: int64
          minimum: 0
        end:
          description: |
            Precise or fuzzy end coordinate(s) (0-based, exclusive). See start. 
            For fuzzy matches, provide 2 values in the array (e.g. [111,222]).
          type: array
          items:
            type: integer
            format: int64
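To make the question concrete, here is a minimal sketch of how a server might interpret the `start` and `end` arrays described above. The function names (`to_range`, `matches`) are hypothetical, the bounds of a fuzzy range are assumed to be inclusive, and `referenceBases`/`alternateBases` are ignored entirely, so this is an illustration rather than the behaviour the spec mandates.

```python
from typing import List, Optional, Tuple


def to_range(values: List[int]) -> Tuple[int, int]:
    """Turn a 1- or 2-element query array into a (low, high) pair.

    A single value is treated as a precise position, i.e. low == high.
    """
    if len(values) == 1:
        return values[0], values[0]
    if len(values) == 2:
        return values[0], values[1]
    raise ValueError("expected 1 or 2 values")


def matches(variant_start: int, variant_end: int,
            start: List[int], end: Optional[List[int]] = None) -> bool:
    """Check a variant against the query arrays (assumed inclusive bounds)."""
    s_lo, s_hi = to_range(start)
    if not (s_lo <= variant_start <= s_hi):
        return False
    if end is None:
        # Assumption: without `end`, only the start position is constrained.
        return True
    e_lo, e_hi = to_range(end)
    return e_lo <= variant_end <= e_hi


# Fuzzy on both sides.
print(matches(150, 400, start=[100, 200], end=[300, 500]))
# Precise start (start[0] == start[1]), fuzzy end.
print(matches(150, 400, start=[150, 150], end=[300, 500]))
```

Under this reading, setting `start[0]` equal to `start[1]` needs no special handling: it is simply a range of width zero.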

jrambla commented 4 years ago

IMU the example applies to the case where `start` and `end` are arrays. But I think the wording is incorrect: XOR means "A or B, but not both", and I believe we can have both A and B in our case.
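
The difference between XOR and a plain "and/or" can be shown with a small sketch. The helper names (`is_precise`, `classify`) are hypothetical and only illustrate the logic being discussed; they are not part of the spec.

```python
from typing import List


def is_precise(bounds: List[int]) -> bool:
    """A two-value array is precise when both values are equal."""
    return len(bounds) == 2 and bounds[0] == bounds[1]


def classify(start: List[int], end: List[int]) -> str:
    """Label a query by how many of its sides are precise."""
    s, e = is_precise(start), is_precise(end)
    if s and e:
        return "double-sided"
    if s != e:  # exactly one side precise: this is what XOR expresses
        return "single-sided"
    return "fuzzy"


def xor_condition(start: List[int], end: List[int]) -> bool:
    """The condition as literally worded with XOR."""
    return is_precise(start) != is_precise(end)


# Both sides precise: a valid double-sided match, yet XOR is False.
print(classify([5, 5], [9, 9]), xor_condition([5, 5], [9, 9]))
```

Read literally, the XOR wording covers only the single-sided case, which is why it does not describe double-sided precise matches.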

jrambla commented 4 years ago

Anyone commenting on this?

mbaudis commented 4 years ago

Actually, `start` and `end` should be simpleIntervals in GA4GH VRS terms (0-based, half-open). So the first base would be [0,1].
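
As a quick illustration of the 0-based, half-open convention, the sketch below converts a 1-based base position into such an interval. The function names are hypothetical and not taken from VRS or Beacon.

```python
from typing import Tuple


def base_interval(one_based_position: int) -> Tuple[int, int]:
    """Return the 0-based, half-open interval covering a single base."""
    if one_based_position < 1:
        raise ValueError("1-based positions start at 1")
    start = one_based_position - 1  # 0-based, inclusive
    return start, start + 1         # end is exclusive


def interval_length(start: int, end: int) -> int:
    """In half-open coordinates the length is simply end - start."""
    return end - start


print(base_interval(1))  # the first base
```

With this convention an interval covering exactly one base always has `end - start == 1`.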

mbaudis commented 4 years ago

... so a precise single base location is "==start[0]".