quicwg / base-drafts

Internet-Drafts that make up the base QUIC specification
https://quicwg.org
1.63k stars 204 forks source link

Largest Reference algorithm can produce invalid values #2112

Closed martinthomson closed 5 years ago

martinthomson commented 5 years ago

Feed the algorithm a LargestReference of 1, TotalNumberOfInserts of 2, and MaxEntries of 10.

   if LargestReference > 0:
      LargestReference -= 1  # == 0
      CurrentWrapped = TotalNumberOfInserts mod (2 * MaxEntries) # == 2

      if CurrentWrapped >= LargestReference + MaxEntries:  # == False
         # Largest Reference wrapped around 1 extra time
         LargestReference += 2 * MaxEntries
      else if CurrentWrapped + MaxEntries < LargestReference:   # == False
         # Decoder wrapped around 1 extra time
         CurrentWrapped += 2 * MaxEntries

      LargestReference += TotalNumberOfInserts - CurrentWrapped # == no change

LargestReference is therefore 0, and invalid.

Do the same for LargestReference of 18 and you get something even better: -2.

I think that the underflow check in #1904 fixes this.

afrind commented 5 years ago

To get LR=1 on the wire, the original LR had to be 0 (which should have been encoded as 0), or it must be 0 mod 2*MaxEntries. If TotalNumberOfInserts is 2, then the real LR can only be as large as 12, not 20. Similarly, LR=18 is outside the valid range when TotalNumberOfInserts is 2.

I think I noticed this once before, but can't remember in which forum I made the comment - perhaps it was on slack.

What does #1904 come up with for these invalid inputs?

martinthomson commented 5 years ago

1904 turns 0 into 2*MaxEntries (the only valid value that encoding could produce). Basically it doesn't allow underflow at all.

afrind commented 5 years ago

I don't love this algorithm so I can just take the other one. But I think this shows a bigger problem, which is that not all wire values are valid inputs and the algorithm should provide a mechanism for determining when they are invalid.

martinthomson commented 5 years ago

Or we could just allow all values to be valid, just with a different interpretation.

afrind commented 5 years ago

What interpretation do you suggest for LR=18, TNOI=2, ME=10?

dtikhonov commented 5 years ago

Allowing values outside of the [0, MaxEntries * 2] range would break the assumption that the LargestReference can be used as an index into a ring buffer, which was part of the deal.

afrind commented 5 years ago

Sorry, I meant logical 20, pre-encoding. I'll update the question to be more clear.

martinthomson commented 5 years ago

A value of 18 on the wire would turn into a value of 17 until the number of entries received indicated that this rolls over. See the algorithm in #1904.

afrind commented 5 years ago

An encoder using index 18 when the decoder is at 2 and ME is 10 is doing something illegal per the current spec. The maximum index is 12. The decoder needs to make this a connection error - or you are suggesting we should relax the rule about allowing evicting unacknowledged headers?

martinthomson commented 5 years ago

Not at all, I'm just answering the question about the algorithm. What we do with the output of the algorithm doesn't need to change. Using 18 is invalid.

bencebeky commented 5 years ago

I was able to convince myself that the current algorithm cannot produce invalid values. Here's my reasoning:


# Because of the way Largest Reference is encoded.
assert(0 < LargestReference)
assert(LargestReference <= 2 * MaxEntries)
# A decoder should signal an error if the input is out of this range.

if LargestReference > 0:
  LargestReference -= 1

  assert(0 <= LargestReference)
  assert(LargestReference < 2 * MaxEntries)

  CurrentWrapped = TotalNumberOfInserts mod (2 * MaxEntries)

  assert(0 <= CurrentWrapped)
  assert(CurrentWrapped < 2 * MaxEntries)

  if CurrentWrapped >= LargestReference + MaxEntries:
    # Largest Reference wrapped around 1 extra time
    assert(CurrentWrapped >= LargestReference + MaxEntries)
    LargestReference += 2 * MaxEntries
    assert(CurrentWrapped >= LargestReference - MaxEntries)

    # Because CurrentWrapped < 2 * MaxEntries
    # and LargestReferencee >= 2 * MaxEntries.
    assert(CurrentWrapped - MaxEntries < LargestReference)

  else if CurrentWrapped + MaxEntries < LargestReference
    # Decoder wrapped around 1 extra time
    assert(CurrentWrapped + MaxEntries < LargestReference)
    CurrentWrapped += 2 * MaxEntries
    assert(CurrentWrapped - MaxEntries < LargestReference)

    # Because LargestReference < 2 * MaxEntries
    # and CurrentWrapped >= 2 * MaxEntries.
    assert(LargestReference <= CurrentWrapped + MaxEntries)

  # If either of these had been false, one of the branches above would have been
  # executed, making both of these true.
  assert(LargestReference <= CurrentWrapped + MaxEntries)
  assert(CurrentWrapped - MaxEntries < LargestReference)

  assert(TotalNumberOfInserts mod (2 * MaxEntries) == CurrentWrapped mod (2 * MaxEntries))
  # therefore the following line does not change LargestReference mod (2 * MaxEntries).

  LargestReference += TotalNumberOfInserts - CurrentWrapped

  # Substituting into the above inequalities, we get
  assert(TotalNumberOfInserts - MaxEntries < LargestReference)
  assert(LargestReference - MaxEntries <= TotalNumberOfInserts)

  # This is exactly what we want:
  # Entry TotalNumberOfInserts - MaxEntries is evicted,
  # entry LargestReference is referenced,
  # entry LargestReference - MaxEntries is presumably evicted
  # (unless the encoder references LargestReference it but does not emit it
  # until later, which is crazy),
  # entry TotalNumberOfInserts is acknowledged.
  # The spec requires that an evicted entry cannot be referenced and
  # every evicted entry must be acknowledged.

Therefore I belive that this issue can be closed.

However, I am worried that TotalNumberOfInserts - CurrentWrapped might underflow if one is doing unsigned arithmetics.

bencebeky commented 5 years ago

To address Martin's original concern about 0 or negative outputs, my point is that the decoding algorithm outputs a LargestReference that is the same modulo (2 MaxEntries) as the original, and is in the same 2 MaxEntries wide range as the original (as long as the original was valid), therefore must be the same as the original. That is, in agreement with afrind@ above, I believe that the decoding algorithm will produce valid (positive) decoded LargestReference values as long as it is fed an input wire-LargestReference that was obtained by encoding a valid LargestReference value given TotalNumberOfInserts and MaxEntries.

bencebeky commented 5 years ago

An encoder using index 18 when the decoder is at 2 and ME is 10 is doing something illegal per the current spec. The maximum index is 12. The decoder needs to make this a connection error - or you are suggesting we should relax the rule about allowing evicting unacknowledged headers?

I like the idea of making it a connection error. https://github.com/quicwg/base-drafts/pull/2249

afrind commented 5 years ago

Discussed in Tokyo. The consensus in the room is that the proposed algorithm is acceptable.