matrix-org / matrix-spec

The Matrix protocol specification
Apache License 2.0

Power level values can be ints or strings #853

Closed by erikjohnston 2 years ago

erikjohnston commented 3 years ago

A long time ago there was a bug in Synapse where the values for the power level were strings rather than ints. This was fixed, but in order to not break rooms Synapse continued to accept power level values that were either ints or strings.

This behaviour should be documented.

This behaviour is also implemented in gomatrixserverlib/Dendrite and Ruma/Conduit.

Ideally we would only need to document this for v1 rooms, but I haven't checked whether the bug was fixed before or after v2 rooms were introduced (I assume before).

erikjohnston commented 3 years ago

Further investigation reveals that Synapse still doesn't validate incoming power level events properly :facepalm:

richvdh commented 3 years ago

related: matrix-org/synapse#10232?

dkasak commented 3 years ago

The linked PR adds validation to ensure power levels are integers. Since there was no validation before, will we need a room version bump to ensure we don't break older rooms which might not be upholding this?

richvdh commented 3 years ago

> Since there was no validation before, will we need a room version bump to ensure we don't break older rooms which might not be upholding this?

In theory, I suppose so. But at this point it's hard to see us encountering stringy PL events unless somebody is maliciously trying to split-brain the room. Since sending PL events of any flavour requires you to have admin perms on the room, I'm inclined to say that if any admins want to split-brain their own room, they can go right ahead. It might be better just to get the benefit of validation out sooner.

erikjohnston commented 3 years ago

> > Since there was no validation before, will we need a room version bump to ensure we don't break older rooms which might not be upholding this?
>
> In theory, I suppose so. But at this point it's hard to see us encountering stringy PL events unless somebody is maliciously trying to split-brain the room. Since sending PL events of any flavour requires you to have admin perms on the room, I'm inclined to say that if any admins want to split-brain their own room, they can go right ahead. It might be better just to get the benefit of validation out sooner.

I'm nervous about changing the allowed format of events in existing rooms TBH, especially for something like power levels, which can have extensive knock-on effects. Unless we can be confident that no rooms exist in the wild with string-valued power levels in their auth chain, I'm very dubious about not cutting a room version.

We should absolutely be validating this on the client side though.
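As a rough illustration of the kind of check being discussed, the sketch below walks the content of an `m.room.power_levels` event and reports every value that is not a plain integer. The function name `find_non_integer_levels` is hypothetical, and the key lists are an assumption based on the fields of the power levels event; verify them against the spec before relying on this.

```python
# Assumed top-level keys whose values are single power levels.
SCALAR_KEYS = ("ban", "invite", "kick", "redact",
               "events_default", "state_default", "users_default")
# Assumed keys whose values map a name (event type, user ID, ...) to a level.
MAP_KEYS = ("events", "users", "notifications")


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def find_non_integer_levels(content):
    """Return the paths of all power level values that are not integers."""
    bad = []
    for key in SCALAR_KEYS:
        if key in content and not _is_int(content[key]):
            bad.append(key)
    for key in MAP_KEYS:
        mapping = content.get(key, {})
        if not isinstance(mapping, dict):
            bad.append(key)
            continue
        for name, level in mapping.items():
            if not _is_int(level):
                bad.append(f"{key}.{name}")
    return bad


content = {
    "ban": 50,
    "users_default": "0",  # stringy value, as produced by the old bug
    "users": {"@a:example.org": 100, "@b:example.org": "100"},
}
print(find_non_integer_levels(content))
```

A client could use a check like this to warn about, or refuse to send, power level events containing string values; whether a server may reject them in existing room versions is the open question in this thread.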

neilalexander commented 2 years ago

Can we please please please spec exactly what the allowed string formats are for legacy support? Synapse allows things like " +2 " but Dendrite doesn't and it would be useful to have something in the spec that states exactly what int() behaviour we should be trying to emulate and which we shouldn't bother with, since there's a whole load of potential footguns there.

ShadowJonathan commented 2 years ago

From the Python docs:

> Optionally, the literal can be preceded by + or - (with no space in between) and surrounded by whitespace.

> Base 0 means to interpret exactly as a code literal, so that the actual base is 2, 8, 10, or 16, and so that int('010', 0) is not legal, while int('010') is, as well as int('010', 8).

(i.e. setting the power level to 0100 is also legal, but it is not interpreted as octal)

This seems to be the extent of it.
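For reference, here is a small check of how Python's built-in `int()` treats a few of the inputs discussed above. The helper `parses` is a hypothetical name used only for this illustration. One extra case is included that the quoted passage does not mention: since Python 3.6, `int()` also accepts single underscores between digits.

```python
def parses(text, base=10):
    """Return True if int(text, base) succeeds, False if it raises ValueError."""
    try:
        int(text, base)
        return True
    except ValueError:
        return False


# Accepted with the default base 10.
assert int(" +2 ") == 2      # surrounding whitespace and a leading + are fine
assert int("-5") == -5
assert int("010") == 10      # leading zeroes allowed, not treated as octal
assert int("0100") == 100
assert int("1_000") == 1000  # underscores between digits (Python 3.6+)

# Rejected with the default base 10.
assert not parses("+ 2")     # no space allowed between sign and digits
assert not parses("0x10")    # prefixes are only understood with base 0 or 16
assert not parses("1.0")
assert not parses("")

# Base 0 means "interpret as a code literal", so a leading zero is illegal.
assert not parses("010", 0)
assert int("010", 8) == 8
```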

So an algorithm for normalising pre-MSC3667 power levels seems to be:

  1. trim all whitespace
  2. remove leading +
  3. if leading character is -, remove and remember
  4. remove leading zeroes (except the last one, if remaining string is all-zeroes)
  5. interpret string as an integer
  6. if - was encountered, negate the integer
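
The steps above could be sketched in Python roughly as follows. This is an illustration, not what any homeserver actually does: `normalise_power_level` is a hypothetical name, "trim all whitespace" is read as trimming surrounding whitespace only, and the sketch deliberately accepts a single sign and ASCII digits, so inputs such as `"+-5"` or `"1_000"` are rejected.

```python
import re

_ASCII_DIGITS = re.compile(r"[0-9]+")


def normalise_power_level(value):
    """Normalise a legacy power level (int or string) to an int.

    Raises ValueError for anything outside the steps in the list above.
    """
    if isinstance(value, bool):
        raise ValueError("booleans are not power levels")
    if isinstance(value, int):
        return value
    if not isinstance(value, str):
        raise ValueError(f"unsupported type: {type(value).__name__}")

    text = value.strip()                # step 1: trim surrounding whitespace
    negative = False
    if text.startswith("+"):            # step 2: remove a leading +
        text = text[1:]
    elif text.startswith("-"):          # step 3: remove and remember a leading -
        negative = True
        text = text[1:]

    if not _ASCII_DIGITS.fullmatch(text):
        raise ValueError(f"not a decimal integer: {value!r}")

    text = text.lstrip("0") or "0"      # step 4: drop leading zeroes, keep one
    number = int(text)                  # step 5: interpret as an integer
    return -number if negative else number  # step 6: negate if - was seen


for raw in (" +2 ", "-5", "0100", "000", 50):
    print(repr(raw), "->", normalise_power_level(raw))
```

Using `elif` for the sign means at most one sign character is consumed, which matches `int()` rejecting `"+-5"`; applying steps 2 and 3 literally in sequence would accept it.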

turt2live commented 2 years ago

we'd also have to deal with - and possibly other representations. It'll be easier for the spec to link to the python docs if they can be pinned reliably.

also, just to mention it because it's in the older thread: synapse still parses and handles strings as ints in modern room versions. It has not been formally fixed yet.

ShadowJonathan commented 2 years ago

> we'd also have to deal with -

Apologies, somewhere in the back of my mind I thought that power levels were unsigned and that - would be an error; that's not the case.

> and possibly other representations.

These are all the possible representations (according to the Python docs); I'm not aware of any more.

richvdh commented 2 years ago

Trying to understand what the current state of play is here. I think:

turt2live commented 2 years ago

for clarity, and because I never wrote it down, the plan is to address the last point at the same time as adding MSC3667 to the spec. It's mostly a matter of effort, and given we'll be in the area when writing the MSC up it makes sense to go back and fix the other room versions.

If for some reason that ends up taking forever then we can obviously go through and write the words into the spec appropriately. I would encourage us to make an effort to get MSC3667 into an assigned room version, though.

turt2live commented 2 years ago

Fixed by https://github.com/matrix-org/matrix-spec-proposals/pull/3667