oxidecomputer / omicron

Omicron: Oxide control plane
Mozilla Public License 2.0
251 stars 39 forks source link

NameOrID resolution in networking db queries allow invalid UUIDs #6349

Open internet-diglett opened 2 months ago

internet-diglett commented 2 months ago

Currently we look up the parent record if the caller provides a NameOrId::Name, but does not look up the parent record if the caller provides NameOrId::Id. Since we don't use "real" Foreign Keys in our tables, the queries / DML will continue along happily if the user provides a Uuid that does not exist in the parent table, leading to child records that point to a parent that doesn't exist.

An example of such resolution logic can be found here: https://github.com/oxidecomputer/omicron/blob/8b3e948dd8afab54d5fe0c821beb4af5a83b7701/nexus/db-queries/src/db/datastore/bgp.rs#L46-L57

So we need to audit for similar structures and update them to perform an actual lookup and verify that the uuid points to a real record.

david-crespo commented 2 months ago

We try to avoid having the datastore layer be aware of API params structs or NameOrId at all (with varying degrees of success). So ideally the lookup would happen one level up:

https://github.com/oxidecomputer/omicron/blob/b50fe3a792e0a1ef5f215acf7a6a644fbda8aff2/nexus/src/app/bgp.rs#L18-L26

internet-diglett commented 2 months ago

We try to avoid having the datastore layer be aware of API params structs or NameOrId at all (with varying degrees of success). So ideally the lookup would happen one level up:

https://github.com/oxidecomputer/omicron/blob/b50fe3a792e0a1ef5f215acf7a6a644fbda8aff2/nexus/src/app/bgp.rs#L18-L26

:thinking: How would we avoid TOCTOU races (like ensuring the parent record doesn't get deleted as we're creating a child record)? This could put is in a weird situation where a race could cause us to have child records that we cannot delete (since we might need to lookup the parent record by name before deleting the child record). Since we don't have real foreign keys it seems like the most straightforward way to prevent this (for now) is to lookup the parent as a part of the transaction / CTE?

zephraph commented 2 months ago

I don't have the full context, but the general intent was essentially that we want to resolve away names as early in the API call as possible. Names are generally mutable and heavily context dependent where the IDs shouldn't be. If you're doing a create operation you'll likely have a CTE of some sort because you'll have a name or id of the parent that you will want to resolve at the same time as the create. I think it most all other cases hoisting the resolution is more desirable.

david-crespo commented 2 months ago

In general I don't think we are currently avoiding TOCTOU problems very well — there are a lot of spots where we're theoretically vulnerable to that (though the fact that it AFAIK hasn't come up says something). But I think you could resolve name to ID a level up and then still do a check inside the transaction that an announce set with that ID exists. It is an extra query, but that doesn't seem too bad.

david-crespo commented 1 month ago

@internet-diglett looks like this was fixed here — did anyone test it on dogfood or colo?

internet-diglett commented 1 month ago

@david-crespo I think this is still open because we have some occurrences in other networking logic that need to be cleaned up.