I spent some time on this today, here are some findings. tl;dr Cosmos limitations around LIMIT/OFFSET make subquery pushdown only barely useful, as LIMIT/OFFSET are the main reason. I'll shelf this for now - see WIP in https://github.com/roji/efcore/tree/CosmosPushdown.
Cosmos doesn't support LIMIT/OFFSET in subqueries.
As a result, we have special translation logic for LINQ Skip/Offset in subqueries; rather than translating to LIMIT/OFFSET, we convert the subquery in question to an array via the Cosmos ARRAY() operator, and compose ARRAY_SLICE() over that. This allows us to translate e.g. context.Blogs.Where(b => b.Ints.Skip(2).Take(3)...). Note that these query shapes don't involve a pushdown - the translation of the Where predicate is aware that it's in subquery context, and takes that into account.
An actual pushdown needs to happen when LINQ operators are composed in an order which would be incompatible with the SELECT clause evaluation order. For example, at the top-level, if Take().Skip() are composed, Skip() would need to trigger a a pushdown to make sure Take() is evaluated first.
Unfortunately, the ARRAY_SLICE() technique can't be used when pushing down a top-level query, since Cosmos doesn't support referencing the root container from inside an ARRAY() subquery. So although SELECT * FROM (SELECT * FROM root c) works fine (note root referenced inside the subquery, the following does not: SELECT * FROM (SELECT VALUE ARRAY(SELECT * FROM root c)). This makes it impossible for us to perform pushdown on a (top-level) query that has LIMIT/OFFSET.
Apart from LIMIT/OFFSET which can't be supported, the remaining causes for pushdown are:
Composing aggregate functions (e.g. Count) over DISTINCT
Composing ORDER BY over DISTINCT (but this shouldn't actually be needed, see #34121).
Note that since for subqueries we already avoid LIMIT/OFFSET, translating ARRAY_SLICE() as above; so once again there's little need to push down within subquery context as well (only the above two cases).
As a result, I'm shelving this effort for the meantime and will consult with the Cosmos people. If LIMIT/OFFSET in subqueries becomes supported, or some other workaround is found, we can bring this back and implement proper pushdown.
I spent some time on this today, here are some findings. tl;dr Cosmos limitations around LIMIT/OFFSET make subquery pushdown only barely useful, as LIMIT/OFFSET are the main reason. I'll shelf this for now - see WIP in https://github.com/roji/efcore/tree/CosmosPushdown.
context.Blogs.Where(b => b.Ints.Skip(2).Take(3)...)
. Note that these query shapes don't involve a pushdown - the translation of the Where predicate is aware that it's in subquery context, and takes that into account.SELECT * FROM (SELECT * FROM root c)
works fine (note root referenced inside the subquery, the following does not:SELECT * FROM (SELECT VALUE ARRAY(SELECT * FROM root c))
. This makes it impossible for us to perform pushdown on a (top-level) query that has LIMIT/OFFSET.