AutoMapper / AutoMapper.Extensions.OData

Creates LINQ expressions from ODataQueryOptions and executes the query.
MIT License
140 stars 38 forks source link

Feature request: Add support for CosmosDB #166

Open wbuck opened 1 year ago

wbuck commented 1 year ago

It would be nice to have support for working directly with the Cosmos DB LINQ provider as opposed to working with the EF provider as EF does not generate valid SQL in many situations.

wbuck commented 1 year ago

The one thing I'm unsure about is how to handle the $expand option here. What makes sense from a users perspective.

  1. Ignore the expand option
  2. Generate a SelectMany call when $expand is specified (I don't think the user would expect this)
  3. Throw an InvalidOperationException
  4. For each query generate an implicit Select call which includes all properties except for collection properties on the runtime generated type. If $expand is specified then "include" the specified collection property in the runtime generated type.
BlaiseD commented 1 year ago

It depends on whether the concept of navigation properties (both entities and collections) exists in Cosmos DB. If it does not then I would throw the exception. If it does exist then how is it represented as a LINQ query i.e. do what you would do manually.

Maybe it would help to observe the existing code with regard to select and expand, and what happens when none, all or a partial set have been specified.

wbuck commented 1 year ago

In this case there is no concept of a navigation property in Cosmos DB. You can have collections (primitives or objects) but they're not navigation properties. Nothing officially connects one document to another.

You can of course embed an ID of one document into another:

{
    "id": "some user id",
    "name": "bill",
    "posts": [ 
        "id": "some post id",
        "userId": "some user id"
    ] 
}

But you cannot "join" those documents together (either with the C# SDK or Cosmos DB SQL API).

E.g., Users.Include(u => u.Posts) Is not valid. Two separate queries would have to be made in this case, one to get the users and one to get the posts of a particular user.

First: Users.Where(u => u.Name.Contains("bill"); Followed by: Posts.Where(p => p.UserId == user.Id);

wbuck commented 1 year ago

@BlaiseD Do you know off the top of your head why GetAsync doesn't call GetQueryAsync and then just call ToListAsync directly on the returned IQueryable<T>?

BlaiseD commented 1 year ago

Get, GetAsync were the initial implementation, used joins (includes), call Map and return a collection. GetQuery, GetQueryAsync came later, use projection and return an IQueryable.

wbuck commented 1 year ago

Just doing manual testing at the moment, everything looks good. One question I do have is what Odata query options are currently supported?

I know that the following are supported:

I don't believe transformations are currently supported, e.g., $apply correct?

Maybe after this PR I'll look into adding transformations, but it'll probably require creating a dynamic runtime type (which I believe I could use this for). Also, returning an IQueryable<TModel> wouldn't be possible. But that's a problem for another day.

BlaiseD commented 1 year ago

I believe that's all for options. FYI - there's a discussion here about configuring the EnableQueryAttribute to ignore the supported options and handle the apply piece - as a workaround.

BlaiseD commented 1 year ago

Some dedicated classes may be worth considering for transforms. There are scenarios e.g. iOS Xamarin/Maui where anonymous types won't work.

wbuck commented 1 year ago

It looks as if I've found a bit of a bug in how we handle expansions. Currently in order to expand a complex type the following syntax is used $expand=MyComplexType\SomeNavigationProperty which we don't seem to currently handle correctly. I need to take a closer look at the LinqExtensions to get a better idea of how we're handling the segments.

I know we're assuming the following expansion syntax: $expand=MyType($expand=SomeNavigationProperty) but this of course implies that MyType is an entity and not a complex type.

This may be a specific CosmosDB concern as I don't believe a collection of complex types makes sense for a relational DB.

wbuck commented 1 year ago

Ok, so it looks like everything is working as expected (at least when tested manually). Now comes the harder portion, testing...

For other projects I've used Testcontainers to spin up a docker image running the Cosmos DB emulator.

I was thinking of doing the same here for integration testing. Thoughts?

BlaiseD commented 1 year ago

Don't think I've used one. The only restriction is that anyone should be able to clone the repository build and test. The tests e.g. AutoMapper.OData.EF6.Tests should also work in the CI and release builds - not the web tests we switch those off.

BTW adding this Cosmos DB piece to your repo for your own NuGet library is also an option - adding it here is great too.

wbuck commented 1 year ago

Using Docker containers (using the Testcontainers library) for testing I believe requires Docker to be installed. Unfortunately there is no in-memory Cosmos database, only the Cosmos DB emulator. My biggest concern was the CI.

Originally I was planning on adding the feature here but if you'd feel more comfortable with me creating a separate Nuget package I can do that as well.

BlaiseD commented 1 year ago

I'm not familiar with the technology so whatever works with the CI. The expression builder tests check a string representation of the expression (in this case IQueryable.Expression) so that's also an option - ultimately that's all the libraries are doing.

Cosmos DB looks like a good fit with the other two libraries. I brought up your own NuGet because of the whole dynamic: my lack of knowledge of Cosmos DB, your work etc.. in case you hadn't thought of it.

wbuck commented 1 year ago

So, it looks like the windows-latest runner already has docker installed, which means I could use the Cosmos DB emulator Docker container in both the CI and Release actions.

The issue, which may be a deal breaker, would be the requirement for the user to have docker installed on their machine in order to run the Cosmos DB tests.

I'm also looking into how I could potentially create a default CosmosClient which is what would provide me with the needed IQueryable in order to test against. If that works, I could generate a QueryDefinition (this is Cosmos specific) and then test against the generated SQL.

The problems here is I wouldn't be able to test the Get and GetAsync extensions methods. There isn't a lot of logic in those two methods as I just call GetQueryAsync and GetQuery and execute the returned IQueryable.

The other issue would be when using $sount as that'll hit the DB.

BlaiseD commented 1 year ago

Do you need Get and 'GetAsync'?. Maybe consider the following for unit tests:

For integration testing just whatever makes sense to you - maybe add a web project WebAPI.OData.CosmosDB and a corresponding web test project with setup instructions in the ReadMe for Docker and the rest. The web test and web project like the current ones should have Build unchecked in the solution configuration.

wbuck commented 1 year ago

During testing I've run into a bit of a snag with the following method defined in the LogicBuilder project. If our model has a collection property of literal type or complex type, those properties are not included in the Select query.

Currently the method is written as follows:

private static MemberInfo[] GetValueTypeMembers(this Type parentType)
    {
        if (parentType.IsList())
            return new MemberInfo[] { };

        return parentType.GetMemberInfos().Where
        (
            info => (info.MemberType == MemberTypes.Field || info.MemberType == MemberTypes.Property)
            && (info.GetMemberType().IsLiteralType() || info.GetMemberType() == typeof(byte[]))//Need typeof(byte[]) for SQL Server timestamp column
        ).ToArray();
    }

The issue is with this line: (info.GetMemberType().IsLiteralType() || info.GetMemberType() == typeof(byte[])). The collection properties (which are not navigation properties) are being filtered out.

In order for this to work 100% correctly I'd need the following:

private static MemberInfo[] GetValueTypeMembers(this Type parentType, IEnumerable<IEdmSchemaElement> elements)
    {
        if (parentType.IsList())
            return new MemberInfo[] { };

        return parentType.GetMemberInfos().Where
        (
            info => 
            (
                info.MemberType == MemberTypes.Field || info.MemberType == MemberTypes.Property
            )
            &&
            (
                info.GetMemberType().IsLiteralType() || info.IsListOfLiteralOrComplexTypes(elements)
            )
        ).ToArray();
    }

    private static bool IsListOfLiteralOrComplexTypes(this MemberInfo memberInfo, IEnumerable<IEdmSchemaElement> elements)
    {
        var memberType = memberInfo.GetMemberType();

        if (memberType == typeof(byte[]))
            return true;

        // Beyond this conditional we assume we're working with a non-relational database. 
        if (elements is null || !memberType.IsList())
            return false;

        var elementType = memberType.GetUnderlyingElementType();
        if (elementType.IsLiteralType())
            return true;

        return elements
            .OfType<IEdmComplexType>()
            .Any(e => e.FullTypeName().Equals(elementType.FullName, StringComparison.Ordinal));
    }

This includes collections of literal or complex types. Currently I've just moved a bunch of the code from the LogicBuiler into the new Cosmos project and changed their signatures to take a collection of IEdmSchemaElement's. The methods in question have to do with building the selects for each of the members.

Is that acceptable to you? Or would you prefer a PR to the LogicBuiler project which updates the required methods to take a collection of optional IEnumerable<IEdmSchemaElement> parameter?

BlaiseD commented 1 year ago

Ok to move the code for now at least. Then complex types are not considered expansions? If they are then maybe the solution is to handle $expand?

You'd mentioned we weren't handling complex types correctly - so may have to move it permanently anyway (or just fix expansions for complex types). I don't think the LogicBuilder project needs the EDM Schema knowledge.

wbuck commented 1 year ago

No, you can't $expand complex types unfortunately, only navigation properties.

The link you showed is for navigation properties nested in complex types. The syntax for that would be: \MyEntity?$expand=MyComplexType\MyNavigationProperty where MyNavigationProperty is an entity of type MyNavigation.

This does expand the MyComplexType as well but in order to do that an entity type would have to be embedded inside of it. If the complex type does not have an embedded navigation property inside of it then you'll be unable to expand the complex type.

For example:

class MyComplex
{
    public int SomeProperty { get; set; }
}

class MyEntity
{
    public Guid Id { get; set; }
    public MyComplex MyComplexType { get; set; }
}

Issuing the following query: \MyEntity?$expand=MyComplexType

Would throw an exception (only navigation properties can be expanded). This means the user of the API would never be able to see the contents of properties that are complex types.

The solution to that problem is to always "include" the complex type, which means creating a Select for each property in the complex type which will then get passed to ProjectTo.

Of course, you could imagine folks having complex types with deeper nesting than what I've shown. I'm looking at the recursive algorithm that's used for creating the Select's for each property (right now is doesn't go deep enough).

BlaiseD commented 1 year ago

Yep I think you're correct the expansions (entities plus complex properties) should be determined in this step before the lambda expressions get built.

wbuck commented 1 year ago

@BlaiseD Yeah that was exactly what I was thinking as well. It took a few days, but it looks like the algorithm for "including" the complex types is working correctly.

One issue (which I don't think I'd tackle for this PR and might actually require changes to ProjectTo) is the SQL that's generated. Although correct, it's really complex even for simple queries which will drive up the RU cost. But that's a problem for another day...

wbuck commented 1 year ago

So I'm getting there but this PR is going to be fairly large. It's difficult making smaller PR's when using the git flow though.. I guess when it finally comes time to actually create the PR you'll have to make the decision whether you feel comfortable accepting such a large PR.

Most of the changes are contained within the new Cosmos project but I did have to make a small change here and here. The issue was we weren't handling any literals except for $it.

This was problematic as that didn't allow for value collection filtering with the $this literal. I had to surface the literal name in the FilterHelper (not a perfect solution but my hands were tied because of the use of this).

Furthermore I had to make changes to the ChildCollectionFilterUpdater to also make filtering a literal collection possible.

BlaiseD commented 1 year ago

Should be fine if most of the PR is new code - best to minimize the changes to existing logic.

wbuck commented 1 year ago

So there appears to be an issue with how we handle sub filters on enum collections. This is not an issue when filtering on root, only when filtering a child member collection.

Take the following Edm Model:

enum MyEnum
{
    Variant1,
    Variant2
}

class ChildEntity
{
    public Guid Id { get; set; }
    public List<MyEnum> MyEnums { get; set; }
}

class RootEntity 
{
    public Guid Id { get; set; }
    public List<ChildEntity> Children { get; set; }
}

With this filter: entities?$expand=Children($filter=MyEnums/any(value: value eq 'Value1'))

Currently we're not casting the enum to an integer and we're not creating a constant integer to represent Value1 in the generated Expression. This causes the query to successfully execute but no results are returned.

We do have some tests which are testing the built expression to ensure we're comparing the enum member property to one of the enum's variants here, but I think this is incorrect.

Right now what I've done is change the generated expression to insert a Convert expression in the FilterHelper. I took the idea from what is already being done with certain date types here

BlaiseD commented 1 year ago

Correct - at the root we still use FilterQueryOption.ApplyTo - not available at the nested level. We're not using FilterHelper at the root because like you've just proved, FilterQueryOption.ApplyTo may handle cases we have missed.

BlaiseD commented 1 year ago

I see the following logic in FilterHelper though - so not sure if is running in your scenario:

                    if (conversionType.IsEnum)
                    {
                        if (!(sourceNode is ConstantNode enumSourceNode))
                        {
                            if (GetClrType(sourceNode.TypeReference) == typeof(string))
                                return new ConstantOperator(null);

                            throw new ArgumentException("Expected ConstantNode for enum source node.");
                        }

                        return new ConvertToEnumOperator
                        (
                            GetConstantNodeValue(enumSourceNode, conversionType),
                            conversionType
                        );
                    }
wbuck commented 1 year ago

Yeah that was one of the first things I saw so I put a break point in there last night and it never hit. That piece of code you showed would only get hit if the user is manually casting:

entities?$expand=Children($filter=MyEnums/any(value: cast(value, Edm.String) eq 'Value1'))

What I've tentatively done is create a new type implementing the IExpressionPart interface which inserts the needed convert expressions after we've constructed the IExpressionPart tree.

Ideally the convert node would be added while we're building the tree of course but this is a more complicated task. I'll take a closer look at that later today.

BlaiseD commented 1 year ago

Yes. It should be inserted wherever it's recognized as a Microsoft.OData.UriParser.QueryNode.

BlaiseD commented 1 year ago

I'm assuming the following was deleted by you?

"I wonder if it's a bug in the OData parser as it never gets recognized as a QueryNode".

I don't see it in the thread.

The reference should be here. Then special handling for enums within that code.

Also, the OData repository could be helpful if it works at the root level.

wbuck commented 1 year ago

Yeah I deleted those because I misunderstood the question initially. I have things mostly working but I hit a bit of a road block when it comes to Nullable types at run time.

wbuck commented 1 year ago

This is a bit of an interesting problem. Currently we have the following filter we use in a test:

var filter = GetFilter<DataTypes>("NullableSimpleEnumProp in ('First', 'Second')");

We're taking a Nullable enum and checking to see if a List<Nullable<enum>> contains said value. What's complicated here is dealing with the conversion from a Nullable enum value to the Nullable underlying type (an int in this case).

There isn't a way to convert an int constant to its nullable counterpart without boxing which coincidently strips away Nullable (which makes sense).

So, in the case of a Nullable enum type in a filter I could just box the underlying enum value and compare that to a List<object{underlying value}> .

Or, I could build a more complicated IExpressionPart where I convert the Nullable<enum> member to Nullable<underlying type> but wrap that in a conditional. So something like this pseudo code:

List<int> values = new() {...};
Nullable<int> prop == null ? false : values.Contains((int)prop);
BlaiseD commented 1 year ago

Doubt you need a new IExpressionPart. This works in the utils library:

        [Fact]
        public void EnumInExpression_NullableEnum_WithNullable_IntegerList()
        {
            //act
            var filter = CreateFilter<DataTypes>();
            var constant = (ConstantExpression)((MethodCallExpression)filter.Body).Arguments[0];
            var values = (IList<int?>)constant.Value;

            //assert
            AssertFilterStringIsCorrect(filter, "$it => System.Collections.Generic.List`1[System.Nullable`1[System.Int32]].Contains(Convert($it.NullableSimpleEnumProp))");
            Assert.Equal(new int?[] { 1, 2 }, values);

            Expression<Func<T, bool>> CreateFilter<T>()
                => GetFilter<T>
                (
                    new InOperator
                    (
                        new ConvertOperator 
                        (
                            new MemberSelectorOperator("NullableSimpleEnumProp", new ParameterOperator(parameters, parameterName)), 
                            typeof(int?)
                        ),
                        new CollectionConstantOperator(new List<object> { 1, 2 }, typeof(int?))
                    ),
                    parameters
                );
        }

The cast should probably come from the request. In the worst case the FilterHelper will need an update.

wbuck commented 1 year ago

By default I'm casting all enum operations to their underlying type. So this: SimpleEnumProp in ('First', 'Second') becomes: int in (1, 2) unless the user has specifically requested a cast.

Same with this: SimpleEnumProp eq 'First becomes int eq 1, etc.

If a cast was requested nothing is changed, so this: cast(SimpleEnumProp, Edm.String) eq 'First' is left as is as it will do the right thing.

Actually, the test you should is essentially what I changed in the FilterHelper, except it does determines if the type is Nullable or not.

Just tested this and it works well, thank-you.

wbuck commented 1 year ago

I'm a bit confused by the ConvertToStringOperator.

If the user specified cast(EnumProp, Edm.String) I believe the expectation would be to convert the variant name to string and not the variants underlying integer value.

So, if the user was storing their enum's as string's in the DB they'd be unable to perform the following query:

Query cast(EnumProp, EdmString) eq 'SomeValue'

Expected Expression $it.EnumProp.ToString() == "SomeValue"

Or am I not understanding something?

I believe the ConvertToStringOperator should be:

public sealed class ConvertToStringOperator : IExpressionPart
{
    private readonly IExpressionPart source;

    public ConvertToStringOperator(IExpressionPart source) =>        
        this.source = source;

    public Expression Build() => Build(this.source.Build());

    private Expression Build(Expression expression) => expression.Type switch
    {
        Type type when type.IsNullableType() => ConvertNullable(expression),
        _ => expression.GetObjectToStringCall()
    };

    private static Expression ConvertNullable(Expression expression)
    {
        Expression memberAccess = expression.MakeValueSelectorAccessIfNullable();
        return Expression.Condition
        (
            expression.MakeHasValueSelector(),
            memberAccess.GetObjectToStringCall(),
            Expression.Constant(null, typeof(string))
        );
    }
}
BlaiseD commented 1 year ago

After any changes, the functionality should more closely mirror the OData repo rather than stray from it.

I'm quite sure the SQL Server provider has trouble translating - among other things - MyEnumInstance.ToString() into SQL. Here's an example.

I think, you're correct - we may want to submit a separate PR for changes to the existing code before the new stuff.

wbuck commented 1 year ago

Now, in the issue you linked to I think the issue was with using Enum.TryParse which I would expect EF to choke on. I've used ToString in the past with the SQL and obviously the Cosmos provider and it seemed to work correctly.

Currently in order to support enum's and the $this literal in subexpressions I had to make a few changes to the FilterHelper. This includes the inclusion of two new operators.

The ToStringConvertOperator which I've shown above and the ConvertEnumToUnderlyingOperator. I'm using the ConvertEnumToUnderlyingOperator here and here.

The reason behind the location was attempting to convert the enum to its underlying representation got really messy when dealing with cast's. The algorithm currently works on each node of the tree independently, so you don't know what came before or what comes after so you can't make an intelligent decision on whether you should or should not convert the enum to its underlying representation.

Do you want me to make the changes to the FilterHelper in a separate PR? This way if something goes awry it will be easier to determine which PR caused the issue.

wbuck commented 1 year ago

OK what I'm going to do is create two new issues (for tracking purposes).

The first issue will be adding support for the $this literal. The second issue will be about how we handle enum's in a subquery.

wbuck commented 1 year ago

@BlaiseD So I was thinking of submitting a WIP PR for this feature. This would allow you to get some insight in to what I've been doing over the last couple months and will allow you to ask questions or raise some concerns you may have.

I'm still in progress with this feature but for the most part it's fully functional.

One thing I'm still on the fence about is how to handle queries such as top, skip and orderby is subqueries. Cosmos DB does not support those in sub queries so I'm unsure if I should just leave what's currently there and have Cosmos DB return an error or if I should throw a meaningful exception.

Regardless, I just wanted to get a WIP PR in so that you could see what's happening.

BlaiseD commented 1 year ago

Regarding unsupported features I would look at what ASP.NET Odata does - not sure if there's any special handling.

For WIP - better to submit a PR for the shared/existing classes first then a second one for the remaining Cosmos DB specific code.

wbuck commented 1 year ago

@BlaiseD So I haven't made any significant changes to any existing code. The changes that were made are just some minor extension methods added to TypeExtensions and I think that's it.

I tried to keep everything as self contained as possible without code duplication.