DataObjects-NET / dataobjects-net

https://dataobjects.net
MIT License
60 stars 23 forks source link

Make CompiledQueryRunner public: allow compiling queries without executing them #294

Open ondrejtucny opened 1 year ago

ondrejtucny commented 1 year ago

My use case: I have a facility that retrieves data from many entities, and allows for several usage patterns, such as: 1) retrieve page-by-page, 2) retrieve all, 3) retrieve range (skip+take), or 4) retrieve one item. In the beginning, I don't know which usage pattern it is going to be — this is up to a remote client calling my service. The client does specify retrieval conditions, such as filters and sort order. Hence, I need to construct the query dynamically.

Hence, the call sequence of my facility is, for example, as follows:

  1. Initialize(setup) — setting up the query
  2. GetCount() — retrieve total amount of items
  3. GetPage(index) — retrieve a page of data
  4. GetPage(index) — retrieve another page of data
  5. GetDetail(id) —retrieve some more details for a particular item

Now in the Initialize() method I want to prebuild queries which I will need later. I do not want to execute them, just prepare and compile the query. Later, when (if) the client calls the GetPage() method, I execute the relevant query, submitting the page index.

Currently, I have to delay query construction until the first invocation of the method which uses it, e.g. GetPage(). For my case, this is suboptimal, because the client can freely mix different usage patterns, and because my Initialize() method can run in parallel, I would like to prepare all needed queries at once. Another important aspect is that building the query is a fairly complex task, due to the user-submitted filter, and I want to compute this just once as a base, and reuse for all three queries.

I believe that either making CompiledQueryRunner, or adding a CreateCompiledQuery() method on QueryEndpoint could solve this case.

alex-kulakov commented 1 year ago

I'm not sure that for this case we will open internal infrastructure completely. Making this public might tight our hands for free changes of the class.

Do you use session.Query.Execute(q => q.All()) methods which caches queries or regular session.Query.All()?

alex-kulakov commented 1 year ago

Do you know about ability to override QueryRoot? Maybe you find it useful. What it does in simple words is basically replace session.Query.All() with session.Query.All().Where(e => e.IsSomeFlag). After overriding all the queries within overriding region will have .Where() part. It could be one of choices for your case, couldn't it?

You can take a look at examples here in Xtensive.Orm.Tests.Linq.QueryRootOverriding

ondrejtucny commented 1 year ago

Maybe I am missing something. My goal is to make a pre-compiled (cached) query, which can be reused many times with actual parameters inserted at execution time.

I did use session.Query.Execute(…), but (a) that does not allow me to submit arguments from outside, and (b) does not allow me to pre-compile a query for execution later. By 'later' I mean from the context of another request coming to the back-end.

alex-kulakov commented 1 year ago

I have a feeling that we both are missing something.

The cache, which is used by CompiledQueryRunner, stores complete queries (except for parameters what can be changed and some other things). That means the structure of a cached query is the same every time calling code refers to it to be executed. Queries with dynamic filters are hard to cache, for fully dynamic it might be not worth it.

For example, this code, which returns filtered customers in form of DTOs and where filteringValues parameter contains field names and corresponding collection of values the field should be:

    public IReadOnlyList<CustomerListItemDto> GetAllFilteredBy(IDictionary<string, IReadOnlyList<object>> filteringValues)
    {
      using (var tx = session.OpenTransaction()) {
        var baseQuery = session.Query.All<Customer>();
        foreach (var filter in filteringValues) {
          baseQuery = baseQuery.Where(ConstructExpression(filter.Key, filter.Value));
        };
        return baseQuery.Select(c => c.ToListItemDto()).ToList();
      }
    }

    private Expression<Func<Customer, bool>> ConstructExpression(string fieldName, IReadOnlyList<object> values)
    {
      var customerType = typeof(Customer);
      var customerTypeInfo = session.Domain.Model.Types[customerType];
      var filteredField = customerTypeInfo.Fields[fieldName];
      var property = filteredField.UnderlyingProperty;
      var valueType = filteredField.ValueType;

      var p = Expression.Parameter(customerType, "c");
      if (values.Count == 0)
        throw new ArgumentException();

      var fieldAccess = Expression.Property(p, property);

      Expression resultFilter = null;
      foreach (var item in values)
        if (resultFilter!=null)
          resultFilter = Expression.OrElse(resultFilter, Expression.Equal(fieldAccess, Expression.Convert(Expression.Constant(item), valueType)));
        else
          resultFilter = Expression.Equal(fieldAccess, Expression.Convert(Expression.Constant(item), valueType));  
        return (Expression<Func<Customer, bool>>) Expression.Lambda(resultFilter, p);
      }

It is hard to cache because number of fields in the dictionary can vary, number of values per every field can vary. The only way to reuse something which is in cache is to do additional calculations on client side which is not very good - waste of memory and time.

Second thing is that nothing in CompiledQueryRunner would allow you to just cache query if we did it public. It has public Execute methods and public CreateDelayedQuery methods. The second group might confuse you, but it does not create a query that can be cached, it's a way to create batches of SQL SELECT statements to execute them at once.

Knowing all above, I don't think that making CompiledQueryRunner public class is a good idea.

Also, I hardly can imagine the structure of your service to help you with a better solution, I'm aware of the approach you have because it is similar to ASP.NET MVC controllers and actions within them - parallel execution, parameters in action, randomly accessed actions (one of our websites uses it), but implementations can differ. If you want me to consult you it is possible but in this case, I will need at least general design (this discussion should be not in the Issues of our repository).

ondrejtucny commented 11 months ago

This is related: #125 and #142.

alex-kulakov commented 10 months ago

This PR was a concept I played with for one of our customers, main reason of publishing it was to get some feedback form this customer and, unfortunately, I didn't get it. Maybe you have some feedback on what this PR contains, mainly from user side, whether it is usable and what drawbacks/inconveniences/etc. it has. I would appreciate it. And maybe it will get to master with some adjustments to be useful for not only you and me but other DO users.