OData / AspNetCoreOData

ASP.NET Core OData: A server library built upon ODataLib and ASP.NET Core
Other
454 stars 159 forks source link

Odata Response in API with high volume of data from multiple tables #1094

Open DevDavee opened 10 months ago

DevDavee commented 10 months ago

Assemblies affected net6.0 Microsoft.AspNetCore.OData" Version="8.0.10" Microsoft.EntityFrameworkCore Version="6.0.0"

Odata works good when data is fetching from single tables eg Person, Author etc Its features like $count,select,$filter,$top,$skip works fine.

I have a query which is a result of joins from multiple tables and it took more than 5 mins to return the results in DB. 5,00,000 rows. I can't use Store Procedure due to access issues.

I am using raw sql and the results should match all Odata features.

Approach 1 Fetching all the data, it fails as response time is too high from DB `
Controller [EnableQuery] public IQueryable Get() { return _repo.Get();
}

Repo

public IQueryable Get() {

  var result = _context.Entity.FromSqlRaw("select a.id,b.firstName,c.LastName,d.gender from table a inner join table b on a.id= b.id inner join table b on .id=a.id");    
  return result.AsQueryable();

} `

Approach 2

Top N with PageSize `public IQueryable Get() {

  var result = _context.Entity.FromSqlRaw("select a.id,b.firstName,c.LastName,d.gender from table a inner join table b on a.id= b.id inner join table b on .id=a.id limit 10");    
  return result.AsQueryable();

} ` It fetches the query with Top 10 count with Server Side [EnableQuery(PageSize = 5)] Response { "@odata.context": "https://localhost:7086/odata/$metadata#Person", "@odata.count": 10, "value": [] "@odata.nextLink": "https://localhost:7086/odata/Person?$count=true&$skip=5" } On https://localhost:7085/odata/Person?$count=true&$skip=5 it doesnt shows more records as Top 10 is called.

Issues

  1. $count shows 10 which is coming from Top 10 from my query, actual count is 5,00,000 in DB
  2. If some one filters 11th record which exists in DB but not in result as its Top 10, Record is not found. https://localhost:7086/odata/Person?$filter=id eq 11

How do I take care of this situation where I get the actual count of the data and Odata features.

julealgon commented 10 months ago

First, please note that whatever slowdown you are seeing is not caused by OData, but by the database call itself. OData works extremely well with many complex relationships, but you need to optimize your database access patterns if you want to be able to use complex queries with OData over it.

As for handling the count logic, note that OData will automatically call Count on the IQueryable normally before applying filters, but since you are providing a hardcoded raw SQL there, it will end up calling Count on the results of your hardcoded query.

Making OData work 100% with manual SQL statements while possible, is non trivial. You'd have to pass in ODataQueryOptions<T> down to your repository layer and avoid relying on the [EnableQuery] attribute as much as possible.

I would strongly recommend looking into alternative solutions to your issue that still allow you to leverage EFCore and get an IQueryable from it that OData can much more easily manage automatically.

I have a query which is a result of joins from multiple tables and it took more than 5 mins to return the results in DB. 5,00,000 rows.

This points to some heavy inefficiency in your query or database model. I'd start by asking (in more appropriate forums, perhaps dba.stackexchange) on how to optimize your query in the database first. For example, if the data spans several years, leveraging table partitioning can help. If you are filtering by multiple columns, you should have proper indexes in the affected tables covering for those columns.

DevDavee commented 10 months ago

First, please note that whatever....

Yes you are right the issue is from DB side. Odata has PageSize attribute but as per my observation it gets all the data from Db and applies Page Size from in memory and displays those records to the client. Eg Total records in DB is 5000 and Page Size = 50 then it gets all the 5000 rows from the DB and then filters 50 records from the memory and shows it to client(Postman) . PageSize in Odata is Server Side Pagination.

That's why I tried Approach 2 Top N with PageSize Using Top 100 is not a good solution as if use filters 101st record they won't get it as Top 100 is applied and 101st record is available in the DB.

Approach 3 {url}?pageNumber=2&pageSize=2

Am thinking to add paging like this to the query but doubt is :-

  1. If a record exits on the 3 or Nth page which user wants to search will they get it.
  2. OData features like $filter,$order,$count,$select work with {url}?pageNumber=2&pageSize=2?$select {url}?pageNumber=2&pageSize=2?$count {url}?pageNumber=2&pageSize=2?$filter it can be in combinations too. Too many scenarios.
julealgon commented 10 months ago

Odata has PageSize attribute but as per my observation it gets all the data from Db and applies Page Size from in memory and displays those records to the client.

Well... no framework has any means to change the behavior of your explicit SQL there. This is why I mentioned to try to avoid explicit SQL otherwise you have to parameterize everything yourself. OData is not "magical": it usually works on top of IQueryable but before it is sent to the database. The problem with your example is that you are making a query in the database, materializing all the results, and then transforming that into a IQueryable: anything else you apply to that queryable will only ever execute in-memory, because it was already sent to the database in a previous step.

If you really want to use raw SQL with OData, it looks like you'll have to dedicate some time to understanding more of the fundamentals involved here. It is a vastly more advanced scenario than "normal" OData with EF + IQueryable is. I would, again, advise against it.

DevDavee commented 10 months ago

Odata has PageSize attribute but as per my observation it gets all the data from Db and applies Page Size from in memory and displays those records to the client.

Well... no framework has any means to change the behavior of your explicit SQL there. This is why I mentioned to try to avoid explicit SQL otherwise you have to parameterize everything yourself. OData is not "magical": it usually works on top of IQueryable but before it is sent to the database. The problem with your example is that you are making a query in the database, materializing all the results, and then transforming that into a IQueryable: anything else you apply to that queryable will only ever execute in-memory, because it was already sent to the database in a previous step.

If you really want to use raw SQL with OData, it looks like you'll have to dedicate some time to understanding more of the fundamentals involved here. It is a vastly more advanced scenario than "normal" OData with EF + IQueryable is. I would, again, advise against it.

I can't use stored Procedure here so only was is to use Raw SQL with fixed Page Size = 10 eg query limit 10 offset 1 and take Page Size = 1 from the user but then its not Odata Compliant response is missing:- $count can't get as I am passing the whole query with the pagesize.

{ "@odata.context": "https://localhost:7044/odata/$metadata#", }