cap-js / cds-dbs

Monorepo for SQL Database Services for CAP
https://cap.cloud.sap/docs/
Apache License 2.0
32 stars 9 forks source link

odata `groupBy` causes `subquery uses ungrouped column "__select__.author_id" from outer query` #708

Closed hakimio closed 5 days ago

hakimio commented 1 week ago

Description of erroneous behaviour

CDS server crashes with error TypeError: transformedQuery.SELECT.from.ref is not iterable when trying to groupBy on a child association.

Stacktrace:

[cds] - TypeError: transformedQuery.SELECT.from.ref is not iterable
    at expandColumn (node_modules\@cap-js\db-service\lib\cqn4sql.js:790:41)
    at handleExpand (node_modules\@cap-js\db-service\lib\cqn4sql.js:420:40)
    at node_modules\@cap-js\db-service\lib\cqn4sql.js:354:32
    at getTransformedColumns (node_modules\@cap-js\db-service\lib\cqn4sql.js:382:21)
    at transformSelectQuery (node_modules\@cap-js\db-service\lib\cqn4sql.js:156:41)
    at cqn4sql (node_modules\@cap-js\db-service\lib\cqn4sql.js:73:26)
    at PostgresService.cqn4sql (node_modules\@cap-js\db-service\lib\SQLService.js:375:12)
    at PostgresService.cqn2sql (node_modules\@cap-js\db-service\lib\SQLService.js:351:18)
    at PostgresService.onSELECT (node_modules\@cap-js\db-service\lib\SQLService.js:127:39)
    at PostgresService.onSELECT (node_modules\@cap-js\postgres\lib\PostgresService.js:314:18) {
  id: '1043785',
  level: 'ERROR',
  timestamp: 1719308262527
}

Detailed steps to reproduce

Try to run an odata query similar to the following:

$apply: filter(foo eq 'bar')/groupby((user/name),aggregate(amount with sum as totalAmount))

If you change groupby((user/name) with groupby((user_ID) it works as expected.

Details about your project

Package Version
OData version v4
@cap-js/cds-typer 0.22.0
@cap-js/cds-types 0.2.0
@cap-js/postgres 1.8.0
@sap/cds 7.9.2
@sap/cds-compiler 4.9.4
@sap/cds-dk 7.9.3
@sap/cds-dk (global) 7.9.2
@sap/cds-fiori 1.2.4
@sap/cds-foss 5.0.1
@sap/cds-mtxs 1.18.1
@sap/eslint-plugin-cds 3.0.4
Node.js v20.13.1
patricebender commented 1 week ago

Hi, could you please let me know which version of the db-service you are using (its not part of the version table you've provided)

hakimio commented 1 week ago

My db-service is at v1.10.1. BTW, just checked capire and this query should work correctly: https://cap.cloud.sap/docs/advanced/odata#example-1

hakimio commented 1 week ago

Also, is odata v4 filter in implemented? Doesn't seem to work for me either. My CDS config:

    "features": {
        "assert_integrity": "db",
        "odata_new_parser": true,
        "odata_new_adapter": false,
        "string_decimals": false
    }
patricebender commented 1 week ago

My db-service is at v1.10.1. BTW, just checked capire and this query should work correctly: https://cap.cloud.sap/docs/advanced/odata#example-1

yes, this should be supported. I have identified your issue and prepared a fix in #709 Once this change is merged, I will ship a release.

Thank you for reporting this issue 😊

Also, is odata v4 filter in implemented? Doesn't seem to work for me either. My CDS config:

    "features": {
        "assert_integrity": "db",
        "odata_new_parser": true,
        "odata_new_adapter": false,
        "string_decimals": false
    }

Sorry, I can't help you as I'm no expert with the ODATA adapter. Maybe @johannes-vogel can help you out.

hakimio commented 1 week ago

@patricebender unfortunately this is still not working:

Query:

$apply: filter(start ge 2024-06-01T00:00:00Z and start le 2024-06-30T23:59:59Z)/groupby((user/name),aggregate(amount with sum as totalAmount))

Error:

[cds] - error: subquery uses ungrouped column "__select__.user_id" from outer query
    at node_modules\pg\lib\client.js:526:17
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Object.all (node_modules\@cap-js\postgres\lib\PostgresService.js:174:26)
    at async PostgresService.onSELECT (node_modules\@cap-js\db-service\lib\SQLService.js:132:16)
    at async next (node_modules\@sap\cds\lib\srv\srv-dispatch.js:68:17)
    at async PostgresService.handle (node_modules\@sap\cds\lib\srv\srv-dispatch.js:66:10)
    at async ApplicationService.<anonymous> (node_modules\@sap\cds\libx\_runtime\common\generic\crud.js:73:16)
    at async next (node_modules\@sap\cds\lib\srv\srv-dispatch.js:68:17)
    at async ApplicationService.handle (node_modules\@sap\cds\lib\srv\srv-dispatch.js:66:10)
    at async ApplicationService.handle (node_modules\@sap\cds\libx\_runtime\common\Service.js:87:16) {
  length: 152,
  severity: 'ERROR',
  code: '42803',
  detail: undefined,
  hint: undefined,
  position: '255',
  internalPosition: undefined,
  internalQuery: undefined,
  where: undefined,
  schema: undefined,
  table: undefined,
  column: undefined,
  dataType: undefined,
  constraint: undefined,
  file: 'parse_agg.c',
  line: '1450',
  routine: 'check_ungrouped_columns_walker',
  query: 'SELECT to_jsonb(root.*) as _json_ FROM (SELECT jsonb(root."user") as "user",root.totalAmount as "totalAmount" FROM (SELECT (SELECT to_jsonb(user2.*) as _json_ FROM (SELECT user2.name as "name" FROM (SELECT user2.name FROM CrmService_Users as user2 WHERE __select__.user_ID =
 user2.ID LIMIT $1) as user2) as user2) as "user",sum(__select__.amount) as totalAmount FROM (SELECT TimeTransactions.ID,TimeTransactions.oldCrmId,TimeTransactions.createdAt,TimeTransactions.createdBy,TimeTransactions.modifiedAt,TimeTransactions.modifiedBy,TimeTransactions.timeVault_
ID,TimeTransactions.amount,TimeTransactions.user_ID,TimeTransactions.comment_ID,TimeTransactions.start,TimeTransactions."end",TimeTransactions.type,TimeTransactions.topic_ID FROM CrmService_TimeTransactions as TimeTransactions WHERE TimeTransactions.start >= $2 and TimeTransactions.start <= $3 ORDER BY TimeTransactions.ID ASC) as __select__ left JOIN CrmService_Users as "user" ON "user".ID = __select__.user_ID GROUP BY "user".name LIMIT $4) as root) as root\n' +
    ' ^',
  id: '1366272',
  level: 'ERROR',
  timestamp: 1719383922927
}
patricebender commented 1 week ago

I was able to reproduce your issue. The problem is - as the error states - that the group by clause of the outer query is missing the author.ID which is used to build the correlated subquery for the expand.

Long story short, I was able to reproduce your issue, in my bookshop.

  1. http://localhost:4004/admin/Books?$apply=filter(title%20ne%20%27bar%27)/groupby((author/name))
  2. observe the error: subquery uses ungrouped column "__select__.author_id" from outer query

subquery:

SELECT author2.name
FROM AdminService_Authors as author2
WHERE __select__.author_ID = author2.ID
LIMIT $1

main query __select__:

SELECT (
-- expand subquery from above
) as author
FROM (
   

) as __select__
left JOIN 

GROUP BY author.name -- this is missing the author.ID

For me, including the author/ID in my odata request, did the trick

- http://localhost:4004/admin/Books?$apply=filter(title%20ne%20%27bar%27)/groupby((author/name))
+ http://localhost:4004/admin/Books?$apply=filter(title%20ne%20%27bar%27)/groupby((author/name,author/ID))
{
  "value": [
    {
      "author": {
        "ID": 150,
        "name": "Edgar Allen Poe"
      }
    },
    {
      "author": {
        "ID": 107,
        "name": "Charlotte Brontë"
      }
    },
    {
      "author": {
        "ID": 101,
        "name": "Emily Brontë"
      }
    },
    {
      "author": {
        "ID": 170,
        "name": "Richard Carpenter"
      }
    }
  ],
  "@odata.context": "$metadata#Books(author(name,ID))"
}

Could you try that? I will ask my colleagues if we can give you some convenience for this in the future, as this is a technical detail which one might not expect.

hakimio commented 1 week ago

While yes, including author/ID, is a valid workaround which works for me as well, groupby(author/name) should also work since it is documented in the capire docs and odata docs. In this particular case groupby(author/name) and groupby(author/name, author/ID) are equivalent but there are cases when you want to group by attributes which will not have the equivalent aggregation result.

patricebender commented 1 week ago

Of course, you are correct. I tried the same on SQlite and there doesn't seem to be a problem. I will bring this to the team and we will discuss this. Hope you can continue with the workaround, for now.

hakimio commented 1 week ago

Unfortunately, because of the mentioned limitation, the workaround doesn't cover all my use-cases.

hakimio commented 1 week ago

Here is another groupby query which results in a different error:

Query:

$apply=filter(title ne 'bar')/groupby((author/country/name, author/country/ID))

Error:

[cds] - Error: An association can't be used as a value in an expression
    at rejectAssocInExpression (node_modules\@cap-js\db-service\lib\cqn4sql.js:1540:13)
    at assertNoStructInXpr (node_modules\@cap-js\db-service\lib\cqn4sql.js:1534:7)
    at getTransformedTokenStream (node_modules\@cap-js\db-service\lib\cqn4sql.js:1396:11)
    at cqn4sql (node_modules\@cap-js\db-service\lib\cqn4sql.js:64:31)
    at transformSubquery (node_modules\@cap-js\db-service\lib\cqn4sql.js:969:12)
    at getTransformedFrom (node_modules\@cap-js\db-service\lib\cqn4sql.js:1578:25)
    at cqn4sql (node_modules\@cap-js\db-service\lib\cqn4sql.js:69:51)
    at PostgresService.cqn4sql (node_modules\@cap-js\db-service\lib\SQLService.js:375:12)
    at PostgresService.cqn2sql (node_modules\@cap-js\db-service\lib\SQLService.js:351:18)
    at PostgresService.onSELECT (node_modules\@cap-js\db-service\lib\SQLService.js:127:39) {
  id: '1290165',
  level: 'ERROR',
  timestamp: 1719421678688
}
patricebender commented 1 week ago

Here is another groupby query which results in a different error:

Query:

$apply=filter(title ne 'bar')/groupby((author/country/name, author/country/ID))

Error:

[cds] - Error: An association can't be used as a value in an expression
    at rejectAssocInExpression (node_modules\@cap-js\db-service\lib\cqn4sql.js:1540:13)
    at assertNoStructInXpr (node_modules\@cap-js\db-service\lib\cqn4sql.js:1534:7)
    at getTransformedTokenStream (node_modules\@cap-js\db-service\lib\cqn4sql.js:1396:11)
    at cqn4sql (node_modules\@cap-js\db-service\lib\cqn4sql.js:64:31)
    at transformSubquery (node_modules\@cap-js\db-service\lib\cqn4sql.js:969:12)
    at getTransformedFrom (node_modules\@cap-js\db-service\lib\cqn4sql.js:1578:25)
    at cqn4sql (node_modules\@cap-js\db-service\lib\cqn4sql.js:69:51)
    at PostgresService.cqn4sql (node_modules\@cap-js\db-service\lib\SQLService.js:375:12)
    at PostgresService.cqn2sql (node_modules\@cap-js\db-service\lib\SQLService.js:351:18)
    at PostgresService.onSELECT (node_modules\@cap-js\db-service\lib\SQLService.js:127:39) {
  id: '1290165',
  level: 'ERROR',
  timestamp: 1719421678688
}

I was not able to reproduce this specific scenario (can you provide a sample?). However, with the same query I was running in a different error which is fixed by #715, I will continue investigating in keep you updated. I'm afraid, for the postgres issue, we do not have a quick fix available. I already found out that SQlite can work with the exact same query. Once #712 is fixed, I will also give this a try on HANA.

hakimio commented 1 week ago

I'll just wait for the fix in #715 to land and if the error still persists, I'll see if I can make a better reproduction.

hakimio commented 1 week ago

@patricebender a small update: the previous error was unrelated to groupby, it was caused by a filter similar to filter(author ne null) which, based on the error, I assume is invalid.

Coming back to groupby issue. groupby((author/country/name, author/country/ID)) throws an error, but groupby((author/ID, author/country/name, author/country/ID)) works as intended.

On slightly related note, can you tell me if the totalValue calculated column is valid in the following cds schema:

entity Customers : cuid, managed {
    contracts       : Composition of many Contracts
                          on contracts.customer = $self;
    totalValue      : Decimal(15, 2) = sum(
        contracts[status = 'Active' and orderProduct.product.isRecurring = true].orderProduct.price
    );
}

entity Contracts : cuid, managed {
    status               : ContractStatus;
    orderProduct         : Association to OrderProducts;
}

entity OrderProducts : cuid, managed {
    product           : Association to Products;
}

entity Products : cuid, managed {
    isRecurring                : Boolean default true;
}

No errors shown by the SAP language service, but trying to generate migration with cds deploy --dry --delta-from cds-model.csn > delta.sql throws an error:

crm-service.cds:14:44: Error: “CrmService.Customers”: “product” is not foreign key of managed association “orderProduct” in path “orderProduct.product.isRecurring” (in entity:“CrmService.Customers”/query:1)
crm-service.cds:14:44: Error: “CrmService.Customers”: “isRecurring” is not foreign key of managed association “product” in path “orderProduct.product.isRecurring” (in entity:“CrmService.Customers”/query:1)
    at throwWithError (node_modules\@sap\cds-compiler\lib\base\messages.js:559:13)
    at translateAssocsToJoins (node_modules\@sap\cds-compiler\lib\transform\translateAssocsToJoins.js:72:3)
    at translateAssocsToJoinsCSN (node_modules\@sap\cds-compiler\lib\transform\translateAssocsToJoins.js:26:3)
    at handleAssocToJoins (node_modules\@sap\cds-compiler\lib\transform\forRelationalDB.js:694:20)
    at Proxy.transformForRelationalDBWithCsn (node_modules\@sap\cds-compiler\lib\transform\forRelationalDB.js:227:5)
    at csnForSql (node_modules\@sap\cds-compiler\lib\api\main.js:209:42)
    at sqlMigration (node_modules\@sap\cds-compiler\lib\api\main.js:476:20)
    at Function.api [as migration] (node_modules\@sap\cds-compiler\lib\api\main.js:1127:22)
    at Function.migration (node_modules\@sap\cds-compiler\lib\main.js:142:41)
    at Object.deltaSql (node_modules\@sap\cds\lib\compile\cdsc.js:162:53)
patricebender commented 1 week ago

compiling your model with cds compile -2 sql does not work:

❯ cds compile -2 sql e.cds
[ERROR] e.cds:35:13: “V”: “isRecurring” is not foreign key of managed association “product” in path “orderProduct.product.isRecurring” (in entity:“V”/query:1)
[ERROR] e.cds:35:13: “V”: “product” is not foreign key of managed association “orderProduct” in path “orderProduct.product.isRecurring” (in entity:“V”/query:1)

The calculated element is not valid. That is because it navigates to a non-foreign-key in path orderProduct.product.isRecurring where the association product is not fk of orderProduct and isRecurring is not a fk element of product.

This is a known limitation. The filter path along contracts[
].orderProduct induces a join by itself, with [
] being part of the join node.

this would be the resulting view if each element along the path in question would be a foreign key element:

CREATE VIEW V AS
SELECT Customers_0.ID,
  sum(orderProduct_2.price) AS totalValue
FROM (
    (
      Customers AS Customers_0
      LEFT JOIN Contracts AS contracts_1 ON ((contracts_1.customer_ID = Customers_0.ID))
      AND (
        contracts_1.status = 'Active'
        AND contracts_1.orderProduct_product_isRecurring = TRUE
      )
    )
    LEFT JOIN OrderProducts AS orderProduct_2 ON (contracts_1.orderProduct_ID = orderProduct_2.ID)
    AND (
      contracts_1.orderProduct_product_ID = orderProduct_2.product_ID
    )
    AND (
      contracts_1.orderProduct_product_isRecurring = orderProduct_2.product_isRecurring
    )
  )

→ The filter path is part of the join. If the filter condition itself induces joins (due to non-foreign-key relations), things get much more complex.

hakimio commented 1 week ago

I see. Thank you for the detailed explanation.

hakimio commented 3 days ago

@patricebender any idea when the new version will be released?