cap-js / cds-dbs

Monorepo for SQL Database Services for CAP
https://cap.cloud.sap/docs/
Apache License 2.0
37 stars 12 forks source link

PostgreSQL - Total size of jsonb array elements exceeds the maximum of 268435455 bytes #565

Closed AnsgarLichter closed 7 months ago

AnsgarLichter commented 8 months ago

Description of erroneous behaviour

I am running a PostgreSQL instance locally and want to test some things with the PostgreSQL for which I do need some amount of data in my database.

In the respective entity I do have more than 1 million entries which look like this:

ID,displayId,bookingDate,currencyCode_code,bookingStatus_code,customer_ID,travel_ID,carrier_ID,connection_ID,flightDate,flightPrice
83c3db69-d7d5-485c-9136-7bb994cf4697,1,2023-05-27,BZD,N,734c7cda-5bad-4857-8461-05b9d5e8335a,31164f99-17e0-4b6f-bd74-0ed8fd93df99,a848301c-a1ab-4025-affa-589f80a9998a,43ea91c2-a41b-4eea-9183-9110cddd8606,2024-04-10,4802
8e000a2c-67b8-4d6c-ba26-90f0a10cfc2e,2,2023-05-27,BZD,N,0fda0cc4-df23-47b9-ad20-950460d0824b,31164f99-17e0-4b6f-bd74-0ed8fd93df99,a848301c-a1ab-4025-affa-589f80a9998a,43ea91c2-a41b-4eea-9183-9110cddd8606,2024-04-10,4802
462e72b9-f2db-474b-991c-1f53d6eca9f3,3,2023-10-22,RWF,X,b140fc18-bd51-4b5d-ab8b-2f4a9da0b128,ea31605c-47a7-42df-9d82-bca521cd9eb2,a848301c-a1ab-4025-affa-589f80a9998a,43ea91c2-a41b-4eea-9183-9110cddd8606,2024-04-10,4802

I generate the data with a custom script and save it to a CSV file. When deploying this to SQLite everything works fine. When I try to deploy this to Postgres I get the following error message:

error: in cds.deploy(): total size of jsonb array elements exceeds the maximum of 268435455 bytes
Query {
  UPSERT: {
    into: 'com.thesis.travel.Bookings',
    columns: [
      'ID',                 'displayId',
      'bookingDate',        'currencyCode_code',
      'bookingStatus_code', 'customer_ID',
      'travel_ID',          'carrier_ID',
      'connection_ID',      'flightDate',
      'flightPrice',        'createdAt',
      'createdBy',          'modifiedAt',
      'modifiedBy'
    ],
    rows: [
      [
        '662ebddd-8d38-4964-a3f2-6e64bd07f7cf',
        '1',
        '2023-06-05',
        'XPF',
        'X',
        '98fbb56b-79bb-49b8-ba3c-7466d89990a3',
        'f04197ff-d26b-432f-842a-48b85df73724',
        '7180855f-73b6-4f8b-8c9a-c82251ce59c3',
        '6cf6651c-df1c-4468-ac59-39b63af5aa61',
        '2024-04-10',
        '4674',
        '2024-04-03T14:21:02.155Z',
        'anonymous',
        '2024-04-03T14:21:02.155Z',
        'anonymous'
      ],
      ...
      ... 1044373 more items
    ]
  }
}
    at /Users/I573017/Documents/Development/Work/Repositories/postgresql-vs-hana/node_modules/pg/lib/client.js:526:17
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Object.run (/Users/I573017/Documents/Development/Work/Repositories/postgresql-vs-hana/node_modules/@cap-js/postgres/lib/PostgresService.js:152:24)
    at async Promise.all (index 0)
    at async PostgresService.onUPSERT (/Users/I573017/Documents/Development/Work/Repositories/postgresql-vs-hana/node_modules/@cap-js/db-service/lib/SQLService.js:171:31)
    at async next (/Users/I573017/Documents/Development/Work/Repositories/postgresql-vs-hana/node_modules/@sap/cds/lib/srv/srv-dispatch.js:68:17)
    at async next (/Users/I573017/Documents/Development/Work/Repositories/postgresql-vs-hana/node_modules/@sap/cds/lib/srv/srv-dispatch.js:68:17)
    at async next (/Users/I573017/Documents/Development/Work/Repositories/postgresql-vs-hana/node_modules/@sap/cds/lib/srv/srv-dispatch.js:68:17)
    at async PostgresService.handle (/Users/I573017/Documents/Development/Work/Repositories/postgresql-vs-hana/node_modules/@sap/cds/lib/srv/srv-dispatch.js:66:10)
    at async /Users/I573017/Documents/Development/Work/Repositories/postgresql-vs-hana/node_modules/@sap/cds/lib/dbs/cds-deploy.js:165:22 {
  length: 140,
  severity: 'ERROR',
  code: '54000',
  detail: undefined,
  hint: undefined,
  position: undefined,
  internalPosition: undefined,
  internalQuery: undefined,
  where: undefined,
  schema: undefined,
  table: undefined,
  column: undefined,
  dataType: undefined,
  constraint: undefined,
  file: 'jsonb_util.c',
  line: '1671',
  routine: 'convertJsonbArray'
}

As I found out through google this is a hard limit of the PostgreSQL. Is there a workaround how to deploy such huge files or is it possible to split the queries during deployment if the jsonb array gets too large?

Detailed steps to reproduce

  1. git clone github tools I573017/postgresql-vs-hana
  2. npm i
  3. npm run generate-data
  4. Start postgres through docker, file is in docker/pg.yml
  5. cds deploy --profile development-postgres

Details about your project

Remove the lines not applicable, and fill in versions for remaining ones:

postgresql-vs-hana github tools I573017/postgresql-vs-hana
@cap-js/cds-typer 0.19.0
@cap-js/cds-types 0.2.0
@cap-js/postgres 1.6.0
@cap-js/sqlite 1.6.0
@sap/cds 7.8.0
@sap/cds-compiler 4.8.0
@sap/cds-dk (global) 7.8.0
@sap/cds-fiori 1.2.3
@sap/cds-foss 5.0.0
@sap/cds-mtxs 1.17.0
@sap/eslint-plugin-cds 2.6.7
patricebender commented 7 months ago

@BobdenOs are we aware of this limitation? Is there a workaround available?

BobdenOs commented 7 months ago

@AnsgarLichter the purpose of the .csv files is to fill in tables with static content. Usually things like countries or currencies. Most of the time these dataset don't go into the millions of rows. So I don't think this is a realistic issue.

I do think that it is important to consider the jsonb limitation in general. Therefor I have made a PR that switches back to the json type. Which I have tested does not come with this limitation, but does seem to slow down when going over the size limit of jsonb. When running the test locally the Postgres docker instance was not enjoying the amount of data. With messages like:

checkpoints are occurring too frequently (25 seconds apart)

If you could please give the change a try to verify that it also works with your project.

BobdenOs commented 7 months ago

Additional errors encountered while testing and using jsonb:

  1. error: invalid memory alloc request size 1073741824
  2. out of memory
AnsgarLichter commented 7 months ago

@BobdenOs is this change included in v1.7.0 of the PostgreSQL adapter? In the release notes this change was not mentioned. If yes, I can test it.

BobdenOs commented 7 months ago

@AnsgarLichter it is not yet released. It is currently planned to be included with 1.7.1.: https://github.com/cap-js/cds-dbs/pull/587