kysely-org / kysely

A type-safe TypeScript SQL query builder
https://kysely.dev
MIT License
9.81k stars 249 forks

Memory issue when inserting lots of records to postgres #1026

Closed: 4integration closed 1 month ago

4integration commented 1 month ago

Hi,

I am new to Kysely and am trying to insert approximately 1,000,000 records originating from a CSV file. The file itself is processed fine, but once I add the DB insert I run out of memory.

How can this be done in a better way to reduce memory usage?

Parsing

async function doStandardTexts() {
    const fs = require('fs');
    const BomStrip = require('bomstrip');

    const csvFile = fs.createReadStream('standard_text_utf8.txt').pipe(new BomStrip());
    let count = 0;
    Papa.parse(csvFile, {
        header: true,
        delimiter: "\t",
        dynamicTyping: true,
        quoteChar: '',
        escapeChar: '',
        step: function (row: any) {
            const standardTextItem = toStandardText(row.data);
            count += 1;
            insertStandardText(standardTextItem); // <<<<<<<<< this fails with out of memory
        },
        complete: function () {
            console.log("All done!: " + count);
        }
    });
}

insertStandardText implementation:

export async function insertStandardText(element: any) {
    return await db.insertInto("standard_text").values(element).executeTakeFirst();
}
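Note that Kysely's `.values()` also accepts an array of row objects, so rows can be inserted in multi-row batches instead of one query per row. A minimal sketch (assuming the `db` instance and `standard_text` table from this issue; `chunk` is a hypothetical helper, not part of Kysely):

```typescript
// Hypothetical helper: split an array into fixed-size batches.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Assumes `db` is the Kysely instance configured in this issue.
declare const db: any;

// Insert rows 1000 at a time instead of issuing one query per row,
// and await each batch so inserts do not pile up in memory.
async function insertStandardTexts(rows: any[]) {
  for (const batch of chunk(rows, 1000)) {
    await db.insertInto("standard_text").values(batch).execute();
  }
}
```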

Database config:

const dialect = new PostgresDialect({
  pool: new Pool({
    database: 'jato',
    host: 'localhost',
    user: 'postgres',
    password: 'admin',
    port: 5432,
    max: 10,
  })
})

export const db = new Kysely<Database>({
  dialect,
})
koskimas commented 1 month ago

🤦 Don't insert a million records at a time.
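Concretely: `insertStandardText()` returns a promise that is never awaited, so the `step` callback queues up roughly a million in-flight inserts. One possible fix (a sketch, assuming the `Papa`, `db`, and `toStandardText` from the issue above; `BatchBuffer` is a hypothetical helper) is to buffer rows and use Papa Parse's `parser.pause()` / `parser.resume()` to stop reading while each batch is inserted:

```typescript
// Pure batching helper: collects rows and hands back a full batch.
class BatchBuffer<T> {
  private rows: T[] = [];
  constructor(private size: number) {}

  // Returns a full batch to flush, or null while still filling.
  push(row: T): T[] | null {
    this.rows.push(row);
    if (this.rows.length < this.size) return null;
    const batch = this.rows;
    this.rows = [];
    return batch;
  }

  // Returns whatever is left at the end of the stream.
  drain(): T[] {
    const rest = this.rows;
    this.rows = [];
    return rest;
  }
}

// Wiring sketch; assumes Papa, db, and toStandardText from the issue.
declare const Papa: any, db: any, toStandardText: (r: any) => any;

function parseWithBackpressure(csvFile: any) {
  const buffer = new BatchBuffer<any>(1000);
  Papa.parse(csvFile, {
    header: true,
    delimiter: "\t",
    step: (row: any, parser: any) => {
      const batch = buffer.push(toStandardText(row.data));
      if (batch) {
        parser.pause(); // stop reading until the insert completes
        db.insertInto("standard_text").values(batch).execute()
          .then(() => parser.resume());
      }
    },
    complete: () => {
      const rest = buffer.drain();
      if (rest.length > 0) {
        db.insertInto("standard_text").values(rest).execute();
      }
    },
  });
}
```

This keeps at most one batch in memory at a time: the parser is paused while a batch is in flight, so the stream's backpressure does the rest.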