vlucas / sheetquery

Query Builder/ORM for Google Sheets
80 stars 12 forks source link

More performant insertRows() #16

Open uberl opened 8 months ago

uberl commented 8 months ago

Hi!

Inserting rows is really slow (3 Minutes for 640 entries :-/ ). Do you see any way to increase performance of insertRows? I really like the simplicity of sheetquery thanks for the project :heart: This is my first post on github, please be gentle. :innocent:

I am also open to implementing the solution myself, if there is one.

This is my apps script code:

function insertRows(rows)
{
  Logger.log(`Inserting ${rows.length} rows.`)

  console.time('insertRows');

  SheetQuery
    .sheetQuery()
    .from("Einsätze")
    .insertRows(rows)

  console.timeEnd('insertRows');

}

These are the logs:

20.03.2024, 13:29:16 Info Inserting 641 rows.
20.03.2024, 13:32:12 Fehlerbehebung insertRows: 175975ms

I am not sure if anything can be improved in sheetquery. The relevant source code is simply a call to appendRow

        sheet.appendRow(rowValues);

Here is the full method from index.ts lines 203+

/**
   * Insert new rows into the spreadsheet
   * Arrays of objects like { Heading: Value }
   *
   * @param {DictObject[]} newRows - Array of row objects to insert
   * @return {SheetQueryBuilder}
   */
  insertRows(newRows: DictObject[]): SheetQueryBuilder {
    const sheet = this.getSheet();
    const headings = this.getHeadings();

    newRows.forEach((row) => {
      if (!row) {
        return;
      }

      const rowValues = headings.map((heading) => {
        const val = row[heading];
        return val === undefined || val === null || val === false ? '' : val;
      });

      // appendRow() will throw if array is empty, so we check to prevent that
      if (rowValues && rowValues.length !== 0) {
        sheet.appendRow(rowValues);
      }
    });

    return this;
  }

    return this;
  }
vlucas commented 3 months ago

Yes, it is pretty slow this way (calling appendRow in a loop). There are some faster ways to achieve this, but they don't have the same atomic guarantees, so I have been hesitant to use them.

There are some methods here that you can experiment with if you like: https://stackoverflow.com/questions/44695360/appending-multiple-rows-to-spreadsheet-google-apps-script

If any of those work for you, I'd love to have a contribution back with the working code! :)

uberl commented 2 months ago

I have finished the simple implementation in gs. :-)

function insertRows(newRows) {

  const sheet = this.getSheet()
  const headings = this.getHeadings() 

  const nonNullRows = newRows === undefined || newRows === null ? [] : newRows.filter(row => !!row)

  const rowValues = nonNullRows
    .map(
      row =>
        headings.map((heading) => {
          const val = row[heading];
          return val === undefined || val === null || val === false ? '' : val;
        })
    )

  numberOfNewRows = rowValues.length

  if (numberOfNewRows === 0) {
    return this;
  }

  lastRowIndex = sheet.getLastRow()
  numberOfColumns = headings.length

  //speed up inserts for many rows
  if (numberOfNewRows > 3) {
    const lock = LockService.getScriptLock()
    if (lock.tryLock(500)) {
      sheet.insertRowsAfter(lastRowIndex, nonNullRows.length)
      sheet.getRange(lastRowIndex + 1, 1, numberOfNewRows, numberOfColumns).setValues(rowValues)

      SpreadsheetApp.flush()
      lock.releaseLock()
      return this;
    }
  }
  //if no lock could be acquired in time use the slow insert as a fallback
  rowValues.forEach((rowValue) => sheet.appendRow(rowValue))
  return this;
}

Problems for ts-implementation:

  1. gasmask does not mock the method "getLastRow()"
  2. The type of LockService is not present in the ts-repository.

So at the moment a ts implementation cannot be compiled..

There is a package @types/google-apps-script which probably wasn't available when you created sheetquery providing types for google apps script.

I suggest the following course of action:

I extend the mock ad hoc in sheetquery.test.ts and add @types/google-apps-script as a dev-dependency. Then I will create a PR for you. Would that be ok for you? Do you have a better idea?

New test-scenarios:

  1. insertRows should not error if no lock can be acquired
  2. insertRows inserts many rows correctly

I have thought about the official google types and their usage in gasmasks or sheetquery. I think it would be neat if sheetquery used them. But then gasmask would need to implement all the methods? What are your thoughts?