Open ataft opened 1 year ago
I seem to be encountering a similar limit of around 1,000,000 rows, with data that has 6 columns. But instead of limiting the rows added, I get an error instead. Below is the error message found in the EPS logs:
[Error: Promise was abandoned]
The issue can be reproduced using one of the WDC templates. This runs out of memory and is roughly 3800000 rows x 3 cols:
import { Fetcher, FetchOptions, log } from '@tableau/taco-toolkit/handlers'
export default class MyFetcher extends Fetcher {
async *fetch({ handlerInput }: FetchOptions) {
let i = 0
while (i < 10000000) {
const batch = []
for (let j = 0; j < 10000; j++) {
batch.push({
id: `id_${i}`,
mag: Math.random() * 10,
title: `title_${i}`,
})
i++
}
yield batch
log(i)
}
}
}
import {
DataContainer,
DataType,
log,
ParseOptions,
Parser,
} from '@tableau/taco-toolkit/handlers'
export default class MyParser extends Parser {
parse(fetcherResult: any, { dataContainer }: ParseOptions): DataContainer {
const tableName = 'My Sample Data'
log(`parsing started for '${tableName}'`)
const containerBuilder = Parser.createContainerBuilder(dataContainer)
const { isNew, tableBuilder } = containerBuilder.getTable(tableName)
if (isNew) {
tableBuilder.setId('mySampleData')
tableBuilder.addColumnHeaders([
{
id: 'id',
dataType: DataType.String,
},
{
id: 'mag',
alias: 'magnitude',
dataType: DataType.Float,
},
{
id: 'title',
alias: 'title',
dataType: DataType.String,
},
])
}
tableBuilder.addRows(fetcherResult)
return containerBuilder.getDataContainer()
}
}
However, this results in a different error message:
Error: Isolate was disposed during execution due to memory limit
To add to this, attempting a multi-table approach tables seem to share the overall memory limit. Where as one table will run into this issue with millions of rows, increasing the number of tables can lead to seeing this issue after a few hundred thousand. WDC 3.0 is very limiting if we are limited to tables of 100,000 rows.
I've created a new WDC 3.0 and am calling
tableBuilder.addRows(rows)
with 534,000 rows, but it only adds 100,000. I've also tried chunking the rows and calling addRows in 1,000 chunks, but still the max is 100,000.