LionWeb-io / lionweb-repository

Reference implementation of LionWeb repository
Apache License 2.0
2 stars 1 forks source link

Invalid string length when producing answer to /bulk/retrieve #66

Open ftomassetti opened 1 week ago

ftomassetti commented 1 week ago

With a recent version of the repository, on a new machine, I get this error, which I was not getting before:

Exception while serving request for /bulk/retrieve?depthLimit=99: Invalid string length                                                 
RangeError: Invalid string length                                                                                               
    at JSON.stringify (<anonymous>)                                                                                                    
    at stringify (/home/cis/repos/lionweb-repository/node_modules/express/lib/response.js:1150:12)                                     
    at ServerResponse.json (/home/cis/repos/lionweb-repository/node_modules/express/lib/response.js:271:14)                            
    at ServerResponse.send (/home/cis/repos/lionweb-repository/node_modules/express/lib/response.js:162:21)                            
    at lionwebResponse (file:///home/cis/repos/lionweb-repository/packages/common/dist/apiutil/LionwebResponse.js:11:14)               
    at retrieve (file:///home/cis/repos/lionweb-repository/packages/bulkapi/dist/controllers/BulkApi.js:178:13)                        
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)                                                      
    at async file:///home/cis/repos/lionweb-repository/packages/common/dist/apiutil/functions.js:159:13

We may explore stream serialization, for example using JSONStream

joswarmer commented 1 week ago

The invalid string length means that the retrieve string is too long?

dslmeinte commented 1 week ago

We've seen this one before, I think. Let me search for a bit.

dslmeinte commented 1 week ago

There was #49 . I thought we remedied that already by only using JSON.stringify in batches. Why is that not enough? Should the batch size be 1?

ftomassetti commented 1 week ago

There was https://github.com/LionWeb-io/lionweb-repository/issues/49 . I thought we remedied that already by only using JSON.stringify in batches.

In https://github.com/LionWeb-io/lionweb-repository/pull/50 we built the JSON manually on a buffer and not directly on a string. For example:

        const bufferHolder = new BufferHolder()
        bufferHolder.write("[")
        classifierNodes.forEach((element, index)=>{
            if (index != 0) {
                bufferHolder.write(",")
            }
            bufferHolder.write(`{"language"="${element.language}",`)
            bufferHolder.write(`"classifier"="${element.classifier}",`)
            bufferHolder.write(`"ids"=[`)
            this.serializeInBufferIDsList(bufferHolder, element.ids, limit)
            bufferHolder.write(`], "size"=${element.size}}\n`)
        })
        bufferHolder.write("]")
        return bufferHolder.getBuffer()

This was a manual process applied to a couple of requests. Now we are seeing a similar problem in another request. We could solve it in the same way, but it would be invasive and error-prone, while using a library that does what JSON.stringify does but on a stream, could be simpler, more robust, and easily applicable to all responses.

For example, we could change lionwebResponse:

export function lionwebResponse<T extends LionwebResponse>(response: Response, status: number, body: T): void {
    response.status(status)
    // previously: response.send(body) -> this caused some internal part of express to call JSON.stringify(body)
    new JsonStreamStringify(body).pipe(response);
}

This is something I need to test, but hopefully should do what I did manually in #50 , just better

dslmeinte commented 1 week ago

I agree: JSONStream seems to be what we need, rather than assembling responses from partial JSON.stringifys.