LionWeb-io / lionweb-repository

Reference implementation of LionWeb repository
Apache License 2.0
2 stars 1 forks source link

Prevent range error issue #50

Closed ftomassetti closed 2 months ago

ftomassetti commented 2 months ago

Fix #49

We have calls where we return a map from either language or concept to node IDs. The problem is that when we have a lot of Node IDs (hundreds of thousands of Node IDs) we cannot serialize them to JSON. Using JSON.stringify fails, but also building the JSON string directly fails. So these calls now return all the Node IDs if they are below a certain threshold, otherwise they just return their number. We could later introduce a call using pagination to use to ask for all the node ids having a specific concept or language (and not the entire map for all concepts or all languages)

dslmeinte commented 2 months ago

@ftomassetti How does “building the JSON string directly fail”?

ftomassetti commented 2 months ago

@ftomassetti How does “building the JSON string directly fail”?

That also gives a Range Error. I suppose the string is too big. We have several lists of IDs that can have 200K...900K elements, and the IDs themselves can be strings of 30-60 characters.

dslmeinte commented 2 months ago

The RangeError seems to be a result of a (VM-determined) maximum size of strings. Using a Buffer shouldn't have that problem.

ftomassetti commented 2 months ago

The RangeError seems to be a result of a (VM-determined) maximum size of strings. Using a Buffer shouldn't have that problem.

I am not very familiar with TS and I could not find an equivalent to StringBuffer. However another problem is that we are calling a function from express to return the response to the client. Before we were passing the object and let express serialize that. We looked into producing the string and pass it to express, but we cannot build the string. We could use the buffer, but then express should accept that buffer and use it to create the response. Is that express can do? If so, I can update the PR

dslmeinte commented 2 months ago

Express accepts Buffers just fine, luckily — see http://expressjs.com/en/api.html#res.send

ftomassetti commented 2 months ago

I have changed the PR to use a Buffer. I also added the possibility to specify a limit to the number of Node IDs to be returned per language/classifier, so that the user can ensure they get back something "manageable" for their use case

ftomassetti commented 2 months ago

I tried this approach and it seems to be able to send back an answer of over 500 MB. So I would say it is pretty scalable :D

ftomassetti commented 2 months ago

I got one approval, so I will go ahead and merge this. Please let me know if in the future I should wait for the maintainer to do the merging or if I need to follow another policy