Error 500 on heavy load.

SimonShapiro commented 5 years ago

I have experienced a strange behaviour over the REST interface using node-fetch module.

While this is not the most efficient way of doing things, it has thrown up the following error 500.

I read in around 11 ttl files so that total about 1,6m triples. I now want to pivot them and create a file per subject. I process each of the 1,6m triples as set out below. Check whether the resource exists. If it does not, create it using PUT, else PATCH in the new triple.

For logging I write out a 'pulse' every 5,000 records.

The code below is an extract of the process.

   let getResult = await fetch(newUrl, {method: "HEAD"})
//                console.log(getResult.status)
                if (getResult.status === 404) {
                    options = {
                        method: "PUT",
                        headers: {
                            "content-type" : "text/turtle",
                            "Link": "<http://www.w3.org/ns/ldp#Resource>; rel='type'"
                        },
                        body: `${newSubject} <${newPredicate}> ${newObject} .`
                   }
                }
                else {
                    options = {
                        method: "PATCH",
                        headers: {
                            "content-type" : "application/sparql-update",
                            "Link": "<http://www.w3.org/ns/ldp#Resource>; rel='type'"
                        },
                        body: `INSERT DATA { ${newSubject} <${newPredicate}> ${newObject} }`
                    }
                };
                let res1 = await fetch(newUrl, options);
                if ((res1.status != 200) && (res1.status != 201)) console.log("PROBLEMS WITH THE SOLID-SERVER", res1.status, JSON.stringify(options));
                if ( i % 5000 === 0) console.log("output", i, newUrl, res1.status);

Just after the 80,000th log entry and before the 85,000th log entry, the server starts producing error 500 and never recovers. As can be seen in the output below:

...
output 75000 https://127.0.0.1:8443/eim/App_Inst/714101 200
output 80000 https://127.0.0.1:8443/eim/App_Inst/876505 200
PROBLEMS WITH THE SOLID-SERVER 500 {"method":"PUT","headers":{"content-type":"text/turtle","Link":"<http://www.w3.org/ns/ldp#Resource>; rel='type'"},"body":"<> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://127.0.0.1:8443/eim/App_Instance_Review> ."}
PROBLEMS WITH THE SOLID-SERVER 500 {"method":"PUT","headers":{"content-type":"text/turtle","Link":"<http://www.w3.org/ns/ldp#Resource>; rel='type'"},"body":"<> <https://127.0.0.1:8443/eim/App_Inst_ID> \"\" ."}
...

SimonShapiro commented 5 years ago

I was reading the readme file and I noticed that there is a size limit "Its current function is to set a quota for disk usage of just 25 MB, which is what we can be sure the current prototype can tolerate under load."

I know that the full data set is around 100MB, could this be the reason for the sudden error 500 after running ok for some time?

RubenVerborgh commented 5 years ago

which is what we can be sure the current prototype can tolerate under load

@kjetilk @megoth That claim does not have any grounding. 25MB has to do with disk space on our public servers, not with the server software. Can we fix that?

kjetilk commented 5 years ago

@kjetilk @megoth That claim does not have any grounding. 25MB has to do with disk space on our public servers, not with the server software. Can we fix that?

Well, it is more a heuristic, but I agree that "be sure" is a bit too strongly worded, so I changed the wording.

So, @SimonShapiro , it wouldn't be the reason, it is more that we don't know much about the scalability properties of the server, other than that it doesn't scale well. We have at least one more bug that relates to scalability, #849 , you could check if this is related.

It is still useful to get this kind of reports, to understand better where the limits are, but unless it turns out to be low hanging fruit, we are more likely to focus on rearchitecting the server to better scale.

SimonShapiro commented 5 years ago

If you want me to trap any specific log information, let me know how to do so and I will re-run it.

I prepare all 1.6m triples using the python Rdflib then I convert them to json-ld as part of the pipeline where they become around 150,000 json-ld objects on disk. Currently the json is served with any old httpbserver(msoft iis in production). I was looking to solid to move away from the json-ld strategy.

On Tue, 18 Dec 2018 at 21:49, Kjetil Kjernsmo notifications@github.com wrote:

@kjetilk https://github.com/kjetilk @megoth https://github.com/megoth That claim does not have any grounding. 25MB has to do with disk space on our public servers, not with the server software. Can we fix that?

Well, it is more a heuristic, but I agree that "be sure" is a bit too strongly worded, so I changed the wording.

So, @SimonShapiro https://github.com/SimonShapiro , it wouldn't be the reason, it is more that we don't know much about the scalability properties of the server, other than that it doesn't scale well. We have at least one more bug that relates to scalability, #849 https://github.com/solid/node-solid-server/issues/849 , you could check if this is related.

It is still useful to get this kind of reports, to understand better where the limits are, but unless it turns out to be low hanging fruit, we are more likely to focus on rearchitecting the server to better scale.

— You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub https://github.com/solid/node-solid-server/issues/1008#issuecomment-448383250, or mute the thread https://github.com/notifications/unsubscribe-auth/ABl65_hlcST0yVoeQrHF3ICs3HdI_B9Sks5u6WLYgaJpZM4ZBZmZ .

nodeSolidServer / node-solid-server

Error 500 on heavy load. #1008