zazuko / rdf-validate-shacl

Validate RDF data purely in JavaScript. An implementation of the W3C SHACL specification on top of the RDFJS stack.
MIT License

Memory issue with nested results #99

Open MarinusVonhof opened 1 year ago

MarinusVonhof commented 1 year ago

Validating large datasets can cause memory problems. The nestedResults object of the ValidationEngine class is not cleaned up when validation of a node is successful. Adding a cleanup step in createResultFromObject() solved the problem for me:

// Validation was successful. No result.
if (!validationResultObj) {

  // 20220806/mv: Clean the nested results of the report-children
  if (this.nestedResults[this.recordErrorsLevel + 1]?.length)
    this.nestedResults[this.recordErrorsLevel + 1] = [];

  return null
}

But, thanks for the good work! Really a nice module.
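
To illustrate why the missing cleanup matters, here is a minimal sketch using a toy stand-in for the engine. It is not the library's real code: `ToyEngine`, `recordNested`, `cleanOnSuccess` and `bufferedAfter` are hypothetical names, and only `nestedResults`, `recordErrorsLevel` and `createResultFromObject()` are taken from the snippet above. It assumes that nested results are buffered one level down and only consumed when the parent result is a failure.

```javascript
// Toy model only: assumes nested results are buffered per level and are
// consumed solely when the parent produces a failing result.
class ToyEngine {
  constructor({ cleanOnSuccess }) {
    this.cleanOnSuccess = cleanOnSuccess // hypothetical switch for the fix
    this.recordErrorsLevel = 0
    this.nestedResults = {}
  }

  // Hypothetical helper: buffer a failed nested (child) check one level down.
  recordNested(detail) {
    const level = this.recordErrorsLevel + 1
    if (!this.nestedResults[level]) this.nestedResults[level] = []
    this.nestedResults[level].push(detail)
  }

  // Same shape as the snippet above: a falsy argument means "node is valid".
  createResultFromObject(validationResultObj) {
    const level = this.recordErrorsLevel + 1
    if (!validationResultObj) {
      // The proposed fix: drop buffered children that nobody will read.
      if (this.cleanOnSuccess && this.nestedResults[level]?.length) {
        this.nestedResults[level] = []
      }
      return null
    }
    // On failure the buffered children are attached and the buffer is reset.
    const children = this.nestedResults[level] ?? []
    this.nestedResults[level] = []
    return { ...validationResultObj, children }
  }
}

// Validate `nodeCount` nodes whose nested check fails but whose outer
// check passes, then report how many entries are still buffered.
function bufferedAfter(engine, nodeCount) {
  for (let i = 0; i < nodeCount; i++) {
    engine.recordNested({ node: i })
    engine.createResultFromObject(null)
  }
  return engine.nestedResults[1]?.length ?? 0
}

const leaked = bufferedAfter(new ToyEngine({ cleanOnSuccess: false }), 1000)
const cleaned = bufferedAfter(new ToyEngine({ cleanOnSuccess: true }), 1000)
console.log({ leaked, cleaned })
```

In this toy model the buffer grows by one entry per validated node without the cleanup and stays empty with it. Whether the real engine accumulates entries in exactly this way is an assumption on my part.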

tpluscode commented 1 year ago

Hey. Thank you for reaching out. I would like to hear how big your data and shapes graphs are. Are you able to provide an example?

StichtingRIONED commented 1 year ago

Hello. Yes, OK, I'll send you some example files. They are rather big. For testing purposes that's OK, but I don't think JavaScript is suitable for datasets this large. A full run takes about 6 hours!

MarinusVonhof commented 1 year ago

Hi,

You can find the examples here: example_shacl.zip (https://marivon-my.sharepoint.com/:u:/g/personal/marinus_vonhof_marivon_nl/EZPTxdiZNuNDlgEyHJqQkI4BwyFqTkHFBzl_t5GhNIl7cg?e=9RRraE). Please keep it to yourself; it contains real (but public, so it's legal) data.

Best regards, Marinus

mhusm commented 1 year ago

I get a "Maximum call stack size exceeded" error for relatively small data sets (2.3 MB) and shapes. I'm not sure if this is the same issue. I can't share the data set publicly, though. Tomasz, if you reach out to my corporate email, I could share a bit more.

mhusm commented 1 year ago

I've just realized that my problem is likely caused by this issue: https://github.com/zazuko/rdf-validate-shacl/issues/43

I have some reasoning artefacts that lead to cyclic rdfs:subClassOf relationships.
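
For anyone hitting the same thing, here is a rough sketch of how a class-hierarchy traversal can tolerate such cycles. This is not the library's code: `superClassesOf` and the `hierarchy` map are hypothetical, and the hierarchy is modelled as a plain `Map` rather than an RDFJS dataset. It uses an explicit stack plus a visited set, so a cycle can neither recurse forever nor overflow the call stack.

```javascript
// Hypothetical helper: collect every class reachable from `start` by
// following subClassOf links. `subClassOf` maps a class IRI to the IRIs
// of its direct superclasses.
function superClassesOf(start, subClassOf) {
  const visited = new Set()
  const stack = [start] // explicit stack instead of recursion
  while (stack.length > 0) {
    const current = stack.pop()
    for (const parent of subClassOf.get(current) ?? []) {
      if (!visited.has(parent)) {
        visited.add(parent) // each class is expanded at most once
        stack.push(parent)
      }
    }
  }
  return visited
}

// A cyclic hierarchy like the one a reasoner might produce.
const hierarchy = new Map([
  ['ex:A', ['ex:B']],
  ['ex:B', ['ex:C']],
  ['ex:C', ['ex:A']], // closes the cycle
])

const supers = [...superClassesOf('ex:A', hierarchy)].sort()
console.log(supers)
```

Because of the cycle, the start class ends up in its own result set. A caller that does not want that would need to filter it out.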