elixir-europe / biovalidator

JSON validator derived from AJV supporting ontology and taxonomy validation.
Apache License 2.0
20 stars 6 forks

[BUG]: Big CV lists: ``RangeError: Maximum call stack size exceeded`` #56

Open M-casado opened 1 year ago

M-casado commented 1 year ago

Bug summary

When a schema contains a very large Controlled Vocabulary (CV) list, the number of items in the enum keyword causes schema compilation to exceed the maximum call stack size.

Technical details

To reproduce

  1. Clone and install the project
    git clone https://github.com/elixir-europe/biovalidator.git
    cd biovalidator
    npm install
  2. Start a local Biovalidator server:
    node src/biovalidator
  3. Request validation with a set of schemas in which at least one is very large. In my case it was EGA.cv.instrument_platforms_array.json, with 12,000+ items in enum.
    time curl --data @$file -H "Content-Type: application/json" -X POST http://localhost:3020/validate
  4. Observe how long the initial request takes, and that validation then aborts with the following message:
    {"error":"Failed to compile schema: RangeError: Maximum call stack size exceeded"}
    real    0m15.998s
    user    0m0.005s
    sys     0m0.001s
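
For anyone without access to EGA.cv.instrument_platforms_array.json, a schema of similar size could be generated for step 3. The following is a minimal JavaScript sketch. The schema layout (an array of strings constrained by enum), the placeholder term names, and the function name buildLargeEnumSchema are assumptions made for illustration and do not reflect the real file. Whether such a schema triggers the same error has not been verified here.

```javascript
// Build a JSON Schema whose enum holds many placeholder terms.
// ASSUMPTION: the term format ("TERM_000001") and the schema layout are
// invented for illustration; the real CV file is not shown in this issue.
function buildLargeEnumSchema(itemCount = 12000) {
  const terms = Array.from({ length: itemCount }, (_, i) =>
    `TERM_${String(i + 1).padStart(6, "0")}`
  );
  return {
    $schema: "http://json-schema.org/draft-07/schema#",
    type: "array",
    items: { type: "string", enum: terms },
  };
}

const schema = buildLargeEnumSchema();
console.log(`enum size: ${schema.items.enum.length}`);
console.log(`serialized size: ${JSON.stringify(schema).length} characters`);
```

Passing a larger or smaller itemCount may help to find the size at which compilation starts to fail.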

Observed behaviour

Schema compilation fails, so the document is never validated.

Expected behaviour

The schema compiles and the JSON document $file is validated against it.

Additional context

The terminal running the server shows the following logs:

2022-12-06T13:37:31.464Z [info] Compiling new schema, $schemaId: undefined
2022-12-06T13:37:46.912Z [error] Failed to compile schema: RangeError: Maximum call stack size exceeded
2022-12-06T13:37:46.913Z [error] An error occurred while running the validation: {"error":"Failed to compile schema: RangeError: Maximum call stack size exceeded"}
2022-12-06T13:37:46.914Z [error] New validation request: Server failed to process data: {"error":"Failed to compile schema: RangeError: Maximum call stack size exceeded"}
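
The timestamps in these logs give a rough measure of how long compilation ran before the error. A small sketch of that arithmetic follows; the variable names are illustrative, and the two timestamps are copied from the "Compiling new schema" and first "Failed to compile schema" lines.

```javascript
// Timestamps copied from the server log lines above.
const compileStarted = new Date("2022-12-06T13:37:31.464Z");
const compileFailed = new Date("2022-12-06T13:37:46.912Z");

// Difference in milliseconds, converted to seconds for readability.
const elapsedMs = compileFailed.getTime() - compileStarted.getTime();
const elapsedSeconds = elapsedMs / 1000;

console.log(`Compilation ran for ${elapsedSeconds.toFixed(3)} s before failing`);
```

This gives 15.448 s, which accounts for most of the 15.998 s that curl reported for the whole request.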

M-casado commented 1 year ago

We ended up working around this issue by removing the long CV lists from our schemas. For now it's no longer a blocker on our end.
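
The comment above does not say how the CV terms are checked once they are out of the schema. One possible approach, shown here purely as a hypothetical sketch, is to check values against the vocabulary in a separate step with a Set lookup. The helper name findUnknownTerms and the sample vocabulary and data are invented for illustration and are not part of Biovalidator.

```javascript
// HYPOTHETICAL helper, not part of Biovalidator: returns the values that
// do not appear in the controlled vocabulary.
function findUnknownTerms(values, vocabulary) {
  const allowed = new Set(vocabulary); // constant-time membership checks
  return values.filter((value) => !allowed.has(value));
}

// Placeholder vocabulary and data, invented for illustration only.
const vocabulary = ["platform_a", "platform_b", "platform_c"];
const submitted = ["platform_a", "platform_x"];

const unknown = findUnknownTerms(submitted, vocabulary);
console.log(
  unknown.length === 0
    ? "all terms are valid"
    : `unknown terms: ${unknown.join(", ")}`
);
```

A check like this keeps the schema small, at the cost of moving part of the validation outside JSON Schema.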