the-fab-cube / flesh-and-blood-cards

Open source JSON/CSV representations of the cards for the Flesh and Blood TCG

Speed up validate-json.sh script #417

Open luceleaftea opened 3 months ago

luceleaftea commented 3 months ago

The validate-json.sh script is slow, and can likely be sped up by parallelizing the various subscripts it runs.

RRunner1337 commented 1 month ago

There is a simple change that will probably cut the processing time roughly in half.

The pajv library can process multiple files in one call, as long as they all conform to a single schema file. This does not parallelize anything; it reduces the number of calls to pajv.

The library accepts glob patterns in the -d argument of the call (quote the pattern so that pajv, rather than the shell, expands it). Hence changing helper-scripts/json-validation/validate-json.sh to:

#!/bin/bash

[ -x ./node_modules/pajv/index.js ] || npm i

# Validate every file matching the glob in $2 against the schema in $1.
# The glob must be passed quoted so that pajv, not the shell, expands it;
# unquoted, only the first matching file would end up in $2.
function validate_json {
    ./node_modules/pajv/index.js validate -s "$1" -d "$2" || exit $?
}

SECONDS=0

validate_json ../../json-schema/ability-schema.json "../../json/*/ability.json"
validate_json ../../json-schema/art-variation-schema.json "../../json/*/art-variation.json"
validate_json ../../json-schema/artist-schema.json "../../json/*/artist.json"
validate_json ../../json-schema/card-schema.json "../../json/*/card.json"
validate_json ../../json-schema/card-flattened-schema.json "../../json/*/card-flattened.json"
validate_json ../../json-schema/card-face-association-schema.json "../../json/*/card-face-association.json"
validate_json ../../json-schema/card-reference-schema.json "../../json/*/card-reference.json"
validate_json ../../json-schema/edition-schema.json "../../json/*/edition.json"
validate_json ../../json-schema/foiling-schema.json "../../json/*/foiling.json"
validate_json ../../json-schema/icon-schema.json "../../json/*/icon.json"
validate_json ../../json-schema/keyword-schema.json "../../json/*/keyword.json"
validate_json ../../json-schema/legality-schema.json "../../json/*/@(banned|living-legend|suspended|restricted)-*.json"
#validate_json ../../json-schema/legality-schema.json ../../json/english/banned-cc.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/banned-commoner.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/banned-upf.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/living-legend-blitz.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/living-legend-cc.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/suspended-blitz.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/suspended-cc.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/suspended-commoner.json
#validate_json ../../json-schema/legality-schema.json ../../json/english/restricted-ll.json
validate_json ../../json-schema/rarity-schema.json "../../json/*/rarity.json"
validate_json ../../json-schema/set-schema.json "../../json/*/set.json"
validate_json ../../json-schema/type-schema.json "../../json/*/type.json"

#validate_json ../../json-schema/ability-schema.json ../../json/french/ability.json
#validate_json ../../json-schema/artist-schema.json ../../json/french/artist.json
#validate_json ../../json-schema/card-schema.json ../../json/french/card.json
#validate_json ../../json-schema/card-flattened-schema.json ../../json/french/card-flattened.json
#validate_json ../../json-schema/keyword-schema.json ../../json/french/keyword.json
#validate_json ../../json-schema/set-schema.json ../../json/french/set.json
#validate_json ../../json-schema/type-schema.json ../../json/french/type.json

#validate_json ../../json-schema/ability-schema.json ../../json/german/ability.json
#validate_json ../../json-schema/artist-schema.json ../../json/german/artist.json
#validate_json ../../json-schema/card-schema.json ../../json/german/card.json
#validate_json ../../json-schema/card-flattened-schema.json ../../json/german/card-flattened.json
#validate_json ../../json-schema/keyword-schema.json ../../json/german/keyword.json
#validate_json ../../json-schema/set-schema.json ../../json/german/set.json
#validate_json ../../json-schema/type-schema.json ../../json/german/type.json

#validate_json ../../json-schema/ability-schema.json ../../json/italian/ability.json
#validate_json ../../json-schema/artist-schema.json ../../json/italian/artist.json
#validate_json ../../json-schema/card-schema.json ../../json/italian/card.json
#validate_json ../../json-schema/card-flattened-schema.json ../../json/italian/card-flattened.json
#validate_json ../../json-schema/keyword-schema.json ../../json/italian/keyword.json
#validate_json ../../json-schema/set-schema.json ../../json/italian/set.json
#validate_json ../../json-schema/type-schema.json ../../json/italian/type.json

#validate_json ../../json-schema/ability-schema.json ../../json/spanish/ability.json
#validate_json ../../json-schema/artist-schema.json ../../json/spanish/artist.json
#validate_json ../../json-schema/card-schema.json ../../json/spanish/card.json
#validate_json ../../json-schema/card-flattened-schema.json ../../json/spanish/card-flattened.json
#validate_json ../../json-schema/keyword-schema.json ../../json/spanish/keyword.json
#validate_json ../../json-schema/set-schema.json ../../json/spanish/set.json
#validate_json ../../json-schema/type-schema.json ../../json/spanish/type.json

echo "JSON validation took: $SECONDS seconds"
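
One detail worth checking when passing globs through a two-argument helper like validate_json: whether the pattern is quoted at the call site changes what the function receives. The sketch below is only a demonstration, not part of the repository; show_args and the temporary directory layout are made up for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for validate_json: reports how many arguments it
# received and what ended up in $2.
show_args() {
    echo "count=$# second=$2"
}

# Throwaway directory tree that mimics json/<language>/card.json.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/english" "$demo_dir/french"
touch "$demo_dir/english/card.json" "$demo_dir/french/card.json"

# Unquoted glob: the shell expands it before the call, so the function gets
# one argument per matching file and $2 holds only the first match.
unquoted="$(show_args schema.json "$demo_dir"/*/card.json)"

# Quoted glob: the pattern arrives as a single literal argument, which a tool
# that does its own glob expansion can then expand itself.
quoted="$(show_args schema.json "$demo_dir/*/card.json")"

echo "$unquoted"
echo "$quoted"

rm -rf "$demo_dir"
```

In the unquoted case a helper that forwards only $2 would silently skip every file after the first match, which is why the patterns are best passed in quotes.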

The batched script above should help considerably. Further optimization is possible with a dedicated Node.js Docker image for running the JSON validation, for example built like this:

FROM node:16.4.2-alpine as base

WORKDIR /usr/local/app

COPY package.json /usr/local/app/package.json
COPY package-lock.json /usr/local/app/package-lock.json

RUN npm i

ENTRYPOINT [ "./node_modules/pajv/index.js" ]

which should reside in helper-scripts/json-validation/Dockerfile and be built using:

docker build --no-cache -f Dockerfile -t <docker-image-name> .

from the same directory.

The image would then be used instead of installing the development tools locally, and the validation would be run from the command line (maybe even fixing issue #407 in the process):

docker run --rm -i -v "$(pwd):/data" <docker-image-name-from-build-step> validate -s <path-to-schema-file-in-docker> -d "<glob-to-json-files-in-docker>"

The validate-json.sh script would need to be adapted so that validation runs against the Docker volume (different paths, ...).
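
As a rough idea of what that adaptation could look like, here is a minimal sketch of a Docker-backed variant of validate_json. Everything named here is an assumption for illustration: the image tag fab-json-validation, the function name validate_json_docker, mounting the repository root at /data, and the DOCKER variable, which exists only so the command can be swapped out for a dry run.

```shell
#!/bin/bash
# Assumptions (not from the repository): image tag, mount point /data, and
# that the script is started from helper-scripts/json-validation, so the
# repository root is two directories up.
IMAGE="${IMAGE:-fab-json-validation}"
DOCKER="${DOCKER:-docker}"
REPO_ROOT="${REPO_ROOT:-$(cd ../.. && pwd)}"

# $1: schema file name inside json-schema/
# $2: data file name (or glob) looked up in every json/<language>/ directory
validate_json_docker() {
    # The glob stays quoted so it is expanded inside the container by the
    # validator, against the mounted /data paths, not by the host shell.
    "$DOCKER" run --rm -i -v "$REPO_ROOT:/data" "$IMAGE" \
        validate -s "/data/json-schema/$1" -d "/data/json/*/$2" || exit $?
}
```

A call would then look like validate_json_docker card-schema.json card.json. Setting DOCKER=echo prints the command that would be run without starting a container.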

In my local environment, validation time dropped from 59 seconds to 19 seconds (roughly a two-thirds reduction) when the script used a local Docker image built from the Dockerfile above, so I would assume that a reduction of approximately 60% is a realistic, slightly conservative expectation.
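
If the parallelization suggested in the original issue is still wanted on top of batching, one possible shape is to start each schema's validation as a background job and wait for all of them. This is an untested sketch rather than a measured improvement; validate_parallel and the VALIDATOR variable are hypothetical names, and output from concurrent runs may interleave.

```shell
#!/bin/bash
# Assumption: VALIDATOR points at the pajv entry point used by the script.
VALIDATOR="${VALIDATOR:-./node_modules/pajv/index.js}"

# Hypothetical helper: takes schema/glob pairs, starts one validator process
# per pair in the background, then waits for all of them.
# Returns 1 if any validation failed, 0 otherwise.
validate_parallel() {
    local pids=() failed=0 pid
    while [ "$#" -ge 2 ]; do
        "$VALIDATOR" validate -s "$1" -d "$2" &
        pids+=("$!")
        shift 2
    done
    for pid in "${pids[@]}"; do
        # wait <pid> returns that job's exit status.
        wait "$pid" || failed=1
    done
    return "$failed"
}
```

It would be called with alternating schema and quoted glob arguments, for example validate_parallel ../../json-schema/card-schema.json "../../json/*/card.json" ../../json-schema/set-schema.json "../../json/*/set.json". Unlike the sequential script, this runs every validation to completion before reporting failure instead of exiting at the first error.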