elvis-backend-node

Search and visualize public procurements for EU countries http://tenders.exposed/

MIT License

Previously, tenders.exposed was powered by tenders-exposed/elvis-backend, but we decided to rewrite it completely because:

We chose Node.js because:

Using the API

Check out the API documentation made with Swagger:

https://api.tenders.exposed/docs

Contributing

  1. Download OrientDB 2.2.30:

    docker run --name orientdb -p 2424:2424 -p 2480:2480 -e ORIENTDB_ROOT_PASSWORD={yourRootPass} orientdb:2.2.30

  2. Create databases:

    docker exec -it orientdb /orientdb/bin/console.sh

    In the ODB console:

    CREATE DATABASE remote:localhost/{yourDBName} root {yourRootPass} plocal graph

    And a test db, preferably in memory:

    CREATE DATABASE remote:localhost/{yourTestingDBName} root {yourRootPass} memory graph

  3. Clone this repo:

    git clone https://github.com/tenders-exposed/elvis-backend-node.git

  4. Configure environment variables:

    cd elvis-backend-node

    In the root of the project, create a new file called .env based on the .env.example file:

    cp .env.example .env

    Edit .env with your settings. If you used the OrientDB defaults as above, it will look like this:

    ORIENTDB_HOST=localhost
    ORIENTDB_PORT=2424
    ORIENTDB_DB={yourDBName}
    # Admin is the default ODB user
    ORIENTDB_USER=admin
    ORIENTDB_PASS=admin
    ORIENTDB_TEST_DB={yourTestingDBName}
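
    The app reads these values at startup. As a rough illustration of how such a file is consumed, here is a minimal .env parser using only Node's standard library (the project itself most likely loads config with a package such as dotenv, so treat this as a sketch, not the actual implementation):

    ```javascript
    // Minimal sketch of parsing a .env file with only Node's standard library.
    // Illustrative only: the real project likely uses a package like dotenv.
    const fs = require('fs');

    function parseEnv(path) {
      const config = {};
      for (const line of fs.readFileSync(path, 'utf8').split('\n')) {
        const trimmed = line.trim();
        // skip blank lines and comments (e.g. "# Admin is the default ODB user")
        if (!trimmed || trimmed.startsWith('#')) continue;
        const eq = trimmed.indexOf('=');
        if (eq === -1) continue;
        config[trimmed.slice(0, eq)] = trimmed.slice(eq + 1);
      }
      return config;
    }
    ```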
  5. Install dependencies:

    npm install

  6. Create the database schema for the dev db:

    npm run migrate

    The test db is migrated automatically before every test.

  7. Open OrientDB Studio in a browser at http://localhost:2480/studio/index.html to verify that the database contains the migrated schema.

  8. Run the tests with:

    npm run test

  9. Run the linter with:

    npm run lint

  10. Install OrientJS globally to get access to its CLI. For example, to create a new migration:

    orientjs -h localhost -p 2424 -n elvis -U admin -P admin migrate create {newMigrationName}
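
    A generated migration is a small Node module with up/down steps. As a hedged sketch (the exact module shape depends on the orientjs version, and the 'Example' class name here is purely hypothetical):

    ```javascript
    // Sketch of an orientjs-style migration module.
    // The shape may vary by orientjs version; 'Example' is a hypothetical class.
    exports.name = 'create example class';

    exports.up = function (db) {
      // create a new vertex class derived from V
      return db.class.create('Example', 'V');
    };

    exports.down = function (db) {
      // undo: drop the class again
      return db.class.drop('Example');
    };
    ```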

Deploy

  1. Pull the latest changes.

  2. Update configuration in .env based on .env.example if necessary.

  3. Build the containers:

    ORIENTDB_ROOT_PASSWORD={password} docker-compose build elvis_api

  4. Start the containers:

    docker-compose up --no-deps -d elvis_api

    If this is the first deploy, run:

    docker-compose up -d elvis_api

  5. Migrate:

    docker-compose run --name=migrate --rm elvis_api npm run migrate

Import data

The amount of data we have is overwhelming for a single Node process. Not only does the import take a long time, but it also hits a "heap out of memory" error even with up to 15GB of RAM.

To speed things up and avoid overwhelming an individual process, we now run one Node process per file instead of passing multiple files to the same process. To achieve this, we start a Docker container to import each file and orchestrate the containers with GNU parallel:

find /folder/with/data/files -iname '*.json' -printf "%f\n" | \
parallel --progress -I"{}" -j5 \
docker-compose run --name="elvis_import_"{} --rm elvis_api \
node --max-old-space-size=4096 ./scripts/import_data.js -c 1000 -r 1 /rawdata/data/exported_by_country/{}

With -j5 we tell parallel to process 5 containers at once. The option --max-old-space-size=4096 allows each Node process to use up to 4GB of RAM. The import script also takes options:
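
Batched writes are the usual way to keep memory bounded during an import like this. As a simplified, hypothetical sketch of that batching idea (not the actual import_data.js implementation, and whether the -c flag controls batch size is an assumption here):

```javascript
// Hypothetical sketch of a batching helper an importer might use:
// split an array of records into fixed-size batches so each batch
// can be written to the database in one request, keeping memory bounded.
function chunk(records, chunkSize) {
  const batches = [];
  for (let i = 0; i < records.length; i += chunkSize) {
    batches.push(records.slice(i, i + chunkSize));
  }
  return batches;
}
```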

We also have to import static data for countries:

docker-compose run --name=import_countries --rm elvis_api node ./scripts/import_countries.js

and CPVs:

docker-compose run --name=import_cpvs --rm elvis_api node ./scripts/import_cpvs.js