Informasjonsforvaltning / fdk

Felles datakatalog
https://fellesdatakatalog.brreg.no/
Apache License 2.0

The National Data Directory (Felles datakatalog)

This repository contains the source code for the National Data Directory of Norway. The work is led by the Brønnøysund Register Centre, and the Data Directory was launched in November 2017. The Data Directory contains metadata about the datasets that the various government bodies maintain in their data catalogs. We provide a search service that allows users to discover datasets and where they are kept. The content of the data catalog is harvested once a day from several more specific data catalogs, including the registration application.
The data catalogs are formatted according to the Norwegian profile DCAT-AP-NO 1.1 of the European profile of W3C's Data Catalog standard.

Three main applications are developed:

  1. A Search Application that allows users to search and browse metadata about the datasets.
  2. A Harvester Application that downloads data catalogs and makes them searchable.
  3. A Registration Application that allows users to register metadata about their datasets.

Norwegian description (translated to English):

Felles datakatalog provides an overview of datasets from organizations in Norway. The solution was developed by the Brønnøysund Register Centre in the period 2016 to December 2017 and was launched in November 2017. It is one of several shared components developed under the auspices of Skate, intended to improve integration between public organizations and to improve services. The system is based on a Norwegian profile, DCAT-AP-NO 1.1, of a European profile of W3C's Data Catalog standard for exchanging dataset descriptions.

Contact

If you have any questions please send them to fellesdatakatalog@brreg.no.

Run application

The search application is available here. The two other applications are only available for registered users. Any questions can be sent to fellesdatakatalog@brreg.no.

The search api can also be used.

Set up your development environment

Prerequisite: Make sure you have local admin rights on your computer, as Git Bash has to be run as an administrator.

1) Clone this repo

2) Install Java 8, Maven and Docker.

- If you are running Windows, make sure you manually add the correct Maven path under the Windows "Environment Variables".
- Also make sure you have set the correct JAVA_HOME path in the environment variables.
- After installing Docker, update the resource limits under Settings -> Advanced. You need at least 4 CPUs and more than 8,000 MB of memory.

- If you have a Mac, running this script will install Java 8 and Maven automatically:

    ```
    ./install-dependencies-mac.sh
    ```
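Before moving on, it can help to confirm that the tools from this step are actually on the PATH. The following is a minimal sketch, not part of the repository: `missing_tools` and `check_prerequisites` are hypothetical helper names, and the tool list (`java`, `mvn`, `docker`) is taken from the step above.

```shell
# Hypothetical helper: print the name of every given tool that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical helper: check the tools named in step 2 and JAVA_HOME.
check_prerequisites() {
  local missing
  missing="$(missing_tools java mvn docker)"
  if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing" >&2
    return 1
  fi
  if [ -z "${JAVA_HOME:-}" ]; then
    echo 'JAVA_HOME is not set' >&2
    return 1
  fi
  echo 'All prerequisites found'
}
```

After sourcing these functions in a shell, calling `check_prerequisites` reports which tools, if any, still need to be installed.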

3) Configure .envrc based on .envrc.template. Optionally install direnv to lock the variables to the main working directory.
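One way to do this without accidentally overwriting an existing configuration is sketched below. `init_envrc` is a hypothetical helper name, not a script shipped with the repository; it only copies the template when no `.envrc` exists yet.

```shell
# Hypothetical helper: create .envrc from .envrc.template without
# overwriting an existing .envrc. $1 is the project directory (default: .).
init_envrc() {
  local dir="${1:-.}"
  if [ -e "$dir/.envrc" ]; then
    echo ".envrc already exists, leaving it untouched"
    return 0
  fi
  cp "$dir/.envrc.template" "$dir/.envrc" && echo "Created $dir/.envrc"
}
```

Run `init_envrc` from the repository root and then edit the values in `.envrc`. If direnv is installed, `direnv allow` approves the file for that directory.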

4) Compile, create docker images and run the entire project:

If you are running Windows, you also need to make sure you have installed Node.js:
https://nodejs.org/en/download/

```
./runAll.sh
```  

If you only want to recompile one module ("search-api" in this example), use the following:     

```
./runDocker.sh search-api
```
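When several modules have changed, the same script can be called in a loop. This is a sketch under assumptions: `rebuild_modules` is a hypothetical wrapper, and it assumes `./runDocker.sh` accepts any module name in the same way it accepts `search-api`. The `RUN_DOCKER` variable is also hypothetical and only exists to allow a dry run.

```shell
# Hypothetical wrapper: rebuild several modules one after another and
# stop at the first failure. Set RUN_DOCKER=echo for a dry run.
rebuild_modules() {
  local module
  for module in "$@"; do
    echo "Rebuilding $module"
    "${RUN_DOCKER:-./runDocker.sh}" "$module" || return 1
  done
}
```

For example, `rebuild_modules search-api harvester-api` would rebuild both modules in order, assuming both names are valid arguments to the script.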

Frontend applications such as search and registration-react are built and run as follows:

 ```
 docker-compose up -d --build registration-react
 ```

5) If the images are already built, the project can be run with:

```
docker-compose up -d
```

Restart a specific module after an image rebuild:

```
docker-compose up -d registration
```

Monitor logs:

```
docker-compose logs -f registration
```

6) Open solution

  Search site: [http://localhost:8080](http://localhost:8080)

  Registration site: [https://localhost:8098](https://localhost:8098)
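The containers can take a while to start, so the sites may not answer right away. The sketch below polls a URL until it responds. `wait_for_url` is a hypothetical helper, and the defaults (30 attempts, 5 seconds apart) are arbitrary assumptions rather than values from this project.

```shell
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# $1 = URL, $2 = max attempts (default 30), $3 = seconds between attempts (default 5)
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -sS keeps it quiet except for errors.
    if curl -fsS -o /dev/null --max-time 5 "$url" 2>/dev/null; then
      echo "$url is up"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "$url did not respond after $attempts attempts" >&2
  return 1
}
```

For example, `wait_for_url http://localhost:8080` waits for the search site. For the HTTPS registration site, curl may additionally need `-k` if the local certificate is self-signed, which is an assumption about the local setup.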

Run browser-based end-to-end tests

In order to have maintainable tests, the tests must run equally well in all environment configurations:

1) Browser in host machine (windowed+headless), services in docker-compose

Make sure Chromium is installed (for Mac, TODO Windows)
```
brew install --cask chromium
```

Make sure the services are running in the docker-compose network and exposed to localhost
(beware of port conflicts with services running in IntelliJ):

```
./runAll.sh
# or
docker-compose up -d
```

Ensure dependencies are installed
```
(cd applications/e2e ; npm i)
```

Run tests    

```
# run tests in chromium headless (no window, just report)
(cd applications/e2e ; npm t)

# run tests in chromium window opened
(cd applications/e2e ; npm run test:browser)

```

2) Browser in container (headless), services in docker-compose

```
# run
docker-compose run e2e npm run test:in_container

# build container (if changes in tests)
docker-compose build e2e 

```
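The three ways of running the tests described above can be collected behind one small selector. This is only a convenience sketch: `e2e_command` is a hypothetical function that prints the command for a given mode, using exactly the commands listed in this section.

```shell
# Hypothetical helper: print the e2e test command for a given mode.
e2e_command() {
  case "$1" in
    headless)  echo "(cd applications/e2e ; npm t)" ;;
    browser)   echo "(cd applications/e2e ; npm run test:browser)" ;;
    container) echo "docker-compose run e2e npm run test:in_container" ;;
    *)
      echo "usage: e2e_command headless|browser|container" >&2
      return 2
      ;;
  esac
}
```

Running `eval "$(e2e_command container)"` from the repository root would then execute the selected variant.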

Release

Generate release notes and create release in GitHub:

git checkout develop
git pull
npm run release
git push --follow-tags origin

Modules

Architecture

The Registration Application consists of the following main modules:

The Search Application consists of the following modules:

The Harvester Application consists of the following modules:

Common Services

External Integrations

Start individual applications

There are a couple of scripts that automate building and running the various modules on Docker. The scripts are:

Search application:

docker-compose up -d search

This starts the DCAT repositories, fuseki and elasticsearch, as well as the search-api service. To access the search application, open a browser at http://localhost:8080. Be aware that there is no data registered in the repositories (see the harvester application).

Harvester application:

docker-compose up -d harvester

This starts the harvester application with the corresponding harvester-api.

Registration application:

docker-compose up -d registration

This starts the registration application with the corresponding api services. The application can be accessed on http://localhost:8099. The registration application requires authentication. The following test-user identifiers can be used: (03096000854, 01066800187, 23076102252)

Shut down all containers:

docker-compose down

Run end2end tests (java)

In IntelliJ, select module applications/end2end-test and click "run tests"

Storage

The repository is stored in a persistent volume, see data/esdata5 for elasticsearch repository and data/fuseki for the fuseki repository.

Indexes in elasticsearch

Common Docker Problems

Sometimes Docker can be a bit overworked and one might need to clean up.

Solution: remove old containers

bash: docker rm -f $(docker ps -aq)

Remove old images

bash: docker rmi -f $(docker images -q)
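The container cleanup command fails with a usage error when there is nothing to remove, because `docker rm -f` is then called without arguments. A slightly more defensive variant is sketched below. `remove_all_containers` is a hypothetical function name, and like the original command it removes every container on the machine, not only those belonging to this project.

```shell
# Hypothetical helper: force-remove all containers, but do nothing
# when no containers exist.
remove_all_containers() {
  local ids
  ids="$(docker ps -aq)"
  if [ -z "$ids" ]; then
    echo "No containers to remove"
    return 0
  fi
  # $ids is intentionally unquoted so that each ID becomes its own argument.
  docker rm -f $ids
}
```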

Docker is slow on Mac: Docker needs at least 8 GB of memory

Docker -> Preferences -> Advanced -> Change memory to (8 GiB)

Common ElasticSearch Problems

Error message: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes

On Windows platforms, this seems to be caused by an issue with credentials.

Solution: reset and re-enter the credentials. Right-click Docker -> Settings -> Shared Drives -> Reset Credentials. Reselect the drive you want shared, re-enter the credentials, then run docker-compose stop elasticsearch5 followed by docker-compose up -d.