
OLMO

OceanLab Observatory, Automated Data Handling

Data access

Graphical portal (grafana)

The starting page for exploring the data graphically is the data portal here. This is the same site that you will reach if you start at our homepage and click on the 'Data Portal' link. On that page you will find information about, or links to:

API access / download from python script

Given user credentials, you can write queries directly to the database. This is done using the 'flux' query language; see the getting-started guide here.

We recommend using python to pass the flux query to the influx HTTPS endpoint. There is an example script in this repository that you can work from:

examples/api_examples.py

To run this script you will need some libraries and to set up the environment. This can be done via the conda environment described under 'Getting started with notebooks' below.

If you are unsure which data we have available, and in which 'tables' in the DB the data are found, you can either look here for a full list of available tables, or you can search using flux. For example, in examples/api_examples.py there is a query that returns a list of all 'tables' written to in the last six hours that have a 'field key' (data column) of latitude; a hedged sketch of such a query is shown below.
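
Purely as a hedged sketch (the maintained example is examples/api_examples.py, which may differ), the snippet below shows one way to send a flux query to an InfluxDB 2.x HTTPS endpoint from python using the influxdb-client library. The URL, token, organisation and bucket are placeholders, not project values.

    # Hypothetical sketch: query the database over HTTPS with a flux query.
    # The URL, token, org and bucket below are placeholders.
    from influxdb_client import InfluxDBClient

    url = "https://your-influx-endpoint:8086"   # placeholder endpoint
    token = "YOUR_TOKEN"                        # placeholder credentials
    org = "your-org"

    # Flux: list 'tables' (measurements) written in the last six hours
    # that contain a field (data column) called latitude.
    flux = '''
    from(bucket: "your-bucket")
      |> range(start: -6h)
      |> filter(fn: (r) => r._field == "latitude")
      |> keep(columns: ["_measurement"])
      |> distinct(column: "_measurement")
    '''

    with InfluxDBClient(url=url, token=token, org=org) as client:
        for table in client.query_api().query(flux):
            for record in table.records:
                print(record.get_value())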

Data collection

Munkholmen sensor data

Each sensor on Munkholmen should have a class associated with it; see for example adcp.py. The rsync_and_ingest() method of this class should be run every 2 minutes via the ingest_munkholmen.py script.

In the init of this class the following variables will be used to rsync and ingest the data. In cases where there is an _L#, there can be 4 different versions of this variable, one for each of the 4 data quality levels. A loose sketch of such a class is given below.
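
The variable list itself is not reproduced in this README. Purely as a loose sketch (every attribute name below is a guess, apart from rsync_and_ingest() and the drop_recent_files flag mentioned under 'Development'), a sensor class might be outlined like this:

    # Illustrative outline of a Munkholmen sensor class; attribute names are
    # hypothetical, except rsync_and_ingest() and drop_recent_files_l0.
    class Adcp:
        def __init__(self):
            # Hypothetical per-level settings used by the rsync/ingest step:
            self.remote_dir_l0 = '/remote/path/to/adcp'   # placeholder path
            self.drop_recent_files_l0 = True              # set False while testing

        def rsync_and_ingest(self):
            # Run every 2 minutes from ingest_munkholmen.py:
            # rsync new files from the sensor, parse them, write them to influx.
            pass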

Munkholmen operation (LoggerNet) data

The files are transferred to the LoggerNet PC via LoggerNet; see 'Loggernet Windows machine' below. From there, a cron job runs ingest_loggernet.py, which transfers all files except the latest one and ingests them into influx. For more info see the file ingest_loggernet.py and the function sensor_conversions.ingest_loggernet_file(). A sketch of the file-selection rule is shown below.
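
Purely as an assumed illustration of the 'all files except the latest' rule (the real logic is in ingest_loggernet.py; the path and file pattern here are placeholders), the selection could look like:

    # Hypothetical sketch: ingest every transferred file except the newest one.
    from pathlib import Path

    loggernet_dir = Path('/path/to/loggernet/output')   # placeholder path
    files = sorted(loggernet_dir.glob('*.dat'), key=lambda p: p.stat().st_mtime)

    for f in files[:-1]:   # skip the most recently written file
        print(f'would ingest {f}')   # e.g. via sensor_conversions.ingest_loggernet_file()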

Uploading custom data

Currently we only support this through uploading directories, so if you have a single file to be linked in with the data, just put it in a directory.

This works by filling in all the necessary metadata fields (and the folder location) before running the python file that uploads the data.

Note that the python file must be run from a computer which has the "az" command line tool installed, and there needs to be a file called azure_token_datalake containing a valid access token in the directory above this repo. See Torfinn2 for an example. A loose sketch of such an upload is given below.
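
As a loose sketch only (the README does not show the upload script, so the az subcommand, account name and paths below are assumptions), uploading a directory to an Azure Data Lake file system with the "az" tool could look like this, with the token read from azure_token_datalake:

    # Hypothetical sketch of a directory upload using the 'az' CLI via subprocess.
    # Storage account, file system and paths are placeholders; the real script may differ.
    import subprocess
    from pathlib import Path

    token = Path('../azure_token_datalake').read_text().strip()   # token file above the repo

    subprocess.run([
        'az', 'storage', 'fs', 'directory', 'upload',
        '--account-name', 'examplestorageaccount',   # placeholder account
        '--file-system', 'example-filesystem',       # placeholder container / file system
        '--source', 'my_custom_data_dir',            # local directory to upload
        '--recursive',
        '--sas-token', token,
    ], check=True)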

To generate the access token:

Note that the current access token on Torfinn2 expires at the start of 2023.
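
The token-generation steps themselves are not included in this README. Purely as an assumed illustration (this may not match the project's actual procedure), a SAS token for an Azure storage container can be generated with the az CLI along these lines, with placeholder account and container names:

    az storage container generate-sas --account-name <storage-account> --name <container> --permissions acdlrw --expiry 2024-01-01T00:00Z --auth-mode key -o tsv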

Loggernet Windows machine

This has been installed under user Loggernet_user on the machine sintefutv012. Contact William if you need to access this user.

On the machine we have installed OpenSSH-Win64. This needs to be started up if it stops running. You can do this via:

  1. Open a PowerShell prompt as administrator.
  2. Run Start-Service sshd

Note that I have also run Set-Service -Name sshd -StartupType 'Automatic', which should start the service on startup, but this is yet to be tested.

I have now added LoggerNet to the Task Scheduler, with a trigger so that it starts when the machine starts up.

Getting started with notebooks

Step 1: Follow steps 1 to 3 of 'Getting started':

conda env create -f environment.yml

conda activate olmo

python setup.py develop

Step 2 (optional): We also install some helpful notebook extensions, but these need to be activated within 'jupyter':

jupyter contrib nbextension install --user

Step 3: Finally, start the notebook server (jupyter notebook). This will open a page in your browser showing the files in this repo.

Notebooks are found in the Notebooks folder. You will also note there is a tab at the top called Nbextensions. I like to click on that and enable Table of Contents (2).

Front end

We have implemented a grafana front end, and have some data being displayed on the website. These are not currently open resources.

Development

To develop the code, we generally test by writing into a newly created DB, running the python files from your 'personal' user on the controller PC.

Files on the remote computers should not be deleted until testing has verified that the workflow works correctly. This can be done by setting the variable drop_recent_files_lX to False, as in the sketch below.
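
Purely as an assumed illustration (reusing the hypothetical class sketched under 'Munkholmen sensor data'), disabling deletion during testing might look like:

    # Hypothetical sketch: keep remote files while testing the ingest workflow.
    sensor = Adcp()
    sensor.drop_recent_files_l0 = False   # do not delete level-0 files on the remote machine
    sensor.rsync_and_ingest()             # run the workflow against the test DB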