enram / data-repository

Data quality assessment
https://enram.github.io/data-repository/
MIT License

Periodically download data from BALTRAD server #1

Closed: peterdesmet closed 8 years ago

peterdesmet commented 8 years ago

Do we really need to build this? And if so, how do we keep it simple?

Yes, we at least need access to the data to build anything at all.

Questions

adokter commented 8 years ago

Most likely OPERA will request that data are made available with a delay of two days, to prevent any real-time operations. I therefore think downloading data once a day is sufficient.
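A minimal sketch of what a once-a-day job would need to work out under such a delay. The two-day embargo is taken from the comment above as an assumption (it was not confirmed at this point), and the function names are hypothetical, not part of any existing code:

```python
from datetime import date, timedelta

# Assumed embargo: data become available two days after acquisition.
EMBARGO_DAYS = 2


def latest_available_date(today, embargo_days=EMBARGO_DAYS):
    """Return the most recent acquisition date that is already released."""
    return today - timedelta(days=embargo_days)


def dates_to_fetch(last_fetched, today, embargo_days=EMBARGO_DAYS):
    """List the acquisition dates a daily run still has to download.

    `last_fetched` is the newest date already stored locally. If a run
    was missed, the list simply covers more than one day.
    """
    newest = latest_available_date(today, embargo_days)
    n_days = (newest - last_fetched).days
    # range() is empty when n_days <= 0, so nothing is fetched too early.
    return [last_fetched + timedelta(days=i) for i in range(1, n_days + 1)]


if __name__ == "__main__":
    for day in dates_to_fetch(date(2016, 6, 1), date(2016, 6, 5)):
        print(day.isoformat())
```

Because each run catches up from the last stored date, a skipped day does not leave a gap, which is one reason a daily schedule would be enough.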

peterdesmet commented 8 years ago

@adokter we would like to start setting up a file repository while @bartaelterman is still working with us. Are there any hdf5 files on BALTRAD already that we can test this on?

adokter commented 8 years ago

@peterdesmet @bartaelterman Here is a file repo of vertical profiles of birds, as generated with the latest version of vol2bird. Not all polar volume files on which these profiles were calculated contained the necessary data (see README), so you will find some of the profiles do not contain any data.

Note also that nodetects (where the radar didn't survey the airspace) and nodatas (where the radar did survey the airspace, but the quantity could not be determined) are currently both encoded as 'NaN'. That will change in the near future.

peterdesmet commented 8 years ago

Great, so that is something we could currently use as the "source" repo for this kind of data, later to be replaced by the BALTRAD server?

adokter commented 8 years ago

Yes, the BALTRAD server will generate identical files; only the directory structure remains to be decided. I will be visiting SMHI in mid-July to set up a prototype data flow.

adokter commented 8 years ago

@peterdesmet @bartaelterman The directory with test files is now in its own repository, so it is no longer a folder in the develop branch of vol2bird.

peterdesmet commented 8 years ago

Thanks, good to know!

peterdesmet commented 8 years ago

@adokter

adokter commented 8 years ago

bartaelterman commented 8 years ago

Here is code that will assist with downloading files from a GitHub repo. Execute this example script to download all hdf5 files from @adokter's repo (selecting only files from the vp dir).
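For readers without the linked script, here is a rough sketch of the same idea. It is not the actual connector code. It assumes the GitHub Git Trees API and raw.githubusercontent.com URLs, that the hdf5 files end in `.h5` or `.hdf5`, and that they sit under a directory named `vp`. The owner, repo and branch values are placeholders, and all function names are hypothetical:

```python
import json
import urllib.request
from pathlib import PurePosixPath

# Public GitHub endpoints; owner/repo/branch are filled in by the caller.
TREE_URL = "https://api.github.com/repos/{owner}/{repo}/git/trees/{branch}?recursive=1"
RAW_URL = "https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path}"


def list_repo_tree(owner, repo, branch):
    """Fetch the recursive file listing of a branch (needs network access)."""
    url = TREE_URL.format(owner=owner, repo=repo, branch=branch)
    with urllib.request.urlopen(url) as response:
        return json.load(response)["tree"]


def select_vp_files(tree_entries, directory="vp", suffixes=(".h5", ".hdf5")):
    """Keep only hdf5 files that live under a directory called `directory`.

    The suffixes and the directory name are assumptions about the repo layout.
    """
    selected = []
    for entry in tree_entries:
        if entry.get("type") != "blob":  # skip directories ("tree" entries)
            continue
        path = PurePosixPath(entry["path"])
        if directory in path.parts[:-1] and path.suffix.lower() in suffixes:
            selected.append(str(path))
    return selected


def raw_url(owner, repo, branch, path):
    """Build the direct download URL for one file."""
    return RAW_URL.format(owner=owner, repo=repo, branch=branch, path=path)


def download(owner, repo, branch, path, target):
    """Download one file to a local path (needs network access)."""
    urllib.request.urlretrieve(raw_url(owner, repo, branch, path), target)
```

Keeping the filtering separate from the network calls means the selection logic can be checked against a hand-written listing, and the source can later be swapped for BALTRAD without touching it.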

@peterdesmet @stijnvanhoey thoughts on the setup?

peterdesmet commented 8 years ago

Code to connect to source data (on GitHub) is now at https://github.com/enram/infrastructure/blob/master/data_repository/connectors.py. Data will be downloaded daily from BALTRAD once in production.