enram / getRad

R package to access and standardize radar data
https://enram.github.io/getRad/

Extend `get_vpts()` to provide access to the RMI CROW dataset #13

Open peterdesmet opened 4 days ago

peterdesmet commented 4 days ago

Source

I suggest the value rmi for the source parameter. It is the only VPTS dataset by RMI. I think the alternative value crow would be confusing, since that name is also used for the visualization.

Scope

Metadata and context can be found here. The dataset covers 10 radars and has data since 2019. More data are added daily.

Data files

Data files are deposited at https://opendata.meteo.be/ftp/observations/radar/vbird/ and organized in radar and year directories. The file names are of the format <radar>_vpts_<yyyymmdd>.txt (e.g. behel_vpts_20191015.txt).
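
As a rough illustration of the naming scheme above, here is a minimal sketch of building a file URL from a radar code and a date. The helper name `rmi_vpts_url()` is hypothetical, and the directory order `<radar>/<year>/` is an assumption based on "radar and year directories"; it should be checked against the actual listing.

```r
# Hypothetical helper, not part of getRad.
# Assumption: files live under <base_url>/<radar>/<year>/.
rmi_vpts_url <- function(radar, date,
                         base_url = "https://opendata.meteo.be/ftp/observations/radar/vbird") {
  date <- as.Date(date)
  # File name pattern from the issue: <radar>_vpts_<yyyymmdd>.txt
  file_name <- sprintf("%s_vpts_%s.txt", radar, format(date, "%Y%m%d"))
  paste(base_url, radar, format(date, "%Y"), file_name, sep = "/")
}

url <- rmi_vpts_url("behel", "2019-10-15")
print(url)
```

Since `sprintf()` and `paste()` are vectorized, the same helper should also accept a vector of dates to build one URL per day.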

Data format

The data format is the default stdout of vol2bird, which is fixed width (example). If you write a parser for that format, I would call it vol2bird_vpts, not rmi_vpts. The CROW visualization has a minimal parser.

Unfortunately, the format does not contain all columns of VPTS CSV. Below is a suggestion for how it could be completed.

PietrH commented 3 days ago

Looks fun to work on, I'm looking forward to it. I already explored data.table for reading fixed-width files (fwf) earlier this week: https://gist.github.com/PietrH/f13fb98f95b37242e59c92407fde1917

There is also readr::read_fwf(), it offers more control but requires more setup.

I'm slightly tempted to explore arrow::open_dataset() as the radar/year partitioning might actually come in handy.

There is both an FTP and an HTTP endpoint. For now I'll probably prefer the HTTP endpoint.

Questions

  1. Is the metadata header always the same length?
  2. Are the columns always the same width and order?

bart1 commented 3 days ago

arrow actually looks quite cool, I need to look into that!

peterdesmet commented 3 days ago

> 1. Is the metadata header always the same length?

Do you mean: the same number of rows? Not sure, but they should always start with #, which can be ignored with readr::read_fwf(comment = "#").

> 2. Are the columns always the same width and order?

Yes

PietrH commented 3 days ago

> 1. Is the metadata header always the same length?
>
> Do you mean: the same number of rows? Not sure, but they should always start with #, which can be ignored with readr::read_fwf(comment = "#")

Sadly the header is also commented out, but if we are very certain the columns never change, this shouldn't be an issue.
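
To make the point about the commented header concrete, here is a minimal base R sketch. It recovers column names from the last commented line and then reads the remaining lines as fixed width. The sample lines are made up for illustration and are not real vol2bird output. The widths, the column subset, and the idea that the last `#` line holds the column names are all assumptions.

```r
# Made-up sample in the spirit of the format; NOT real vol2bird output.
lines <- c(
  "# hypothetical metadata line",
  "# date   time HGHT    dd",
  "20191015 0000    0  12.5",
  "20191015 0000  200   nan"
)

is_comment <- startsWith(lines, "#")

# Assumption: the last commented line holds the column names.
header_line <- tail(lines[is_comment], 1)
col_names <- strsplit(trimws(sub("^#", "", header_line)), "\\s+")[[1]]

# Assumption: fixed column widths, hardcoded for this sample only.
widths <- c(8, 5, 5, 6)

# Write the data lines to a temporary file so read.fwf() can read them.
tmp <- tempfile(fileext = ".txt")
writeLines(lines[!is_comment], tmp)

vpts <- utils::read.fwf(
  tmp,
  widths = widths,
  col.names = col_names,
  # Keep date and time as text so leading zeros in "0000" survive.
  colClasses = c("character", "character", "integer", "numeric"),
  strip.white = TRUE,
  na.strings = "nan"
)
unlink(tmp)

str(vpts)
```

If the columns really never change, the names and widths could simply be hardcoded, and reading the header line would only serve as a sanity check.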