Closed bartaelterman closed 9 years ago
I noticed a problem. When offloading data, VUE generates VRL files and CSV files. However, the format of the CSV can differ depending on the settings. As such, CSV files from the INBO field laptop have different headings compared to those obtained from the VLIZ field laptop... For example, one contains code_space and ID as separate columns, while the other merges them into one column (Transmitter). I could send you two files as an example. Is this a big issue? Otherwise we could still opt for the VRL export to CSV (an extra VUE step)? In addition, at the moment there are no T and/or P sensors in the tags. However, in the future this will be the case, so additional information (i.e. extra columns) will be in the CSV files. I assume this is not a problem, but just to let you know in case you need to take this into account.
@PieterjanVerhelst, can you send a CSV and VRL export from a VLIZ and INBO receiver to Bart (bart.aelterman@inbo.be)? Will add it to this private repository as reference. I fear that the VRL export - although much more stable - cannot be used, as the data is encoded.
I sent him the files.
The two files are indeed considerably different. Even the labels used in both files differ (e.g. "Date/Time" vs "Date and Time (UTC)"). There is no other way than to write two separate parsers. The script will determine the file format (based on the column headers) and then choose the right parser. It's a bit of extra work, but doable.
The other thing you mention is another problem: additional file types to be expected in the future. The idea of the script was to concatenate all the files in the folder. And since the old files will still be there, that means the script will need to be able to distinguish probably 4 file types (INBO, VLIZ, INBO-new, VLIZ-new). Still doable, but depending on how many other file types we can expect later on, this solution might not hold very long.
For now, there is no reason to immediately change strategy. We need to be able to read 2 file formats, so let's do that. We'll just need to bear in mind that additional file types will cause some additional work on the script, and we'll see for how long that remains feasible.
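As a rough illustration of the approach described above (one parser per format, selected from the header, with all files concatenated into one set of rows), here is a minimal sketch. It is not the actual script. Only the labels `Date/Time`, `Date and Time (UTC)` and `Transmitter` come from this thread; the `Receiver`, `Code Space` and `ID` labels, the joining of code space and ID with a hyphen, and the function names are assumptions made for the example.

```python
import csv
import io


def parse_vliz(row):
    # "Receiver" is an assumed label; the other two are mentioned in this thread.
    return {
        "date_time": row["Date and Time (UTC)"],
        "receiver": row["Receiver"],
        "transmitter": row["Transmitter"],
    }


def parse_inbo(row):
    # "Receiver", "Code Space" and "ID" are assumed labels.
    # Joining code space and ID with "-" is also an assumption.
    return {
        "date_time": row["Date/Time"],
        "receiver": row["Receiver"],
        "transmitter": f'{row["Code Space"]}-{row["ID"]}',
    }


# One entry per known format, keyed by the first header field.
# A future format (e.g. with sensor columns) would add an entry here.
PARSERS = {
    "Date and Time (UTC)": parse_vliz,
    "Date/Time": parse_inbo,
}


def consolidate(csv_texts):
    """Parse each CSV text with the matching parser and return all rows."""
    rows = []
    for text in csv_texts:
        reader = csv.DictReader(io.StringIO(text))
        fieldnames = reader.fieldnames or []
        if not fieldnames or fieldnames[0] not in PARSERS:
            raise ValueError(f"unknown file format, header is {fieldnames!r}")
        parser = PARSERS[fieldnames[0]]
        rows.extend(parser(row) for row in reader)
    return rows
```

With this layout, supporting an additional file type means writing one more small parser function and registering it, which is the extra work per format mentioned above.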
I documented the two input formats based on the examples Pieter gave. You can find them here.
Can you review that?
Looks good. As sensors are already taken into account (still blank), I don't think this will cause any problems. Indeed, at the moment only these two formats are used, so just apply the script to them. Just to know how the script works: in the case of the VLIZ format, will the script select this format on the basis of an 'IF Transmitter AND Sensor Value is TRUE, THEN VLIZ-format' procedure?
Currently, it checks the header: if the first field is `Date and Time (UTC)`, it's VLIZ format. If it is `Date/Time`, it's INBO format. But there are other ways, and if something else would be more robust, let me know.
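The check described here could look roughly like the sketch below. The function name `detect_format` is hypothetical, and raising an error on an unrecognised header (rather than guessing) is an assumption about the desired behaviour, not something the actual script is known to do.

```python
import csv
import io

# First header field of each known format, as stated in this thread.
FIRST_FIELD_TO_FORMAT = {
    "Date and Time (UTC)": "VLIZ",
    "Date/Time": "INBO",
}


def detect_format(csv_text):
    """Return "VLIZ" or "INBO" based on the first field of the header row."""
    header = next(csv.reader(io.StringIO(csv_text)), None)
    if not header:
        raise ValueError("empty file: no header row")
    # Drop a possible byte order mark and surrounding whitespace.
    first_field = header[0].lstrip("\ufeff").strip()
    if first_field not in FIRST_FIELD_TO_FORMAT:
        raise ValueError(f"unrecognised header, first field is {first_field!r}")
    return FIRST_FIELD_TO_FORMAT[first_field]
```

Failing loudly on an unknown first field means a new export format would be noticed right away instead of being silently parsed with the wrong parser.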
The `Sensor value` is empty in the example VLIZ file I have.
Sensor values are empty as the tags currently used contain no sensors. In the future, tags with sensors will be applied.
In order to be able to create a script that will consolidate the raw data files, I need some documentation describing the columns in the raw data files. I found an example file containing 3 columns: `transmitter`, `receiver`, `date_time`.
@LifeWatchINBO/fish-tracking can someone tell me what columns I can expect in the raw data files?