eurec4a / flight-phase-separation

Collection of manually edited flight segments for all platforms participating in EUREC4A.

Add instrument specific ATR flight-phase separation files #32

Open observingClouds opened 10 months ago

observingClouds commented 10 months ago

Hi,

Over in https://github.com/eurec4a/eurec4a-intake/pull/158 the question was raised whether it would be feasible to add the ATR metadata files created for the isotope record to this flight-segmentation repository, as they closely match its syntax.

An example file looks like

name: RF03 
mission: EUREC4A
flight_id: ATR-0126 
contacts:
- name: Author
  email: author@mail.com
date: 2020-01-26 
flight_report: https://observations.ipsl.fr/aeris/eurec4a-data/REPORTS/ATR-42/2020/20200126/ATR-0126.pdf 
takeoff: 2020-01-26 11:34:57 
landing: 2020-01-26 16:08:07 
events: []
remarks: 'synchronisation problem of Picarro with ATR core data, shifting during flight'
segments:
- kind: dry event
  name: D1
  start: 2020-01-26 15:46:00
  end: 2020-01-26 16:04:00

What would be needed to make these files available through the flight-segmentation API? Should the files be copied into this repo, or could they be accessed from their current published location at https://observations.ipsl.fr/aeris/eurec4a-data/AIRCRAFT/ATR/vapour/ ?

observingClouds commented 10 months ago

@d70-t @RobertPincus @Smpljack

observingClouds commented 10 months ago

There are also additional files around the turbulence measurements at https://observations.ipsl.fr/aeris/eurec4a-data/AIRCRAFT/ATR/SAFIRE-TURB/PROCESSED/YAML/v1.9/

d70-t commented 10 months ago

Definitely those files should be integrated into the flight segmentation repo!

To keep the repo useful, however, we have to ensure that the format is consistent across platforms. Issues I currently see are

The multiple-files-per-flight structure seems to encode different kinds of segments (e.g. there are legs with kind "Just above Cld base" in both the longlegs and longestlegs files). Users will likely need to differentiate those when using the metadata. This is why the schema we designed when creating the flight segmentation files for HALO and P3 defines kinds as a sequence: that way we can encode a segment as e.g. ["Just above Cld base", "longestleg"], and users can later filter on either or both of these.
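A minimal Python sketch of that kind of filtering, assuming each segment has already been loaded into a dict with a `kinds` list. The segment names, the "longleg" label, and the helper `segments_with_kinds` are made up for illustration and are not part of the repository's API:

```python
# Hypothetical segments with `kinds` stored as a sequence.
segments = [
    {"name": "L1", "kinds": ["Just above Cld base", "longleg"]},
    {"name": "L2", "kinds": ["Just above Cld base", "longestleg"]},
    {"name": "D1", "kinds": ["dry event"]},
]


def segments_with_kinds(segments, *required):
    """Return the segments whose `kinds` contain every requested kind."""
    wanted = set(required)
    return [seg for seg in segments if wanted <= set(seg["kinds"])]


# Filter on one kind ...
cloud_base = segments_with_kinds(segments, "Just above Cld base")
# ... or on both kinds at once.
longest_cloud_base = segments_with_kinds(
    segments, "Just above Cld base", "longestleg"
)
```

With a single `kind` string per segment, the second query would not be expressible without string matching, which is the argument for the sequence.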

I guess if we convert the keys and datatypes as outlined above, it should be rather simple to join all files of a flight into one and we are good to go.
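One possible shape for that join, sketched in Python. It assumes each per-flight file has been loaded into a dict, that the only conversion needed is turning the single `kind` string into a `kinds` list, and that the file type (e.g. "longleg") should become an extra kind. The function name, the labels, and the sample times are hypothetical:

```python
def join_flight_files(files_by_label):
    """Join several per-flight records into one record.

    `files_by_label` maps a label for the file type (assumed here to be
    something like "longleg") to the record loaded from that file.
    Flight-level metadata is taken from the first record.
    """
    merged = {}
    segments = []
    for i, (label, record) in enumerate(files_by_label.items()):
        if i == 0:
            merged = {k: v for k, v in record.items() if k != "segments"}
        for seg in record.get("segments", []):
            seg = dict(seg)  # copy, so the input is left untouched
            # Convert the single `kind` into a `kinds` list plus the file label.
            seg["kinds"] = [seg.pop("kind"), label]
            segments.append(seg)
    merged["segments"] = sorted(segments, key=lambda s: s["start"])
    return merged


# Made-up records standing in for two files of the same flight.
longlegs = {
    "flight_id": "ATR-0126",
    "segments": [
        {"kind": "Just above Cld base", "name": "A",
         "start": "2020-01-26 13:00:00", "end": "2020-01-26 13:30:00"},
    ],
}
longestlegs = {
    "flight_id": "ATR-0126",
    "segments": [
        {"kind": "Just above Cld base", "name": "B",
         "start": "2020-01-26 12:00:00", "end": "2020-01-26 12:45:00"},
    ],
}

flight = join_flight_files({"longleg": longlegs, "longestleg": longestlegs})
```

This sketch does not try to detect segments that appear in more than one file; whether those should be collapsed into a single entry is a separate decision.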

Another question is whether we have to support older code reading these files. If so, I'd suggest creating a different generated output (in addition to all_flights.yaml) that corresponds more closely to the current state of the ATR files, but for all aircraft (if possible).

d70-t commented 10 months ago

There's another issue. It seems like some flight ids (e.g. ATR-0126) have been assigned to multiple flights. It's understandable where this came from (the naming scheme and both flights were on the same day), but it's incompatible with the data model of this repository (there can only be one flight per flight id). Also, the whole purpose of the flight id is to uniquely identify the flight... Thus we need to find a way of making these flight ids unique.
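A quick way to find the affected flights could look like the following Python sketch. It assumes the flight records are available as dicts with `flight_id` and `takeoff` keys, as in the example file; the function name and the sample takeoff times are invented for illustration:

```python
from collections import defaultdict


def find_duplicate_flight_ids(flights):
    """Map each flight id used more than once to the takeoff times using it."""
    takeoffs = defaultdict(list)
    for flight in flights:
        takeoffs[flight["flight_id"]].append(flight["takeoff"])
    return {fid: times for fid, times in takeoffs.items() if len(times) > 1}


# Hypothetical records: two flights on one day sharing an id, plus a unique one.
flights = [
    {"flight_id": "ATR-0126", "takeoff": "2020-01-26 08:00:00"},
    {"flight_id": "ATR-0126", "takeoff": "2020-01-26 14:00:00"},
    {"flight_id": "ATR-0128", "takeoff": "2020-01-28 09:00:00"},
]

duplicates = find_duplicate_flight_ids(flights)
```

A check like this could also run as a validation step, so that a non-unique id is rejected before it reaches the generated output.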

observingClouds commented 6 months ago

For future reference: the non-unique flight ids also came up in https://github.com/eurec4a/eurec4a-intake/pull/158, where the flights were given consecutive research flight ids.
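As a rough Python sketch of such a numbering, assuming flights are ordered by takeoff time and labelled in the `RF03` style seen in the example file. The key `research_flight_id`, the function name, and the sample takeoff times are assumptions, not what the pull request actually does:

```python
def assign_research_flight_ids(flights):
    """Return copies of the flights with consecutive ids RF01, RF02, ..."""
    ordered = sorted(flights, key=lambda f: f["takeoff"])
    return [
        dict(flight, research_flight_id=f"RF{number:02d}")
        for number, flight in enumerate(ordered, start=1)
    ]


# Hypothetical input: two flights on the same day that share a flight id.
flights = [
    {"flight_id": "ATR-0126", "takeoff": "2020-01-26 14:00:00"},
    {"flight_id": "ATR-0126", "takeoff": "2020-01-26 08:00:00"},
]

numbered = assign_research_flight_ids(flights)
```

Because the number depends only on the takeoff order, two flights on the same day end up with distinct identifiers even though their date-based ids collide.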