eurec4a / flight-phase-separation

Collection of manually edited flight segments for all platforms participating in EUREC4A.

report.py needs generalization #3

Open RobertPincus opened 4 years ago

RobertPincus commented 4 years ago

scripts/report.py was written specifically for HALO and needs generalization for other platforms.

leosaffin commented 4 years ago

I'm currently looking into this for the twinotter. I'm going to write some tests for this. I was thinking that the test can run report.py for each different platform using data from an example flight+yaml file. So I don't mess up the original functionality when doing this for the twinotter, could someone point me to a good choice for the HALO data to use for this? @d70-t ?

RobertPincus commented 4 years ago

@LSaffin - It would be useful to clarify what the report is for. @d70-t mentioned that it had been used primarily to help debug segment assignment, but not all the tests would be applicable to all platforms.

Perhaps tangential, but @d70-t and I discussed the idea of lightweight validation and reporting at pushes and/or pull requests (#2), and running report.py less frequently - say when producing releases.

leosaffin commented 4 years ago

@RobertPincus - I would say that report.py is for generating easy to look at plots for manually refining the .yaml files after they have been initially generated. Although I wouldn't need the information about sondes and lining the circles up, the default plot with the track, altitude, roll, pitch, and heading as well as the zoomed versions of these plots will be useful for refining the twinotter segment times.

The tests I'm thinking about writing would just check that report.py actually runs and produces an HTML file for an individual flight for each platform, so that I don't break it. This is the kind of thing that could be run on pull requests (although I don't know how to set that part up).

d70-t commented 4 years ago

Yes, in its current state, report.py was really meant to check the YAML files and to make adjustments down to one-second accuracy. Evolving that into something that is more general (in terms of supported platforms) and more specific (in terms of producing purpose-built reports) would be very useful.

The report currently does two separate things:

I would try to separate these two aspects a bit more, such that the formal validation part could also be run using a separate script which would return textual output, but can also be embedded into rich (e.g. HTML-) reports, like it is now. I also would evolve the reporting tool such that it could support multiple reporting profiles, such that it is possible to create different reports with different levels of detail.
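A minimal sketch of that separation, assuming segments are plain dictionaries with `segment_id`, `start` and `end` keys (an assumption about the YAML layout). The names `Finding`, `validate_segments`, `render_text` and `render_html` are hypothetical and not taken from report.py. The validation step returns plain data, and the two renderers turn that same data into either textual output or an HTML fragment:

```python
from dataclasses import dataclass
from datetime import datetime
from html import escape


@dataclass
class Finding:
    """One validation problem found in one segment (hypothetical structure)."""
    segment_id: str
    message: str


def validate_segments(segments):
    """Formal checks only: needs no track data and does no plotting."""
    findings = []
    for seg in segments:
        seg_id = seg.get("segment_id", "<missing id>")
        if "start" not in seg or "end" not in seg:
            findings.append(Finding(seg_id, "missing start or end time"))
        elif seg["start"] >= seg["end"]:
            findings.append(Finding(seg_id, "start is not before end"))
    return findings


def render_text(findings):
    """Plain-text output, e.g. for a command-line validation script or CI."""
    if not findings:
        return "all segments OK"
    return "\n".join(f"{f.segment_id}: {f.message}" for f in findings)


def render_html(findings):
    """HTML fragment that a richer report could embed."""
    items = "".join(
        f"<li>{escape(f.segment_id)}: {escape(f.message)}</li>" for f in findings
    )
    return f"<ul>{items}</ul>"


if __name__ == "__main__":
    # Synthetic example data, not taken from any real flight.
    example = [
        {"segment_id": "seg-a", "start": datetime(2020, 1, 24, 10), "end": datetime(2020, 1, 24, 11)},
        {"segment_id": "seg-b", "start": datetime(2020, 1, 24, 12), "end": datetime(2020, 1, 24, 11)},
    ]
    print(render_text(validate_segments(example)))
```

Different reporting profiles could then be expressed as different choices of which checks and which renderer to run.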

I'd work on the separation thing now in order to enable #4

Running the full reports would also be nice, but we have to figure out how to get the track data into CI. I think I'll have to look a bit closer into Aeris/OPeNDAP/zarr/intake for that.

RobertPincus commented 4 years ago

@d70-t I can help with getting track data into CI via OpenDAP etc.

leosaffin commented 4 years ago

@d70-t - Does your comment mean you are actively modifying report.py right now? If yes, I'll hold off on making changes.

I think the main limitation to running it with other flight data is the specific assumptions about data layout and the naming conventions for HALO data being spread among the various functions. I was going to add something like class Platform with a subclass for each one. Then the functions just use e.g. platform.lon. Then each of us could make the subclass for our respective platforms.
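A rough sketch of the idea, with the caveat that the class layout and all variable names below are placeholders and not the real names in the HALO or Twin Otter files. The base class exposes generic attributes such as `platform.lon`, and each subclass only states what the corresponding variable is called in its own data:

```python
class Platform:
    """Base class exposing generic navigation attributes (hypothetical design)."""

    # Names of the variables in the platform's own data; overridden per platform.
    lon_name = "lon"
    lat_name = "lat"

    def __init__(self, data):
        # `data` can be any mapping from variable name to values,
        # for example an xarray Dataset or a plain dict.
        self.data = data

    @property
    def lon(self):
        return self.data[self.lon_name]

    @property
    def lat(self):
        return self.data[self.lat_name]


class HALO(Platform):
    # Placeholder names, not the real HALO variable names.
    lon_name = "halo_lon"
    lat_name = "halo_lat"


class TwinOtter(Platform):
    # Placeholder names, not the real Twin Otter variable names.
    lon_name = "longitude_gps"
    lat_name = "latitude_gps"


if __name__ == "__main__":
    # Synthetic values for illustration only.
    platform = TwinOtter({"longitude_gps": [-59.4, -59.3], "latitude_gps": [13.1, 13.2]})
    print(platform.lon, platform.lat)
```

The plotting functions would then only ever touch `platform.lon`, `platform.lat` and so on, and never a platform-specific name.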

d70-t commented 4 years ago

@LSaffin I have not started modifying the files yet. I see what you mean, but I'd argue against creating new platform classes. The code already uses xarray quite heavily, so I'd propose to build on that as well. Could we have platform-specific get_navigation_data functions, which would accept the flight ID as an argument and return an xarray dataset that is guaranteed to have lat / lon / etc. variables? The function could request the data via OPeNDAP and the like and could do platform-specific variable renaming if necessary.
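To illustrate, here is a sketch under stated assumptions: the raw variable names (`LAT_RAW` etc.), the `alt` variable, the helper `_load_twinotter_raw` and the registry `GET_NAVIGATION_DATA` are all made up for this example, and the loader returns synthetic data where a real one would issue an OPeNDAP or file request.

```python
import numpy as np
import xarray as xr

# Variables every returned dataset is expected to provide (assumed list).
REQUIRED = ("lat", "lon", "alt")


def _load_twinotter_raw(flight_id):
    """Stand-in for a real data request; returns a small synthetic dataset."""
    time = np.array(
        ["2020-01-24T10:00:00", "2020-01-24T10:00:01"], dtype="datetime64[ns]"
    )
    return xr.Dataset(
        {
            "LAT_RAW": ("time", np.array([13.1, 13.2])),
            "LON_RAW": ("time", np.array([-59.4, -59.3])),
            "ALT_RAW": ("time", np.array([300.0, 305.0])),
        },
        coords={"time": time},
        attrs={"flight_id": flight_id},
    )


def get_navigation_data_twinotter(flight_id):
    """Return navigation data with the agreed generic variable names."""
    raw = _load_twinotter_raw(flight_id)
    # Platform-specific renaming to the common names.
    ds = raw.rename({"LAT_RAW": "lat", "LON_RAW": "lon", "ALT_RAW": "alt"})
    missing = [name for name in REQUIRED if name not in ds]
    if missing:
        raise ValueError(f"navigation data is missing variables: {missing}")
    return ds


# Hypothetical registry so the report code can look up the right loader.
GET_NAVIGATION_DATA = {"twinotter": get_navigation_data_twinotter}


if __name__ == "__main__":
    ds = GET_NAVIGATION_DATA["twinotter"]("example-flight")
    print(ds)
```

The rest of the report code would then only rely on `ds.lat`, `ds.lon` and so on, regardless of platform.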

What I planned to do was more to move out the SegmentChecker into a separate file and start with the validation script. But maybe it is still a bit too interwoven.