c-proof / pyglider

glider software
https://pyglider.readthedocs.io/
Apache License 2.0

start refactor of get_profiles #124

Closed callumrollo closed 2 months ago

callumrollo commented 2 years ago

This will resolve Issue #111

A few tests are failing at the moment.

callumrollo commented 1 year ago

@jklymak is this what you had in mind to resolve #111? get_profiles is now a separate function that works on the timeseries netCDF created by raw_to_timeseries. This should enable more flexible definition of profile_index before splitting into profiles/gridding.

Alseamar recently changed the way dives are defined on the SeaExplorer to a much more sensible method. The glider now starts a new dive file after the first GPS fix at the surface, so we should be able to use a much simpler method to define profiles. With get_profiles factored out, this can be written as a separate function for end users to call.

codecov[bot] commented 1 year ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 57.60%. Comparing base (179b508) to head (7855e9f). Report is 1 commit behind head on main.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main     #124      +/-   ##
==========================================
+ Coverage   56.50%   57.60%   +1.10%
==========================================
  Files           9        9
  Lines        1646     1630      -16
==========================================
+ Hits          930      939       +9
+ Misses        716      691      -25
```


callumrollo commented 2 months ago

@jklymak I've merged in the latest changes to main and fixed this PR for factoring out get_profiles, if that's still a goal.

The commit history is a bit of a mess after the branch went stale, so it is likely better to squash and merge.

callumrollo commented 2 months ago

Currently, the factored-out get_profiles has to operate on the .nc file that seaexplorer.raw_to_L0timeseries writes to disk. I think it would be faster and more elegant to work with xarray.Dataset objects instead, so that functions like get_profiles take datasets as input and return them as output, rather than needing to read and write .nc files.
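
To make the dataset-in, dataset-out idea concrete, here is a minimal sketch. It is not pyglider's implementation: the function name `add_profile_index`, the variable name `pressure`, and the toy rule "start a new profile whenever the pressure trend reverses" are all assumptions chosen for illustration.

```python
import numpy as np
import xarray as xr


def add_profile_index(ds: xr.Dataset) -> xr.Dataset:
    """Return a new dataset with a `profile_index` variable added.

    Hypothetical helper: a new profile starts each time the sign of the
    pressure change flips. Assumes pressure never stays flat between
    samples; a real implementation would need filtering and thresholds.
    """
    pressure = ds["pressure"].values
    # +1 while descending, -1 while ascending
    direction = np.sign(np.diff(pressure))
    # Give the first sample the same direction as the second
    direction = np.concatenate([direction[:1], direction])
    # True wherever the direction differs from the previous sample
    turns = direction[1:] != direction[:-1]
    profile_index = np.concatenate([[0], np.cumsum(turns)]) + 1
    # assign() returns a new Dataset; the input is left untouched
    return ds.assign(profile_index=(ds["pressure"].dims, profile_index))


# Toy timeseries: down to 30, back to the surface, then down again
ds = xr.Dataset(
    {"pressure": ("time", [0.0, 10.0, 20.0, 30.0, 20.0, 10.0, 0.0, 10.0, 20.0])},
    coords={"time": np.arange(9)},
)
ds_out = add_profile_index(ds)
```

Because the function never touches the filesystem, it can be chained with other dataset-level steps, and the caller decides when (or whether) to write a .nc file with `ds_out.to_netcdf(...)`.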

jklymak commented 2 months ago

I've not looked at this carefully yet, but 1) it should remain back-compatible, so that the raw-to-timeseries methods can still do the profile extraction. 2) I think the profile finding is already a separate method operating on xarrays? So the to-do would be to make that function easier to access, lightly deprecate doing it in the raw-to-timeseries step, and change the examples to use the external interface.
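
One way the back-compatible, lightly deprecated path described here could look is sketched below. Everything in it is hypothetical: the `_sketch` function names, the `profiles` keyword, and the use of plain dicts in place of datasets are stand-ins, and numbering profiles by dive is only a placeholder rule. The old entry point still finds profiles by default but emits a `PendingDeprecationWarning` that points to the external function.

```python
import warnings


def get_profiles_sketch(records):
    """Hypothetical standalone step: tag each record with a profile index."""
    # Placeholder rule: one profile per dive number
    return [dict(rec, profile_index=rec["dive"]) for rec in records]


def raw_to_timeseries_sketch(raw, profiles=True):
    """Hypothetical conversion step that still finds profiles by default."""
    records = [{"dive": dive, "pressure": p} for dive, p in raw]
    if profiles:
        # Old behaviour is kept, but users are nudged to the new interface
        warnings.warn(
            "finding profiles inside raw_to_timeseries_sketch is pending "
            "deprecation; call get_profiles_sketch on the result instead",
            PendingDeprecationWarning,
            stacklevel=2,
        )
        records = get_profiles_sketch(records)
    return records


raw = [(1, 0.0), (1, 25.0), (2, 30.0)]

# Old-style call: same result as before, plus a warning
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = raw_to_timeseries_sketch(raw)

# New-style call: profile finding is an explicit, separate step
new_style = get_profiles_sketch(raw_to_timeseries_sketch(raw, profiles=False))
```

Existing processing scripts keep working unchanged, while updated examples can call the external function directly and never trigger the warning.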

callumrollo commented 2 months ago

I think this will require a bit more planning, especially as any change here will require end users to update their processing scripts. I'll close this PR for now and design something more robust. I'm wondering if a class-based approach might work better.