digitalearthpacific / dep-tools

Processing tools for Digital Earth Pacific
MIT License
1 stars 0 forks source link

A raft of changes to support running GeoMAD #20

Closed alexgleith closed 9 months ago

alexgleith commented 10 months ago

This PR changes far too many things, so apologies about that.

And I think there are still a range of improvements and tweaks we can make. But, what we have now is working really well, and I think these changes are all a step in the right direction.

Please got through critically, Jessie, and let's maybe workshop it next week to see if we can find a compromise that we can merge in.

jessjaco commented 10 months ago

Just did a quick review. Things look good. I'm thinking we could combine the 2 odc loaders - either add a param return_type which defines the return type, or have the existing odcloader subclass the flat loader. I prefer to not have duplicated code if we can help it.

Happy to help with that part, when the interface is stable.

alexgleith commented 10 months ago

Yeah, good idea. Should be easy to do.

Then we can parameterise the structure and returned data type, which should provide some good flexibility.

jessjaco commented 9 months ago

I like adding the odc writer. I'm thinking about making a separate mixin for odc vs rioxarray writing (not unlike the structure in the loaders), so as not to have to pass more args through to write_to_blob_storage. Sound good?

alexgleith commented 9 months ago

Yeah, that works I guess.

I think there'll be more code duplication that way though, won't there?

jessjaco commented 9 months ago

I have been working on the single task processing to both fit the standard argo setup and get the logging better, here is what I'm thinking: https://github.com/digitalearthpacific/dep-tools/blob/task/dep_tools/task.py, As seen in #28.

alexgleith commented 9 months ago

Nice! Although that's still not doing the kind of logging I'm after, I'm sure I can create another variant that does.

We almost need a week long workshop to get this tooling working better!

I'm on hold while I wait for our infrastructure to come back online, but when it has and I have done some test runs of the geomad, I'll come back and break this out into a bunch of smaller PRs... and using your proposed task runner sounds good.

jessjaco commented 9 months ago

That's what I was thinking...just make a Task that has whatever logging you want

alexgleith commented 9 months ago

Although that's still not doing the kind of logging I'm after

I have an idea around logging and "have we done this already" checking, which I can implement later.

jessjaco commented 9 months ago

Ok, I made some changes which needed to happen and some to improve clarity. I'm still not totally settled on the loader in particular, but also would like to refactor the write_to_blob_storage function. However, I think these can be best handled piecewise

alexgleith commented 9 months ago

Looks clean! Nice work, especially with the partial function. That's neat.

jessjaco commented 9 months ago

Pending any further input, I'd like to merge this tomorrow and relegate fixes or additional changes to new PRs

alexgleith commented 9 months ago

Happy for you to merge it in.