Closed metasim closed 7 months ago
@lnicola Thank you for the early feedback. Course corrections earlier rather than laters are definitely helpful.
@lnicola Two questions:
dem
feature flag?raster/processing/dem/slope.rs
impl Dataset {
fn slope(&self, options: &SlopeOptions) -> Result<Dataset> { ... }
...
}
raster/processing/dem/aspect.rs
impl Dataset {
fn aspect(&self, options: &AspectOptions) -> Result<Dataset> { ... }
...
}
etc.
I like this approach as it makes it more discoverable in IDE/rust-analyzer.
No to both
Sorry, I was on my phone earlier and didn't pay enough attention to the code snippets (I was looking at the paths).
By the way, if it's in a trait you also get completions for it. Not that I'm proposing this, just wanted to point it out.
- I'm not sure 😬. I disliked it initially, but perhaps it's not such a bad idea. Maybe we should ask for another opinion.
FWIW: We do have precedence for it (and I like the code-organization benefits):
❯ rg 'impl Dataset'
src/dataset.rs
37:impl Dataset {
src/vector/sql.rs
60:impl Dataset {
src/vector/layer.rs
593:impl Dataset {
src/gcp.rs
109:impl Dataset {
src/vector/transaction.rs
100:impl Dataset {
src/raster/rasterband.rs
23:impl Dataset {
src/raster/mdarray.rs
800:impl Dataset {
src/raster/processing/dem/mod.rs
54:impl Dataset {
The organization is fine, which is why I initially said I had nothing against it.
What I'm not sure I like is exposing the algorithms as inherent methods on the datasets. Most of the stuff we have there maps directly to GDALDatasetXXX
. The DEM and warper are separate in the GDAL API.
What I'm not sure I like is exposing the algorithms as inherent methods on the datasets. Most of the stuff we have there maps directly to GDALDatasetXXX. The DEM and warper are separate in the GDAL API.
Aaaaah. Got it. A nuance I didn't catch. It's hard for me to tell what in the C API design is well-considered, and what's accidental.
I flip-flopped between impl Dataset
in each sub-module, vs in the dem
module. Settled on the latter for coherence sake in the generated docs.
Aside...
I wish you could do this in Rust:
pub trait DemRoutines where Self: Dataset { ... }
impl DemRoutines for Dataset {}
Then if you import the module, you get the methods, but not otherwise, and would have half the boilerplate (default impls in trait
, not just declarations).
~I can switch this to a trait DemRoutines
with a single impl
for Dataset
if you think the separation is important.~ Looking ahead to other processing capabilities (warp, contour, etc.), we probably want to get this pattern correct. So I think I'll see what switching to an extension trait looks like.
Update: Change to extension trait.
This looks pretty nice, except:
DemCommonOptionsOwner
thing. I would inline the four fields into the structs; this causes some code duplication, but feels much easier to follow, and might even more robust in the long term (e.g. maybe one of the options gets dropped by one of the apps, so it's no longer "common"). Is DemCommonOptions
something that applications will actually care about?DemCommonOptions::compute_edges
should probably be bool
, not Option<bool>
; same for additional_options
DemProcessing
being a trait is not too bad, but they might as well be inherent methods. It seems like extra complexity for no good reason (cf.). Using free functions (gdal::algorithm::dem::slope(ds, ...)
) be a good idea too, for symmetry reasons: an algorithm could, in principle, have two inputsDemCommonOptions::compute_edges should probably be bool, not Option
; same for additional_options
The reason for compute_edges
being Option
is so we don't have to keep in sync with what the GDAL defaults are, should they change. IOW, If the caller doesn't explicitly set it, we do whatever the GDAl default is. It probably won't change, so I'll just set it to the documented default, but wanted to point out there was some rationale to it.
so we don't have to keep in sync with what the GDAL defaults are
That's what I suspected, but as a user, it makes for a pretty confusing API.
DemProcessing
trait
I'm really sorry for the back and forth on this, but I now think they should be free functions. Some of the reasons are:
Dataset
APIDataset
feels very arbitrary -- why not on Rasterband
?putting them on the Dataset feels very arbitrary -- why not on Rasterband?
Because the primary input to the all the DEM functions is the Dataset.
I'll convert them to free functions, but to me it seems just as arbitrary as any other decisions to make something inherent on dataset. From where I sit, in the C API, if the first parameter is Dataset
, it might as well be treated as the receiver.
We'll also loose discoverability, but maybe that's good for DEM routines. Most people won't use them.
I just wish we had a heuristic we could apply to determine objectively if something should be a free function verses a method. It seems more a matter of taste (and my preference is methods).
The interface I see for Dataset
is basically https://gdal.org/doxygen/classGDALDataset.
I've used GDAL mostly from Python, but when I'm looking for an API, most of the time my process is:
@lnicola I got rid of all the DemCommonOptions
constructs. I somewhat understand your reasoning (but not really, as the user experience is the same), but need to be on the record that I'm not happy with it. Doing so added > 2000 lines of code, which seems like we're violating DRY principles in a big way. Without using traits or macros, I don't know to get rid of all the copy-and-pasting.
https://github.com/georust/gdal/pull/456/commits/ff8a2b614a0c64a2085eedf504116d8df03e44db
Question: Do I need to remove all the situations where the options structs have Option
fields, and explicitly default to what the GDAL documentation says are the internal defaults?
Doing so added > 2000 lines of code, which seems like we're violating DRY principles in a big way. Without using traits or macros, I don't know to get rid of all the copy-and-pasting.
The third way is include!
, but a declarative macro is what I would use here.
remove all the situations where the options structs
I'll have to check. Let's start with the common options (that bool and the option string list which can be empty) and iterate in another PR if really needed.
And one final question is where to put these. They're currently under raster::processing::dem
, but we also have programs::raster
. I think I like programs
more, since it's a little weird to have raster::processing::translate
and vector::processing::vector_translate
.
Another aspect is that I think the Rust guidelines encourage tall/deep module hierarchies, but with re-exports so that the user only sees shallow ones.
And one final question is where to put these. They're currently under
raster::processing::dem
, but we also haveprograms::raster
. I think I likeprograms
more, since it's a little weird to haveraster::processing::translate
andvector::processing::vector_translate
.
The rationale for this is two fold:
raster::processing:warp
. But that's separate from the export structure. See below.programs
makes any sense in a library. If I'm looking at a crate structure, I'd expect programs
be the same as bins
: a place to put source for executable targets. From the outside I wouldn't know how to semantically interpret programs
. Heck, even from the inside I don't know how to interpret programs
. Long term, I think the contents of the programs
module should be migrated to something more logical.On your point about raster::processing
vs programs::raster
, I'd argue the latter breaks the convention in the main library by reversing scope. We train users to think "domain first, capability second", but with programs::raster
it's "if I don't find it in raster::
, I should also look in programs::raster
. But because I find programs
confusing, perhaps I'm missing a larger point.
I can see arguments against raster::processing
, but to fix its weaknesses I think we need to think more holistically about module structure and what heuristics we use to determine the proper destination for a capability. If the answer is "do exactly what is done in C++" (which would be valid) then we have even more of a gap to address. But I think we can do better than that. To me, raster::processing
is a place for advanced capabilities beyond opening and querying Dataset
s.
Another aspect is that I think the Rust guidelines encourage tall/deep module hierarchies, but with re-exports so that the user only sees shallow ones.
My intent was the use would use gdal::raster::processing::dem::*
, since DEM stuff is kinda niche. But I see an argument that this is too deep. I don't really think it should be exported at the gdal::raster::*
level. Again, programs::raster
doesn't makes sense to me as "what's a 'program' here?".
How about this: we close the PR as it is (or tweak the pub use
structure), and open a new ticket to discuss a design around the organization of all constituent parts?
I don't think programs makes any sense in a library. If I'm looking at a crate structure, I'd expect programs be the same as bins: a place to put source for executable targets. From the outside I wouldn't know how to semantically interpret programs. Heck, even from the inside I don't know how to interpret programs. Long term, I think the contents of the programs module should be migrated to something more logical.
I agree it's unusual, but many of the GDAL "apps" are already exposed as a library (everything that ends up taking an argv
as an input, including GDALDEMProcessing
. I'm not that familiar with them, but I think they're split between gdal_alg.h
and gdal_utils.h
in the GDAL headers.
But because I find programs confusing, perhaps I'm missing a larger point.
My thinking is "I want gdalwarp
, that's a "program". I want (to give a random example) ExecuteSQL
, that's not a program". Even today, raster
is a bit of a kitchen sink: https://docs.rs/gdal/latest/gdal/raster/index.html.
How about this: we close the PR as it is (or tweak the pub use structure), and open a new ticket to discuss a design around the organization of all constituent parts?
That will never get anywhere, so please no :stuck_out_tongue:.
"I want
gdalwarp
, that's a "program".
gdalwarp
barely exposes the potential of the Warp subsystem. After reading through a lot of the GDAL Warp code over the last year, I think it's potential is barely tapped by the gdalwarp
executable, and limited by the lack of examples and complex options model. It may be wishful/idealistic thinking on my part, but with Rust, we could expose a lot of the processing potential in a much less intimidating way than it it is in C++. Rust enabled pixel functions alone could be very powerful. That's my dream anyway.
bors r=lnicola
[X] I agree to follow the project's code of conduct.
[x] I added an entry to
CHANGES.md
if knowledge of this change could be valuable to users.[x] Add
hillshade
[x] Add
color-relief
[x] Wait for and merge https://github.com/georust/gdal/pull/457
[x] Remove
DemCommonOptionsOwner
and inline the fields into the various structs.[x] Remove
DemCommonOptions
[x] Remove
Option
from aroundDemCommonOptions::compute_edges
andadditional_options
.[x] ~Convert
DemProcessing
methods to inherent methods onDataset
.~[x] Convert
DemProcessing
methods to free functions.