Closed streetCoderr closed 2 years ago
Great investigation so far @streetCoderr ! I think there's been a little confusion though, we shouldn't have to generate the bounding box, we just need to parse it from the GeoJSON document. We could provided some sanity-checking functions to ensure the bounding box meets the RFC's requirements though, but perhaps in a follow-up PR.
And I think this is where the confusion is coming from. The remark in the RFC:
with all axes of the most southwesterly point followed by all axes of the more northeasterly point.
must be interpreted in the context of a bounding box. Consider this poorly drawn 2D geometry and box.
In the context of some coordinate system, a box (in however many dimensions) has some notion of "bottom-left" and "top-right" points. In 2D it is easier to get a feel for that.
My approach was to pick the the most westerly coordinate between both and combine with the most southerly coordinate to make up a point; in this case, [-120,-77]. Is this the expected southwesterly point. I feel so though.
Yes exactly, if you take your example as a bounding box (in 2D) then the four corners are:
[[-120,-20],[-80,-20],[-80,-77],[-120,-77],[-120,-20]]
At which point the southwesterly and northwesterly points can be selected.
So the way I was thinking this could be implemented was by extending geometry, feature and featurecollection to support a bbox field.
which should maybe now become a record:
type t = {
bbox : float array option;
geometry : Geometry.t option;
properties : Json.t option;
}
Or something like that. And same idea applied to:
Does that make sense?
Great investigation so far @streetCoderr ! I think there's been a little confusion though, we shouldn't have to generate the bounding box, we just need to parse it from the GeoJSON document. We could provided some sanity-checking functions to ensure the bounding box meets the RFC's requirements though, but perhaps in a follow-up PR.
And I think this is where the confusion is coming from. The remark in the RFC:
with all axes of the most southwesterly point followed by all axes of the more northeasterly point.
must be interpreted in the context of a bounding box. Consider this poorly drawn 2D geometry and box.
In the context of some coordinate system, a box (in however many dimensions) has some notion of "bottom-left" and "top-right" points. In 2D it is easier to get a feel for that.
My approach was to pick the the most westerly coordinate between both and combine with the most southerly coordinate to make up a point; in this case, [-120,-77]. Is this the expected southwesterly point. I feel so though.
Yes exactly, if you take your example as a bounding box (in 2D) then the four corners are:
[[-120,-20],[-80,-20],[-80,-77],[-120,-77],[-120,-20]]
At which point the southwesterly and northwesterly points can be selected.
So the way I was thinking this could be implemented was by extending geometry, feature and featurecollection to support a bbox field.
which should maybe now become a record:
type t = { bbox : float array option; geometry : Geometry.t option; properties : Json.t option; }
Or something like that. And same idea applied to:
Does that make sense?
Wow. I love the way you explained this. It all make sense now. Thank you so much.
I will work on it now.
Just thought I'd leave this comment for future try-catching
Noted. Thank you :))
Hi @patricoferris , good afternoon :)) It seems like Geometry does not support the bbox field. Only Feature and FeatureCollection does. Am I wrong?
I think all three support it because they are all GeoJSON objects. From the RFC:
A GeoJSON object represents a Geometry, Feature, or collection of Features...
- ...
- A GeoJSON object MAY have a "bbox" member, the value of which MUST be a bounding box array (see Section 5).
And whenever I'm in doubt of things like this, I often go to the implementation in another language like Rust, Go, TypeScript etc. to see what they did.
More concretely, our GeoJSON object is represented as a variant right now so either the type t
needs to change:
Or each of the Feature, Collection and Geometry need to change. I'm in favour of the former I think, let me know your thoughts :))
I think all three support it because they are all GeoJSON objects. From the RFC:
A GeoJSON object represents a Geometry, Feature, or collection of Features...
- ...
- A GeoJSON object MAY have a "bbox" member, the value of which MUST be a bounding box array (see Section 5).
And whenever I'm in doubt of things like this, I often go to the implementation in another language like Rust, Go, TypeScript etc. to see what they did.
- Rust: https://github.com/georust/geojson/blob/735afac5efbf40f90f7a2043d6dba2472bb9ef1c/src/geometry.rs#L230
- Go: https://github.com/paulmach/go.geojson/blob/f123f2c4b2a3260f45af649903d2fd27d561ca08/geometry.go#L27
- TypeScript: https://github.com/DefinitelyTyped/DefinitelyTyped/blob/7de5a3274b57a5602670134ef0a36fd4f17fa179/types/geojson/index.d.ts#L67
Oh... Thank you, Sir. Very useful tips on how to go about problems π
More concretely, our GeoJSON object is represented as a variant right now so either the
type t
needs to change:Or each of the Feature, Collection and Geometry need to change. I'm in favour of the former I think, let me know your thoughts :))
Maybe it will be better to go through the former approach which you suggested but I think it will be easier to follow the later approach. Changing each of those GeoJson types helps to break down the problem in pieces so that I can focus on one at a time. The biggest adavantage is that once I have a piece(A given type) working, I won't need to touch other parts of the codebase where it was used as it would tally with every other variant of its type. So this approach is more like, get it working at the base level, get it working everywhere. However, based on my observation(which of course might be wrong), if I make use of the former approach, I will have to keep going around the codebase to fix things where they got wrong and most importantly, there won't be a uniformity of types as the various variants of types won't tally with their base types.
I would like to hear more on this from you Sir as this would make me better at deciding what approach is optimal for solving a given problem.
I have made changes to each of the GeoJSON types and they seem to be not so bad. I'm currently working on the Random module as it got broken because of the changes I made. Should I push the changes I have made to the PR while I continue to work on the Random module?
Changing each of those GeoJson types helps to break down the problem in pieces so that I can focus on one at a time. The biggest advantage is that once I have a piece(A given type) working, I won't need to touch other parts of the codebase where it was used as it would tally with every other variant of its type. So this approach is more like, get it working at the base level, get it working everywhere.
I'm not sure I completely follow this line of reasoning but perhaps it'll become clearer when you push some code (pushing small self-contained commits is fine). My gut instinct is that there are shared concepts between the three kinds of GeoJSON object and so lifting those out of the kind itself into a shared type might reduce the amount of code. For example a function to extract bbox
, is that needed three times for Geometry
, Feature
and Feature.Collection
in your approach?
In my mind I had thought maybe something like the following would work:
type geojson =
| Feature of Feature.t
| FeatureCollection of Feature.Collection.t
| Geometry of Geometry.t
and t = { geojson : geojson; bbox : float array option }
I didn't think too hard about the names. But the changes needed elsewhere are not too significant, for example:
diff --git a/test/geojson/test.ml b/test/geojson/test.ml
index 09ce30c..a512047 100644
--- a/test/geojson/test.ml
+++ b/test/geojson/test.ml
@@ -50,8 +50,8 @@ let _get_all_props s =
let json = Ezjsonm.value_from_string s in
let geo = Geojson.of_json json in
match geo with
- | Ok (Feature f) -> Option.to_list @@ Geojson.Feature.properties f
- | Ok (FeatureCollection fc) ->
+ | Ok { geojson = Feature f; _ } -> Option.to_list @@ Geojson.Feature.properties f
+ | Ok { geojson = FeatureCollection fc; _ } ->
Do note that this is an unreleased library with no libraries depending on it so making big changes to the interface is completely fine if it is an improvement. In fact, this is the easiest time to do so :))
I think I should push my changes so that you can see what I did
Changing each of those GeoJson types helps to break down the problem in pieces so that I can focus on one at a time. The biggest advantage is that once I have a piece(A given type) working, I won't need to touch other parts of the codebase where it was used as it would tally with every other variant of its type. So this approach is more like, get it working at the base level, get it working everywhere.
I'm not sure I completely follow this line of reasoning but perhaps it'll become clearer when you push some code (pushing small self-contained commits is fine). My gut instinct is that there are shared concepts between the three kinds of GeoJSON object and so lifting those out of the kind itself into a shared type might reduce the amount of code. For example a function to extract
bbox
, is that needed three times forGeometry
,Feature
andFeature.Collection
in your approach?In my mind I had thought maybe something like the following would work:
type geojson = | Feature of Feature.t | FeatureCollection of Feature.Collection.t | Geometry of Geometry.t and t = { geojson : geojson; bbox : float array option }
I didn't think too hard about the names. But the changes needed elsewhere are not too significant, for example:
diff --git a/test/geojson/test.ml b/test/geojson/test.ml index 09ce30c..a512047 100644 --- a/test/geojson/test.ml +++ b/test/geojson/test.ml @@ -50,8 +50,8 @@ let _get_all_props s = let json = Ezjsonm.value_from_string s in let geo = Geojson.of_json json in match geo with - | Ok (Feature f) -> Option.to_list @@ Geojson.Feature.properties f - | Ok (FeatureCollection fc) -> + | Ok { geojson = Feature f; _ } -> Option.to_list @@ Geojson.Feature.properties f + | Ok { geojson = FeatureCollection fc; _ } ->
Do note that this is an unreleased library with no libraries depending on it so making big changes to the interface is completely fine if it is an improvement. In fact, this is the easiest time to do so :))
Your gut instinct is right Sir. The bbox function is above the Geometry module where it can be shared among Geometry, Feature and FeatureCollection.
I await your feedback Sir on the recent changes I made. I'm currently working on the Random module. I'm yet to fully grasp it but I will get there.
Glad you took your time to make things clear and provide a sound review. Thank you. I will work on the suggested changes as soon as possible.
Hi @patricoferris, good afternoon Sir. I'm sorry I could not communicate here yesterday. My system went bad and I had to quickly go to an engineer for repairs. I got it in good shape later in the evening :))
More concretely, our GeoJSON object is represented as a variant right now so either the
type t
needs to change: https://github.com/geocaml/ocaml-geojson/blob/9eaaa54fa433bf732069e3e4623f9fd29c38b754/src/geojson/geojson_intf.ml#L219-L222Or each of the Feature, Collection and Geometry need to change. I'm in favour of the former I think, let me know your thoughts :))
Maybe it will be better to go through the former approach which you suggested but I think it will be easier to follow the later approach. Changing each of those GeoJson types helps to break down the problem in pieces so that I can focus on one at a time. The biggest adavantage is that once I have a piece(A given type) working, I won't need to touch other parts of the codebase where it was used as it would tally with every other variant of its type. So this approach is more like, get it working at the base level, get it working everywhere. However, based on my observation(which of course might be wrong), if I make use of the former approach, I will have to keep going around the codebase to fix things where they got wrong and most importantly, there won't be a uniformity of types as the various variants of types won't tally with their base types.
I would like to hear more on this from you Sir as this would make me better at deciding what approach is optimal for solving a given problem.
I would like to start by saying that my line of reasoning here was entirely wrong. The problem got way easier when I took the approach you recommended. I did not take that approach at first because I did not really know how to go about it that way, but on analyzing the problem more keenly and getting my hands dirty, I realized that it was the best way. When I used the method that requires changing each of the Geojson types, the resulting breaks in the codes were so much on almost every end to the point that I almost touched the entire codebase. However, on using the approach you recommended, I needed to just make few changes on the geojson.ml file and then, follow it up on other files where appropriate.
I learned a lot from this issue and it changed the way I think on approaching problems and solving them. Thank you so much for the opportunity to work on this issue :))
Also, please do let me know what I would have done better and point out any bad software engineering practice I exhibited.
Thanks @streetCoderr.
I'm still trying to work out the best approach. The issue is that what would an end user assume
Geojson.Point.of_json
does? Should it read the bounding-box data? At the minute only the toplevelGeojson.of_json
function does. This is probably actually correct but then I don't think we should expose theJson.Conv
for all of the geometries because it is confusing as to what they do. Let me know your thoughts.It would also be good to add some tests for the bounding-box.
That's true Sir. An end user would expect GeoJson.Point.of_Json
to also read the bounding-box data because it is part of the Json.
What I think is that we should abstract away the base definition from the end user and make them use the top level definition.
Like you pointed out, this means I would have to hide Json.Conv
from all geometries. of_Json would then be polymorphic even from the end user's standpoint, in the sense that the user won't need to know which GeoJson type they are calling it on, yet, be guaranteed that it would produce the right result.
Please, I would like to know your take on this Sir.
If you agree with this, I can hide Json.Conv
where they don't matter.
Thank you as I await your response.
Nicely summarised, I think that sounds good to me. It'll become clearer when you get a chance to push a commit :))
Nicely summarised, I think that sounds good to me. It'll become clearer when you get a chance to push a commit :))
Thank you for the review Sir.
I have tried hiding Json.Conv
from all geometry types as we discussed.
However it resulted to a lot of breaks in the code.
Turns out that a huge part of the codebase from different files depend on that.
I'm currently trying to find another way around it. I will be back soon. Thank you
Hi sir @patricoferris,
To prevent the code from breaking and maintaining expected behaviour of the entire codebase, I thought of some way. I don't know if it would make sense to you.
How about we make it like this.
Let's have a function base_of_json
for each GeoJson type that will be exposed to the user.
What does this do? This function reads the very basic fields of a geojson object from a given json, i.e, it does not read any optional fields.
For instance,
Geojson.Point.base_of_json
will return the basic field of a Point which is its coordinates.
Geojson.Feature.base_of_json
will return the basic field of a Feature, which is its geometry and properties.
Why are we doing this? What is the use-case?
I feel there might be instances where a user might just want to get the coordinates of a given geometry. In that case there is no point reading the bounding-box. The user can therefore, call base_of_json on the given geometry.
If the user needs to read all fields from the json, they can call the top level's of_json
for that.
I don't know if this makes sense Sir, but I came up with this to prevent the codes from breaking as it was getting so tough to fix. Do let me know your thought sir. Thank you. I will push what I did now.
Please help me review this Sir. Thank you.
Hi @patricoferris, please help me review my code. I want to write the tests but you still haven't approved of this approach yet. I wanted to be sure you have approved of it before I begin writing the test.
Or should I go ahead with the test?
Also, please do point out where I got things wrong. Thank you Sir ππ
Thanks for the hard work @streetCoderr :))!
I think my comments aren't clear enough so I went ahead and built on top of your work in this commit https://github.com/geocaml/ocaml-geojson/commit/fb140881a6e16390be5070aa018a1f23e74c4bc9 to show what I mean by:
Json_conv
from the geometry types because I think it will too confusing to explain what the "fundamental" fields are for each one.bbox
into a float array option
instead of the float array option option
that is in this code currently.I'm hoping this will help clear things up, feel free to cherry-pick the commit if you agree with the approach. I'm also happy to hear your thoughts and suggestions if you think it isn't right too :)) !
Thanks for the hard work @streetCoderr :))!
I think my comments aren't clear enough so I went ahead and built on top of your work in this commit fb14088 to show what I mean by:
- Removing
Json_conv
from the geometry types because I think it will too confusing to explain what the "fundamental" fields are for each one.- And turning
bbox
into afloat array option
instead of thefloat array option option
that is in this code currently.I'm hoping this will help clear things up, feel free to cherry-pick the commit if you agree with the approach. I'm also happy to hear your thoughts and suggestions if you think it isn't right too :)) !
Good morning @patricoferris. Thank you for your positive feedback. And yes, this actually cleared things upππ.
I also did some extra 'massaging' to fix the errors that came up after making the suggested changes. Please help me check it out and let me know if it's okay. Thank you.
Excellent ! Few little nitpicks and then we can merge I think! Thank you for persisting with this PR :))
Thank you for patiently following up on this PR tooππ
Also, please, help me look into what I just pushed. I'm done with implementing the suggested changes.
Hi @patricoferris.
I abstracted away type t
from the user in the most recent commit.
Please check it out.
Thank you.
Hi @patricoferris. Please help me review. I'm done with the suggested changes
Great! If you could just undo the changes to the
index.js
file (I will fix that problem later) this is good to go.
Alright, I just did that now. Thanks
Hi @patricoferris; can you help me check why this is failing? It's running totally fine on my system.
Edit: I get now. It's because of the newly merged PRs. Anything I can do about this?
You will need to merge the changes in or rebase this PR onto the main
branch. This will require changing the new tests because you've changed the API in this PR.
You will need to merge the changes in or rebase this PR onto the
main
branch. This will require changing the new tests because you've changed the API in this PR.
I have made the required changes. Please review. Thank youππ
Hi @patricoferris. Just looking at this and I'm seeing that it has 30 commits which is a bit too much. What do you think about this? Should I squash the commits so that it won't mess up our main git history? I personally feel this will be nice. Please do let me know your take on this. Thank you.
Squashing some of it would be great, or I can squash it all into a single commit. It depends if you have a grouping of commits in mind or not :))
Squashing some of it would be great, or I can squash it all into a single commit. It depends if you have a grouping of commits in mind or not :))
I think I would rather leave it to you to do the squashing then. Thank youππ
Hi @patricoferris, please can you spare time to review this. Thanks as I await your response.
Process
I went through this file and came up with the implementation of the bbox function.
I want to ask though. What does it mean for a point to be the most southwesterly point. For instance, if we have 2 points: a)[-120, -20] and b)[-80, -77], which of them is the most southwesterly point. Now,
a
is more westerly thanb
butb
is more southerly thana
. So how do we decide which is more south-westerly. My approach was to pick the the most westerly coordinate between both and combine with the most southerly coordinate to make up a point; in this case, [-120,-77]. Is this the expected southwesterly point. I feel so though.