Closed pramsey closed 9 years ago
Interesting
I see the point about "pure" coordinate list and an n-points array.
But I am not convinced that it works. When reading those types, the client always needs to know how long a list is before it starts reading it, as with the id list. But maybe that is just a matter of putting ngeoms in front of the id list.
But if single-id objects coexist with multiple-id objects, how shall the client know how to combine geometries and ids? If there are 3 ids and 5 linestrings, something has to define which id belongs to which linestring(s).
Am I missing something? Could you exemplify?
The last example about separating dimensions is also interesting. But why would it give better compression? I see that the code would be cleaner in both backend and client because it just has to keep track of 1 absolute value. Now it keeps an array with 1 value per dimension.
One thing that could count against this is the handling of the coordinates at both back end and client. With huge geometries (for example when aggregating a bigger data set to gain from avoiding absolute coordinates between geometries), it will take some effort to rearrange the coordinates from x,y,z,x,y,z
to
x,x,y,y,z,z
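To make the rearrangement concrete, here is a minimal Python sketch. The function names are made up for illustration, and restarting the delta from zero for each dimension is an assumption, not something settled in this thread. It shows that the regrouping is a single pass over the list, and that each dimension then only needs one running value.

```python
def split_dimensions(coords, ndims):
    """Rearrange x,y,z,x,y,z,... into [x, x, ...], [y, y, ...], [z, z, ...]."""
    if len(coords) % ndims != 0:
        raise ValueError("coordinate count is not a multiple of ndims")
    # Every ndims-th value, starting at offset d, belongs to dimension d.
    return [coords[d::ndims] for d in range(ndims)]


def delta_encode(values):
    """Store each value as the difference from the previous one.

    Assumption: the first value is stored relative to zero.
    """
    deltas = []
    previous = 0
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


interleaved = [10, 20, 5, 11, 22, 5, 13, 23, 6]  # x,y,z,x,y,z,x,y,z
xs, ys, zs = split_dimensions(interleaved, 3)
x_deltas = delta_encode(xs)
y_deltas = delta_encode(ys)
z_deltas = delta_encode(zs)
```

Whether the grouped deltas actually compress better than interleaved ones depends on the data; the sketch only shows the bookkeeping.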
Yes, you're right, the ngeoms has to come before the id list
metadata_header byte
[size] varint
type_and_dims byte
[bounds] bbox
ngeoms varint
[ids] varint[]
npoints varint[]
x varint[]
y varint[]
[z] varint[]
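A rough Python sketch of how a client might read the part of this layout that starts at ngeoms. Several things here are assumptions for illustration only: the header fields (metadata_header, size, type_and_dims, bounds) are taken to be consumed already, ids and coordinate deltas are taken to be zigzag-encoded signed varints, npoints is taken to be unsigned, and each dimension's deltas are taken to restart from zero. All function names are hypothetical.

```python
def read_uvarint(buf, pos):
    """Decode one unsigned base-128 varint; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of this varint
            return result, pos
        shift += 7


def unzigzag(n):
    """Map 0, 1, 2, 3, ... back to 0, -1, 1, -2, ..."""
    return (n >> 1) ^ -(n & 1)


def read_array(buf, pos, count, signed):
    """Read `count` varints, optionally zigzag-decoding each one."""
    values = []
    for _ in range(count):
        raw, pos = read_uvarint(buf, pos)
        values.append(unzigzag(raw) if signed else raw)
    return values, pos


def read_body(buf, pos=0, has_ids=True, ndims=2):
    """Read ngeoms, [ids], npoints, then one delta array per dimension."""
    ngeoms, pos = read_uvarint(buf, pos)
    ids = None
    if has_ids:
        # ngeoms comes first, so the id list length is already known.
        ids, pos = read_array(buf, pos, ngeoms, signed=True)
    npoints, pos = read_array(buf, pos, ngeoms, signed=False)
    total = sum(npoints)
    dims = []
    for _ in range(ndims):
        deltas, pos = read_array(buf, pos, total, signed=True)
        absolute, running = [], 0  # one running value per dimension
        for d in deltas:
            running += d
            absolute.append(running)
        dims.append(absolute)
    return ngeoms, ids, npoints, dims


# ngeoms=2, ids=[7, 9], npoints=[2, 1], x deltas=[1, 1, -2], y deltas=[5, 0, 1]
sample = bytes([2, 14, 18, 2, 1, 2, 2, 3, 10, 0, 2])
ngeoms, ids, npoints, dims = read_body(sample)
```

Because every array length is known before the array starts (ngeoms for ids and npoints, the sum of npoints for each coordinate array), the reader never has to look ahead.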
As I see it, you have two use cases:
Let me try to write out a generic geometry collection that handles both cases again
metadata_header byte
[size] varint
type_and_dims byte
[bounds] bbox
ngeoms varint
[ids] varint[]
geoms geom[]
For the "copy WKB" case, the ids are optional, so they are left out. For the "grouping" case, the ids come along for the ride. I'm not sure I see a use case where you can get a group that itself has an id, and all the components also have ids. What function would generate that kind of thing? An aggregation would bundle up a bunch of rows (that have ids) into a group that itself would not have an id (because it's brand new, it's synthetic).
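As a sketch of how a client could treat both cases uniformly, here is a small Python model of the collection above. The class and method names are hypothetical, and the rule that the id list, when present, has exactly ngeoms entries matched to geometries by position is an assumption drawn from ngeoms preceding [ids] in the layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Collection:
    """Hypothetical in-memory view of `ngeoms, [ids], geoms`."""
    geoms: List[object]
    ids: Optional[List[int]] = None  # absent in the "copy WKB" case

    def __post_init__(self):
        # Assumption: one id per geometry, matched by position.
        if self.ids is not None and len(self.ids) != len(self.geoms):
            raise ValueError("need exactly one id per geometry")

    def pairs(self):
        """Return (id, geom) tuples; id is None when no ids were sent."""
        ids = self.ids if self.ids is not None else [None] * len(self.geoms)
        return list(zip(ids, self.geoms))


grouped = Collection(geoms=["line_a", "line_b"], ids=[101, 102])
copied = Collection(geoms=["line_a", "line_b"])
```

With positional matching, a payload with 3 ids and 5 linestrings is simply invalid at this level; grouping several linestrings under one id would have to happen inside a single multi-geometry.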
Now I think I follow you. So what has changed is that the ids are in front of the geometries. Yes, why not? If nothing else, it would make it much faster to scan for whether an id is present.
About a group of geometries with individual ids and a top-level id: I have thought about it. The use case would be to do aggregation to get some sort of tiles and to get a tile ID. The function would then be designed to pick the top-level id from one of the "group by" fields. I don't know how to control that, but that would be the logic I guess.
I have no problem dropping that idea.
Once you accept the idea of [ids] being before geometries, we move on to the next level which is that it's possible to do away with the special "group" type altogether. Since now a multi-point-with-ids looks just like type 20, the "homogeneous group", right?
And we end up with two aggregation signatures: collect_twkb(geom, id) and collect_twkb(geom) (ignore for a moment that I'm making up the SQL function names and not looking at what you already named them)
Yep, I follow
On Tue, 2015-04-28 at 13:10 -0700, Paul Ramsey wrote:
Once you accept the idea of [ids] being before geometries, we move on to the next level which is that it's possible to do away with the special "group" type altogether. Since now a multi-point-with-ids looks just like type 20, the "homogeneous group", right?
We did this
I have to admit, I find the whole "basic types" and "group types" thing galling and unnecessary. I see the proximate desire for them, to squeeze out unneeded metadata, but I'm sure it can be done without all these extra types. For example, imagine this MultiPoint:
See what I did there? By taking the per-point id and pulling it up front into an array, I am having cake and eating it too. Similar things work for multilinestrings...
Since you're already committing to deserializing every coordinate in the multilinestring, by virtue of having only one absolute coordinate in the whole object, you can pack all the structure information up front, and leave the coordinate list "pure", like this:
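For illustration, a minimal Python sketch of what reading such an object could look like once the structure information is known up front. The function name is hypothetical, the input is assumed to be already-decoded integer deltas in interleaved x,y order, and the running absolute values are assumed to carry over from one linestring to the next.

```python
def split_lines(npoints, deltas, ndims=2):
    """Cut one "pure" delta list into linestrings using the npoints array."""
    if len(deltas) != sum(npoints) * ndims:
        raise ValueError("delta count does not match npoints")
    running = [0] * ndims  # one absolute value per dimension, never reset
    lines = []
    pos = 0
    for count in npoints:
        line = []
        for _ in range(count):
            for d in range(ndims):
                running[d] += deltas[pos + d]
            pos += ndims
            line.append(tuple(running))
        lines.append(line)
    return lines


# Two linestrings of two points each, sharing one delta stream.
lines = split_lines([2, 2], [1, 1, 1, 0, 3, 3, 0, 1])
```

The coordinate list carries no counts or type markers of its own; everything needed to cut it up sits in npoints.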
Going even further in this vein, you can get to
Which, if nothing else, would have lovely compression characteristics.
But I'm getting off-topic. The point is if the ids get pulled up into an optional idlist, you can have multiple id objects that coexist with single id objects.