koopjs / koop-pgcache

PostGIS cache for Koop.
Other
4 stars 4 forks source link

_insertFeature cleanup #38

Closed kneemer closed 8 years ago

kneemer commented 8 years ago

I noticed two things that are not being used in _insertFeature

I have a branch that addresses these issues and can make a pull request but wanted to get feedback first as I am new to this repo. https://github.com/koopjs/koop-pgcache/compare/master...kneemer:insertfeature-cleanup

chelm commented 8 years ago

Looks great to me and my non-vested interests... nice catch!

dmfenton commented 8 years ago

I think these are good fixes. We do need to answer, where do we want to set feature ids? Right now they are set on the way out of the DB according to the row id. Valid geojson that gets stored in the DB should already have an ID in my opinion.

Your other point is valid too, though I'm not sure we should be setting the CRS in the cache at all. @chelm can you speak to any history there?

kneemer commented 8 years ago

This all came about because I wanted to be able to set the feature id for koop-geotrigger but the id only was saved in the feature json not as the row id. I was able to do it here: https://github.com/kneemer/koop-pgcache/commit/26ccd445f58ad402f4f54fecb9032f7d0559b764 but wasn't sure if that would mess up existing users who may have a feature id defined but were used to the row id being different. I can clean that up if it would be useful.

kneemer commented 8 years ago

Also, would we ever want to create a geohash for a GeometryCollection? I can create a separate issue if the answer is yes.

dmfenton commented 8 years ago

@kneemer I think the best way to go is to apply ids on the feature itself at insert time if they don't already exist in the geojson. That will have the added benefit of saving us a loop when we retrieve features. We'll need a little extra logic to be able to query by feature->'id' but that's not too big of a deal.

chelm commented 8 years ago

apply ids on the feature itself at insert time if they don't already exist in the geojson

@dmfenton How can you ensure uniqueness if features are being partially inserted in chunks, async?

kneemer commented 8 years ago

I agree with @chelm here, if you don't have a unique feature id in the geojson, it is safer to rely on the postgres sequence for uniqueness.

dmfenton commented 8 years ago

Good point. Options would be:

Not sure if eliminating the loop is worth the effort to any of those, however.

chelm commented 8 years ago

@dmfenton okay now im confused. What are we trying to solve here? If we rely on the DB to create IDs (which we do right now) then it can ensure uniqueness and all we win since we dont have to think about ids until we pull data out of the db.

kneemer commented 8 years ago

I can attempt a solution on the retrieval end instead, just look to see if the features json has an id and use that instead of the row id if it exists?

kneemer commented 8 years ago

I should probably give you my use case so you have an idea of what I'm trying to do: I'm trying to create a 1:1 relationship between geotrigger events and features. Each event has a unique id and I would like the feature id to be the same. I am updating features as new ones arrive so I'm using insertPartial to add the new features.

I am inserting features that have a defined id but it doesn't get saved as the row id so when I retrieve the feature the id I have set is overwritten with the row id.

dmfenton commented 8 years ago

@chelm @kneemer wants to be able to set the row id from the external source instead of creating an arbitrary id in the DB. Although, I'm not sure we need to do that either, when we can just set that id as a property.

dmfenton commented 8 years ago

@kneemer, why not just set that ID as a property?

kneemer commented 8 years ago

I could, I was just hoping I could have it top level since I know it is unique. It is not a show stopper for me.

dmfenton commented 8 years ago

I don't think the benefit is going to be worth the effort there.

kneemer commented 8 years ago

ok, now that I've gone all over the place in this issue :)
1) Should I put in a pull request for the cleanup above? @chelm did you have any feedback about whether we should be setting the CRS in the cache at all. 2) Do we want to support creating a geohash for a GeometryCollection in _insertFeature? If so, I'll create an issue.

chelm commented 8 years ago

I think we should remove the CRS stuff. Was added when I was doing some mapnik tinkering and then didnt get removed when the module was split out.

I'd say make an issue for the GeomCollection/Geohash so it's logged as a known issue. In case it ever comes up.