omniscale / imposm3

Imposm imports OpenStreetMap data into PostGIS
http://imposm.org/docs/imposm3/latest/
Apache License 2.0
726 stars 159 forks source link

Sanitize integers #21

Closed thomersch closed 10 years ago

thomersch commented 10 years ago

We encountered a problem with large integer values while importing building:levels as an int.

Examples:

building:levels=0 (ok)
building:levels=40 (ok)
building:levels=a (ok, value is being discarded)
building:levels=19082139812039812093908123 (ok, value is being discarded because it's bigger than 2^64)

So far, so good. But the following fails:

building:levels=1000000000000000000

The value is > 2^32 but < 2^64, so it is parsed successfully into cache, but it fails while being imported into Postgres, because the data field is a 32 bit integer.

Corresponding line in source: https://github.com/omniscale/imposm3/blob/master/mapping/fields.go#L23

I am not sure, which strategy is best a this point. One possibility is setting the int type as int64 in PostgreSQL, which would obviously lead to an increased memory consumption.

The second option is adding an additional, explicit int64, which would be 64 bit in the application as well as in PostgreSQL.

The third option (which I find best) is declaring it int32 in both imposm and PostgreSQL. Bigger values would be treated as invalid values.

olt commented 10 years ago

Thanks. https://github.com/omniscale/imposm3/blob/master/mapping/fields.go#L121 should parse the integer with 32 bit precision. There should be another type for larger ints (integer64?) to avoid increased memory usage as you mentioned.

BTW: What kind of error do you get?

thomersch commented 10 years ago

Thanks. https://github.com/omniscale/imposm3/blob/master/mapping/fields.go#L121 should parse the integer with 32 bit precision.

As far as I understand that should solve this issue.

There should be another type for larger ints (integer64?) to avoid increased memory usage as you mentioned.

Well, I am not sure, if there is a real use case for this, but it sounds reasonable and maybe some one is needing it.

BTW: What kind of error do you get?

I don't have the exact error message any more, but it was Postgres that was throwing something like "integer out of range".

olt commented 10 years ago

Fixed that first. Might add Integer64 later. Thanks for reporting.

thomersch commented 10 years ago

Very cool, thanks a lot :)

Just one more thing: Could you trigger a new build for http://imposm.org/static/rel/?